Cover V01, I03

sep92.tar


Getting the Info -- u386mon

Chris Hare

u386mon is a performance monitor that has been in the public domain for a number of years, through several revisions (the last one I have seen being u386mon2.20). This article tells you how to use u386mon to get and interpret system performance data.

u386mon is primarily for 386/486-based systems, but the author reports in the supplied README file that it has been ported to other environments. The distribution provides makefiles for several of the popular versions of Intel-based UNIX, including SCO and Interactive. It is not for XENIX-based systems as the XENIX kernels do not provide support for the nlist(3C) function, but use the xlist(S) function instead.

The u386mon Main Display

u386mon outputs detailed information about what is happening on the system in several different user displays. The display area supported is 25 or 43 lines, with or without color, and is split into three primary areas:

  • CPU status
  • System/Memory Information
  • Variable, Boot Information, Tunable Parameters, and Process Information

    The CPU section reports on instant, 5-second and 10-second CPU utilization. The information includes total, user, kernel, and break percentages and is displayed in both numerical and graphical form. In the graphical bar chart, user mode is represented by the letter "u" and kernel by a "k."

    When the CPU is in user mode, it is executing application code. Kernel mode indicates that a system call has been requested. Since a process in this state cannot be interrupted or killed, such a process will show a higher degree of kernel-mode CPU utilization.

    Most performance monitoring tools -- such as vmstat -- include CPU wait time with CPU utilization. There is a difference, however, and keeping track of the two separately helps you measure the true effectiveness of your CPU. If the CPU is spending a lot of time waiting for device interrupts or program alarms, then it is idle, and this can be significant if the CPU doesn't have anything else to do. When the WAIT time is merged with the actual KERNEL utilization time, the values reported are not realistic.

    u386mon separates the actual KERNEL and WAIT times so that you can see the amount of time the CPU is actually spending on processing instructions versus the time spent waiting for other activities.
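
    The effect of keeping the two figures apart can be illustrated with a minimal sketch. It assumes two snapshots of cumulative per-state CPU tick counters; the state names and the cpu_percentages helper are hypothetical and are not taken from the u386mon source.

```python
# Hypothetical state names; the real kernel structure may use different ones.
STATES = ("idle", "user", "kernel", "wait", "sxbrk")

def cpu_percentages(prev, curr):
    """Return the share of elapsed ticks spent in each state, in percent."""
    deltas = {s: curr[s] - prev[s] for s in STATES}
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed between snapshots; avoid dividing by zero.
        return {s: 0.0 for s in STATES}
    return {s: 100.0 * deltas[s] / total for s in STATES}

# Two made-up snapshots of the cumulative counters.
prev = {"idle": 1000, "user": 500, "kernel": 300, "wait": 200, "sxbrk": 0}
curr = {"idle": 1040, "user": 530, "kernel": 310, "wait": 220, "sxbrk": 0}
pct = cpu_percentages(prev, curr)

# Time actually spent processing instructions.
busy = pct["user"] + pct["kernel"]
# What a tool that folds WAIT into KERNEL would report as kernel time.
merged_kernel = pct["kernel"] + pct["wait"]
print(pct, busy, merged_kernel)
```

    With these sample numbers the merged figure is three times the true kernel share, which is the distortion described above.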

    In the System/Memory Information (Sysinfo/Minfo) display area, shown in Figure 1, the majority of the kernel routines are displayed. This display can also be replaced with a process listing.

    u386mon Commands

    The following are the commands available once u386mon is running.

    ESCape -- quits u386mon

    e -- selects the extra display. If the display is in 43-line mode, this information will now be included in it.

    m -- selects the main Sysinfo/Minfo display.

    p -- shows a ps listing (does not include zombie, sleeping, shell, or getty processes).

    s -- shows serial I/O information for the standard serial devices which make up COM1 and COM2.

    +/- -- increments or decrements the update delay by one second. The initial startup value is 2 seconds, with the valid range being from 1 to 4 seconds.

    l -- Locks the u386mon process into RAM if you are running as the superuser (which you pretty well have to be in order to be sure that you can read the /dev/kmem device). When u386mon is locked into RAM, the word PLOCK appears on the screen.

    Deciphering the Display

    Color is also used to communicate information. In the display shown in Figure 1, the CPU utilization values display in green if below 70 percent, in yellow if between 70 percent and 89 percent, and in red if 90 percent or above.

    Similarly, in the CPU WAIT display, total wait time displays in green if it is less than 30 percent, in yellow if it is between 30 percent and 49 percent, and in red if it is 50 percent or above.
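
    The two color schemes amount to a pair of threshold lookups. The sketch below is illustrative only: the function names are hypothetical, and it assumes that fractional values just under a boundary (89.5 percent, say) stay in the lower band and that exactly 90 percent counts as red.

```python
def cpu_color(percent):
    """Color for a CPU utilization value, per the thresholds in the text."""
    if percent < 70:
        return "green"
    if percent < 90:
        return "yellow"
    return "red"

def wait_color(percent):
    """Color for the total CPU WAIT value, per the thresholds in the text."""
    if percent < 30:
        return "green"
    if percent < 50:
        return "yellow"
    return "red"

for value in (10, 45, 75, 95):
    print(value, cpu_color(value), wait_color(value))
```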

    If the word "INEXACT" appears on the screen, it indicates that u386mon was not scheduled quickly enough to collect accurate 1-second interval values. If this is the case, the 5- and 10-second intervals are also likely to be inaccurate.

    If the word "INVALID" appears, then u386mon was scheduled 3 or more seconds late, and all percentage values are now suspect.
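
    As a rough sketch of how such a check might work, the function below compares the requested update delay with the delay actually observed. Only the 3-second INVALID limit comes from the description above; the tolerance used for INEXACT is a hypothetical value, since the exact threshold is not stated, and the function name is invented for illustration.

```python
def interval_status(requested_delay, actual_delay, tolerance=0.5):
    """Classify one sampling interval by how late the monitor was scheduled.

    `tolerance` (seconds) is an assumed cutoff for INEXACT.
    """
    late = actual_delay - requested_delay
    if late >= 3:
        return "INVALID"   # 3 or more seconds late: percentages are suspect
    if late > tolerance:
        return "INEXACT"   # late enough to distort the short intervals
    return ""              # on time: nothing to flag

print(repr(interval_status(2, 2.1)))
print(repr(interval_status(2, 3.0)))
print(repr(interval_status(2, 5.2)))
```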

    Sysinfo/Minfo Information

    The following information is reported on the Sysinfo/Minfo screen, and I thank Paul Hite, of PRC Realty Systems in Virginia, and Greg Oetting, of the Santa Cruz Operation's Support Department, for helping shed some additional light on these:

    bread -- Measures the physical block reads from the block devices, such as disks.

    bwrite -- Measures the physical block writes to the block devices, such as disks.

    lread -- Measures the logical disk reads from the buffer cache.

    lwrite -- Measures the logical disk writes to the buffer cache. (In the case of lread and lwrite, if the kernel wants a chunk of data, it will look in the buffer cache first, which will increment the value of lread. If the data is not found, then a physical block read will be done, thus incrementing bread.)

    phread -- Measures the blocks read with the kernel physio routine.

    phwrite -- Measures the blocks written with the kernel physio routine.

    swapin -- Counter which is incremented with each request to swap in a page from the swap device (a page is typically 4 KB in size).

    swapout -- Counter which is incremented with each request to swap out a page to the swap device.

    bswapin -- Measures the number of blocks read in from the swap device (a block is typically 512 bytes).

    bswapout -- Measures the number of blocks written out to the swap device. If the four swap values are high (the term high being relative), this could indicate a potential RAM bottleneck, as the machine appears to be doing more swapping than would be desirable.

    iget -- Measures the number of calls made to get the inode information when given the inode number and the device number.

    namei -- Measures the number of calls made to convert a pathname into the associated inode information.

    dirblk -- Measures the number of blocks read for directory reads.

    readch -- Measures the number of characters read using the read() system call.

    writch -- Measures the number of characters written using the write() system call.

    rawch -- Counts raw tty characters read.

    canch -- Counts raw tty characters put onto the canonical queue.

    outch -- Represents the number of characters that have been written out.

    msg -- Counts message operations (msgsnd()).

    sema -- Counts semaphore operations (semop()).

    maxmem -- Displays the amount of physical RAM available -- typically the total system RAM minus the RAM consumed by the kernel itself.

    freemem -- Displays the current amount of free RAM.

    mem used -- Shows the percentage of maxmem which has been allocated to processes. The higher this number, the more likely that pages will have been placed on swap.

    nswap -- Shows the total amount of swap area available.

    frswap -- Like freemem, represents the amount of unallocated swap space.

    swp used -- Like mem used, shows a percentage of the nswap which has been allocated. If this value and mem used are both high, then it is an indication that your machine may be spending CPU cycles thrashing between RAM and the swap area.

    pswitch -- Displays the number of process context switches. A context switch occurs when one process gives up the CPU and another process takes over. Higher values in this field are a sign of I/O-intensive processes, because the kernel will switch to another process rather than wait for the I/O operation to be completed. Note that this is not the same as switching from kernel to user mode, which is a change in MODE, not in context.

    syscall -- Represents the number of system calls.

    sysread -- Represents the number of read() calls.

    syswrit -- Represents the number of write() calls.

    sysfork -- Represents the number of fork() calls.

    sysexec -- Represents the number of calls to the exec() family (execl(), execle(), etc.).

    runque -- Shows the size of the run queue.

    runocc -- Shows the percentage of time that there is a job ready to run in the run queue.

    swpque -- Shows the size of the swap queue.

    swpocc -- Shows the percentage of time that there is a job in the swap queue.

    vfault -- Counts the number of page faults, which occur when a requested page is not in memory. When a page fault occurs, the system looks in all of the available RAM for that page (specifically from the free page pool); if the page is not found, then it is loaded in from the filesystem. Page faults occur only with program text, not data.

    demand -- Represents the number of pages created in response to memory allocation requests, such as those made through malloc(). The pages are usually filled with byte value zero (0), hence the term demand zero pages.

    pfault -- Represents the number of page faults due to "copy on write." This type of fault occurs after a new process is created via the fork() system call. Rather than copying all of the data pages for the forked program, the kernel marks the data pages from the parent as "copy on write" and copies a page only when one of the processes writes to it. Since many fork calls are followed with an exec system call, which discards those pages anyway, this increases the performance of the system.

    steal -- Represents the number of pages which have been stolen by paging (see sidebar).

    pnpfault -- Not used in SCO UNIX. Will always be zero.

    wrtfault -- Not used in SCO UNIX. Will always be zero.

    A number of other values are also shown on the main display. The majority of these values are not reported under SCO UNIX, and therefore will be zero.
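
    Several of these values are most useful in combination. The sketch below derives a buffer cache hit percentage from lread and bread (or lwrite and bwrite), and a "used" percentage of the kind shown for mem used and swp used. The function names are hypothetical, and computing the used amount as total minus free is an assumption about how u386mon arrives at its figures.

```python
def cache_hit_percent(logical, physical):
    """Percentage of logical requests satisfied without a physical transfer."""
    if logical <= 0:
        return 0.0
    # Clamp at zero in case the physical count exceeds the logical count.
    return max(0.0, 100.0 * (logical - physical) / logical)

def used_percent(total, free):
    """Percentage of a resource (RAM or swap) that is currently allocated."""
    if total <= 0:
        return 0.0
    return 100.0 * (total - free) / total

# Made-up sample values.
print(cache_hit_percent(1000, 100))   # lread, bread
print(used_percent(4096, 1024))       # maxmem, freemem
print(used_percent(20000, 15000))     # nswap, frswap
```

    A low hit percentage alongside high bread and bwrite counts suggests that the buffer cache is too small for the workload.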

    Extra Display Information

    The extra display information shown in Figure 2 is part of the main screen if you are running in 43-line mode. Be aware that owing to differences in porting, the bootinfo data provided differs between Interactive Systems and SCO. This information will not be available for systems which do not support it.

    The kernel parameters included in the Extra Display are defined as:

    v_autoup -- Specifies the age, in seconds, that a delayed-write buffer must reach before bdflush will write it out.

    v_buf -- Reports the number of I/O buffers.

    v_clist -- Reports the number (NCLIST) of configured character list buffers.

    v_file -- Represents the configured maximum number (NFILE) of open files for the entire system.

    v_hbuf -- Reports the number of hash buffers allocated.

    v_inode -- Reports the size of the incore inode table, which is the number of active inodes system-wide.

    v_maxpmem -- Specifies the maximum amount of memory that can be consumed by a process; a value of zero means that a process may use all of the available memory.

    v_maxup -- Defines the maximum number (MAXUP) of processes that can be run by a non-root user.

    v_mount -- Defines the maximum number (NMOUNT) of mountable filesystems by specifying the size of the kernel mount table.

    v_pbuf -- Reports the number of allocated physical I/O buffers.

    v_proc -- Reports the size of the process table (NPROC), and, therefore, the maximum number of processes system-wide.

    v_region -- Specifies the size of the region table (NREGION). A region is a contiguous area of the virtual address space of a process that can be treated as shared or protected. Each process has a data, text, and stack region.

    BootInfo Data

    Boot information differs from vendor to vendor, but the display includes the following:

    basemem -- Reports the amount of base memory in the machine.

    extmem -- Reports the amount of extended memory in the machine.

    bootflags -- Reports some identifying information regarding the machine on which the system booted. On SCO systems the values are:

    0x00000001 AT or AT386

    0x00000002 Microchannel

    0x00000004 Extended ISA

    0x00000008 Intel 80486

    The memory used and memory available fields are self-explanatory.
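
    The SCO bootflags values listed above are all powers of two, which suggests that they are bit flags that can be combined. Under that assumption, a value can be decoded as in the following sketch; the table and function names are hypothetical.

```python
# Flag values and descriptions as listed for SCO systems.
BOOTFLAGS = (
    (0x00000001, "AT or AT386"),
    (0x00000002, "Microchannel"),
    (0x00000004, "Extended ISA"),
    (0x00000008, "Intel 80486"),
)

def decode_bootflags(value):
    """Return the descriptions of every flag bit set in `value`."""
    return [name for bit, name in BOOTFLAGS if value & bit]

# Assumed example: an AT-class machine with an 80486 processor.
print(decode_bootflags(0x00000009))
```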

    A look at the u386mon source code reveals that much of the information available in this display is tuned for Interactive UNIX and not for SCO. As a result, the usefulness of the data is reduced in some circumstances.

    For ISC systems, for example, u386mon can in many cases identify the model of the machine on which it is running and display the video adapter in use. ISC configurations that it recognizes include the following:

    CPU Types        Monitor Types
    Compaq           unknown to sys
    PS/2             EGA
    Generic 386      CGA
    AT&T 6386        MONO
    Olivetti M380    Compaq MONO
    Dell 386         Zenith Z449
    Dell 325         Toshiba T5100
    Adv Logic Res    Compaq VGA
    Zenith Data      VGA
                     Paradise VGA1
                     Video 7 VGA

    Tunable Parameters Data

    This section displays a number of the tunable parameters -- that is, parameters that can be adjusted in order to optimize the kernel resources.

    t_ageinterval -- The frequency at which processes are aged.

    t_bdflushr -- The rate at which bdflush is run.

    t_gpgshi -- The high-water mark for page stealing; the page stealing process will continue until the value of freemem is greater than this value.

    t_gpgslo -- A control value for freemem; if freemem falls below this value, stealing of pages from processes is triggered.

    t_gpgsmsk -- Mask used by the getpages routine in order to determine what pages can be stolen.

    t_maxfc -- The maximum number of pages that will be saved up and freed at once.

    t_maxsc -- The maximum number of pages to be swapped out in a single operation.

    t_maxumem -- The maximum size of a user's virtual address space in pages.

    t_minarmem -- The minimum available resident, or non-swappable, memory that must be maintained in order to prevent deadlock.

    t_minasmem -- The minimum available swappable memory that must be maintained in order to prevent deadlock.
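
    The relationship between t_gpgslo and t_gpgshi is easier to see as a small state machine. The sketch below follows the definitions given above: stealing starts when freemem falls below the low-water mark and continues until freemem rises above the high-water mark. The function names and sample values are hypothetical, and the real kernel logic is certainly more involved.

```python
def stealing_active(freemem, gpgslo, gpgshi, currently_stealing):
    """Decide whether page stealing should be active for this sample."""
    if freemem < gpgslo:
        return True               # below the low-water mark: start stealing
    if freemem > gpgshi:
        return False              # above the high-water mark: stop stealing
    return currently_stealing     # in between: keep doing what we were doing

def trace(samples, gpgslo, gpgshi):
    """Apply the rule to a series of freemem samples, starting idle."""
    active = False
    states = []
    for freemem in samples:
        active = stealing_active(freemem, gpgslo, gpgshi, active)
        states.append(active)
    return states

# Made-up freemem samples (in pages) with gpgslo=25 and gpgshi=40.
print(trace([50, 20, 30, 45, 30], 25, 40))
```

    The gap between the two marks keeps the page stealer from switching on and off repeatedly when freemem hovers near a single threshold.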

    Process Information

    The process section provides information on the state of current processes on the system. The display shows the count of current processes for each state.

    sleep -- Refers to a process currently waiting for an event to complete, such as disk I/O.

    run -- Denotes a process currently running.

    zombie -- Denotes a process which has terminated, but whose parent did not wait for it to complete. Such processes typically show in the process table as <defunct>; they are not a problem unless there are many of them.

    stop -- Denotes a process stopped by a debugger.

    idle -- Refers to an intermediate state in process creation.

    onproc -- Denotes a process being run on a processor.

    xbrk -- Refers to a process that is being swapped.

    Serial I/O Information

    Figure 3 shows a sample of the output produced when the s option is specified. Unfortunately, non-standard serial device drivers are not included here. The current code in u386mon allows for a maximum of only 16 ports.

    The fields in the Serial I/O display are defined as:

    tty -- The name of the serial port.

    raw -- The number of characters which are to be processed on the raw input queue.

    can -- The number of characters to be processed on the canonical input queue.

    out -- The number of characters waiting on the output queue.

    speed -- The baud rate of the port.

    state -- The internal state of the port.

    iflag -- Input modes as defined by stty(C).

    oflag -- Output modes as defined by stty(C).

    cflag -- Control modes as defined by stty(C).

    lflag -- Line discipline modes as defined by stty(C).

    pgrp -- The process group controlling the port.

    The state field consists of a series of characters representing the various modes the line can be in. The characters, and the modes they represent, are as follows:

    B -- Output is in progress on the port.

    C -- The software-copy of carrier detect is present.

    D -- A delay timeout is in progress on the port.

    O -- The device is open.

    S -- Output is stopped with Control-S.

    W -- The process is waiting for the open to complete.

    For more information on the input, output, and control modes, see stty(C) in the User's Reference Manual.
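
    The state characters can be expanded mechanically, as in the sketch below. The mapping comes from the list above; the function name is hypothetical, and skipping any character not in the list (padding or placeholders, for instance) is an assumption about the display format.

```python
# State characters and their meanings, as listed for the Serial I/O display.
STATE_FLAGS = {
    "B": "output in progress",
    "C": "carrier detect present (software copy)",
    "D": "delay timeout in progress",
    "O": "device open",
    "S": "output stopped with Control-S",
    "W": "waiting for open to complete",
}

def describe_state(state):
    """Expand a state string into descriptions, ignoring unknown characters."""
    return [STATE_FLAGS[ch] for ch in state if ch in STATE_FLAGS]

# Assumed example: an open port with carrier present and output under way.
print(describe_state("OCB"))
```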

    Process Status Information

    When the p option is specified, u386mon retrieves a process list and displays it in place of the Sysinfo/Minfo data. Figure 4 shows a sample of this information, which includes the following fields:

    S -- A two-character process status indicator. The first character defines the process status, the second defines the process swap status. The values for the first character in process status are:

    s sleeping

    R ready to run (might be running if u386mon were not)

    z zombie

    d stopped by debugger

    i idle

    p running on processor

    x XBREAK -- the process is growing or shrinking.

    An S following this character indicates that the process is swapped. Otherwise, the process is currently in memory.

    USER -- The username running the process; if the process is running setuid, then a # appears next to the user name.

    PID -- The process ID.

    CPU -- CPU usage, used for scheduling purposes.

    UCPU -- User time for this process.

    SCPU -- System time for this process.

    SIZE -- Size of the swappable image in pages.

    TTY -- Terminal to which the process is currently attached.

    CMD -- The name of the command being executed.
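
    The two-character S field described at the top of this list can be decoded with a simple lookup. The sketch below uses the status letters given above; the function name is hypothetical.

```python
# First-character status codes, as listed for the S field.
PROCESS_STATUS = {
    "s": "sleeping",
    "R": "ready to run",
    "z": "zombie",
    "d": "stopped by debugger",
    "i": "idle",
    "p": "running on processor",
    "x": "XBREAK (growing or shrinking)",
}

def decode_status(code):
    """Split the S field into (process status, swap status)."""
    status = PROCESS_STATUS.get(code[:1], "unknown")
    # A trailing S marks a swapped process; anything else means in memory.
    swap = "swapped" if code[1:2] == "S" else "in memory"
    return status, swap

print(decode_status("sS"))
print(decode_status("R"))
```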

    If there isn't enough space on the screen to display all of the processes, then some processes will be dropped from the listing, in the following order:

    1) getty, uugetty, sh, csh, ksh;

    2) swapped or zombie processes;

    3) sleeping processes.

    If there still isn't enough space, a message indicating this displays, and processing continues.
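
    The elimination scheme can be sketched as a series of filters applied only while the list is still too long. This is an illustration rather than the u386mon code: the record layout and function names are hypothetical, and it assumes that each step removes every matching process rather than just enough to make the list fit.

```python
SHELLS_AND_GETTYS = {"getty", "uugetty", "sh", "csh", "ksh"}

def is_shell_or_getty(proc):
    return proc["cmd"] in SHELLS_AND_GETTYS

def is_swapped_or_zombie(proc):
    return proc["status"][:1] == "z" or proc["status"][1:2] == "S"

def is_sleeping(proc):
    return proc["status"][:1] == "s"

# Steps 1 to 3 of the scheme, in order.
RULES = (is_shell_or_getty, is_swapped_or_zombie, is_sleeping)

def fit_to_screen(processes, rows):
    """Return (kept, fits): the surviving processes and whether they fit."""
    kept = list(processes)
    for rule in RULES:
        if len(kept) <= rows:
            break  # everything fits, so no further elimination is needed
        kept = [p for p in kept if not rule(p)]
    return kept, len(kept) <= rows

# Made-up process list; "status" is the two-character S field.
procs = [
    {"cmd": "getty", "status": "s"},
    {"cmd": "sh", "status": "s"},
    {"cmd": "cron", "status": "s"},
    {"cmd": "vi", "status": "R"},
    {"cmd": "find", "status": "RS"},
    {"cmd": "defunct", "status": "z"},
]
kept, fits = fit_to_screen(procs, 2)
print([p["cmd"] for p in kept], fits)
```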

    Kernel Tuning Considerations

    Using u386mon to get the information is half the fun; interpreting it and using it is more often the challenge. u386mon can highlight potential configuration problems which may result in hardware adjustments or kernel tuning.

    While adjusting kernel parameters may seem the obvious solution to certain performance problems, there are tradeoffs involved, and, in fact, it is possible to degrade system performance by reconfiguring the kernel. This could occur, for example, if you set tunable parameters to a level that increases the amount of RAM needed for the kernel, thereby decreasing the amount of RAM available for user processes. A decrease in the RAM available for user processes may mean the kernel will be forced to perform more page swapping, which, in turn, will decrease the amount of real time available for user processing.

    Certain kernel tunables, however, may be considered strong candidates for adjustment, and there are mechanisms for evaluating whether or not adjustment is warranted.

    Candidate parameters -- along with methods for evaluating their appropriateness -- are listed here. The parameter names used are from the SCO UNIX System V/386 environment, and may not be the same on your system. For that reason, the definition of the parameter is also included.

    NFILE -- This parameter defines the maximum number of files which may be opened system-wide. When this table is full, an error similar to "file table overflow" will be displayed on the console device.

    To determine the appropriate value for NFILE, you must first find the maximum number of files that will be opened by any single application. To find that value, log in on two terminals and run pstat(C) or crash(ADM) (see Figure 5) on the first terminal. The value displayed on the first terminal when you log in on the second will be your BASE value. Now run each of your applications in turn, subtracting the BASE value from the open file value for the application. When you've found the largest number, use the formula shown in Figure 5 to calculate the value.

    NINODE -- This parameter defines the maximum number of inodes which may be active system-wide. When this table is full, an error similar to "inode table overflow" will display on the console device.

    Use the procedure described for NFILE to find the largest number of active inodes required by any single application, then use the formula shown in Figure 6 to calculate the required value. See Figure 6 also for the format required for pstat and crash.

    NPROC -- This parameter defines the maximum number of processes system-wide. When this table is full, an error message similar to "cannot fork: too many processes" displays on the console (this is not the same as the error that displays "too many processes" at the user's terminal -- see MAXUP below).

    Again, use the procedure described earlier to determine what the size of this table should be. Appropriate commands and the applicable formula are shown in Figure 7.

    NREGION -- There are typically three regions for each process running on the system. The specified value for NREGION should be three or four times the value of NPROC. To determine the current number of defined regions, use the subcommand region to the crash command.

    MAXUP -- This parameter defines the maximum number of processes that each non-root user can create. If an error message similar to "too many processes" displays at the user's terminal, then it is MAXUP which has been reached.
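
    The bookkeeping in the measurement procedure above, and the NREGION rule of thumb, can be sketched as follows. The function names and sample readings are hypothetical. The sizing formulas from Figures 5 through 7 are not reproduced here, so this only finds the largest per-application requirement to feed into them.

```python
def largest_requirement(base, readings):
    """Find the application with the largest table usage above BASE.

    `readings` maps an application name to the table usage reported by
    pstat or crash while that application is running.
    """
    if not readings:
        return None, 0
    per_app = {app: max(0, used - base) for app, used in readings.items()}
    app = max(per_app, key=per_app.get)
    return app, per_app[app]

def nregion_range(nproc):
    """Return the suggested (low, high) NREGION values for a given NPROC."""
    return 3 * nproc, 4 * nproc

# Made-up open-file readings with a BASE value of 40.
print(largest_requirement(40, {"editor": 46, "database": 72, "mailer": 51}))
print(nregion_range(100))
```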

    Other Methods of Reporting on the Kernel Configuration

    Many UNIX systems provide the sysdef command, which will read some kernel and system configuration files and generate a report on the status of the system. Output from this command differs from system to system (see Figure 8 for output from a Motorola 68020-based system and Figure 9 for a sample from an SCO UNIX 3.2.4-based system).

    Tricks, Tips, and Other Suggestions

    The process of tuning the kernel isn't meant to require a degree in the black arts -- or even guru-hood. Simply reading and adjusting parameters as required -- and interacting with the software vendors -- will do the trick in most situations. You should be aware that few operating systems are configured in such a way as to be suitable for all purposes "out of the box." Some tuning will be required for most systems.

    A wealth of books target this topic in general, but if you want more information on system tuning specifically, I suggest that you look at the sar (System Activity Reporter) which is distributed as part of UNIX, or get a copy of System Performance Tuning, published by O'Reilly and Associates.

    About the Author

    Chris Hare is the Operations Manager for i*internet Inc., a Canadian Internet service provider. He has worked in the UNIX environment since 1986, and in 1988 became the first SCO Authorized Instructor in Canada. He is a co-author of the book Inside UNIX, and he is currently focused on networking, security, and Perl.


     


