Tru64 UNIX Collect

Description

The Collect utility is a system monitoring tool that records or displays specific operating system and process data for a set of subsystems. Any set of subsystems, such as File systems, message Queue, ttY, or Header, can be included in or excluded from data collection. Data can be displayed on the terminal or stored in a compressed or uncompressed data file. Data files can be read and manipulated from the command line or through command scripts.

To ensure that the Collect utility delivers reliable statistics, it locks itself into memory using the page locking function plock(), and by default cannot be swapped out by the system. It also raises its priority using the priority function nice(). However, these measures should not have any impact on a system under normal load, and they should have only a minimal impact on a system under extremely high load. If required, page locking can be disabled using the -ol command option and the Collect utility's priority setting can be disabled using the -on command option.

Some Collect operations use kernel data that is only accessible to root. Because good system administration practice avoids lengthy operations as root, Collect is installed with permissions set to 04750. This setting allows group (typically system) members to run Collect with owner setuid permissions. If this is inappropriate in your environment, you may reset the permissions to fit your needs.
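For readers less familiar with octal modes, the following Python sketch decodes 04750 with the standard stat module. It only illustrates what the mode means; the helper name describe_mode is hypothetical and is not part of Collect.

```python
import stat

def describe_mode(mode):
    """Return an ls-style permission string for a regular file with this mode."""
    # S_IFREG marks the mode as belonging to a regular file, so the
    # string starts with "-" as it would in "ls -l" output.
    return stat.filemode(stat.S_IFREG | mode)

print(describe_mode(0o4750))  # -rwsr-x---
```

The "s" in the owner field is the setuid bit, the group may read and execute, and all other users have no access.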

Automatic starting on reboot

You can configure Collect to automatically start when the system is rebooted. This is particularly useful for continuous monitoring. To do this, use the rcmgr command with the set operation to configure the following values in /etc/rc.config_:

cariad >rcmgr set COLLECT_AUTORUN 1

A value of 1 sets Collect to automatically start on reboot. A value of 0 (the default) causes Collect to not start on reboot.

cariad >rcmgr set COLLECT_ARGS ""

A null value causes Collect to start with the default values (command options) of:

-i60,120 -f /var/adm/collect.dated/collect -H d0:5,1w

You may select other values.

cariad >rcmgr set COLLECT_COMPRESSION 1

A value of 1 sets compression on. A value of 0 sets compression off.

See the rcmgr(8) reference page for more information.

Playing back multiple data files

The Collect utility can read multiple binary data files using the -p option and play them back as one stream, with monotonically increasing sample numbers. It is also possible to combine multiple binary input files into one binary output file, by using the -p option with the input files and the -f option with the output file. Note that the Collect utility will combine input files in whatever order you specify on the command line. This means that the input files must be in strict chronological order if you want to do further processing of the combined output file. You can also combine binary input files from different systems, made at different times, with differing subsets of subsystems for which data has been collected. Filtering options such as -e, -s, -P, and -D can be used with this function.
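The effect of file order can be pictured with a small Python model. This is a conceptual sketch only, not Collect's implementation: the sample layout (timestamp, data), the function names, and numbering from 1 are all assumptions made for illustration.

```python
from itertools import chain

def combine(streams):
    """Concatenate sample streams in the given order and renumber them from 1."""
    merged = chain.from_iterable(streams)
    return [(number, ts, data) for number, (ts, data) in enumerate(merged, start=1)]

def is_chronological(samples):
    """Return True if timestamps never decrease from one sample to the next."""
    times = [ts for _, ts, _ in samples]
    return all(a <= b for a, b in zip(times, times[1:]))

# Two hypothetical input files, each a list of (timestamp, data) samples.
morning = [(100, "a"), (160, "b")]
evening = [(220, "c")]

print(is_chronological(combine([morning, evening])))  # True
print(is_chronological(combine([evening, morning])))  # False
```

In both cases the sample numbers increase monotonically, but only the first ordering yields a combined stream whose timestamps are suitable for further processing.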

Normalization of data

Where appropriate, data is presented in units per second. For example, disk data such as kilobytes transferred, or the number of transfers, is always normalized for one second. This happens no matter what time interval is chosen. The same is true for the following data items:

  • CPU interrupts, system calls and context switches
  • memory pages out, pages in, pages zeroed, pages reactivated, and pages copied on write
  • network packets in, packets out, and collisions
  • process user and system time consumed

Other data is recorded as a snapshot value. Examples of this are: free memory pages, CPU states, disk queue lengths, and process memory.
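As a rough illustration of per-second normalization, the Python sketch below divides the change in a cumulative counter by the length of the collection interval. It assumes the underlying value is a running counter; the function name and the sample figures are hypothetical and do not describe Collect's internals.

```python
def per_second(previous, current, interval_seconds):
    """Convert the change in a cumulative counter into a per-second rate."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (current - previous) / interval_seconds

# The same transfer rate reported at two different collection intervals.
print(per_second(0, 6000, 60))  # 100.0 (KB per second over a 60-second interval)
print(per_second(0, 1000, 10))  # 100.0 (KB per second over a 10-second interval)
```

Because the result is independent of the interval, rates from collections made with different -i settings can be compared directly.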

Data collection interval

A collection interval can be specified using the -i option followed by an integer, optionally followed (without spaces) by a comma or colon and another integer. If the optional second integer is given, it is a separate time interval that applies only to the process subsystem. The process interval must be a multiple of the regular interval. Collecting process information is more taxing on system resources than collecting data for the other subsystems, and it is not generally needed at the same frequency. Process data also takes up the most space in the binary data file. Generally, specifying a process interval greater than 1 will significantly decrease the load the collector places on the system being monitored.
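The syntax rules above can be checked with a short Python sketch. It is a validation aid written from the description in this section, not Collect's own parser; the function name parse_interval is hypothetical, and returning None when no process interval is given is an assumption, since the default is not stated here.

```python
import re

def parse_interval(spec):
    """Parse an -i argument such as "60,120" into (regular, process) intervals."""
    match = re.fullmatch(r"(\d+)(?:[,:](\d+))?", spec)
    if match is None:
        raise ValueError(f"malformed interval: {spec!r}")
    regular = int(match.group(1))
    if regular == 0:
        raise ValueError("regular interval must be greater than zero")
    # The second integer is optional and applies only to the process subsystem.
    process = int(match.group(2)) if match.group(2) else None
    if process is not None and process % regular != 0:
        raise ValueError("process interval must be a multiple of the regular interval")
    return regular, process

print(parse_interval("60,120"))  # (60, 120)
print(parse_interval("10:30"))   # (10, 30)
print(parse_interval("5"))       # (5, None)
```

A value such as "10,25" is rejected because 25 is not a multiple of 10.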

Specifying what data to collect

Use the -S (sort) and -nX (number) options to sort data by percentage of CPU usage and to save only X processes. Target specific processes using the -Plist option, where list is a list of process identifiers, comma-separated without blanks.
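Conceptually, combining -S with -nX keeps the X processes with the highest CPU usage. The Python sketch below models that selection on a list of dictionaries; the field names pid and cpu_percent and the sample values are hypothetical and are not Collect data fields.

```python
def top_by_cpu(processes, limit):
    """Return the `limit` processes with the highest CPU percentage, highest first."""
    ranked = sorted(processes, key=lambda proc: proc["cpu_percent"], reverse=True)
    return ranked[:limit]

# Hypothetical process samples for illustration.
sample = [
    {"pid": 101, "cpu_percent": 2.5},
    {"pid": 202, "cpu_percent": 40.0},
    {"pid": 303, "cpu_percent": 12.0},
]

print([proc["pid"] for proc in top_by_cpu(sample, 2)])  # [202, 303]
```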

If there are many (greater than 100) disks connected to the system being monitored, use the -D option to monitor a particular set of disks.

Data compression

The Collect utility reads and writes compressed datafiles in gzip (GNU zip) format. Compressed output is enabled by default but can be disabled using the -oz command option. The extension .cgz is appended to the output filename, unless the -oz option is specified. Older, uncompressed datafiles can be compressed using gzip, and the resulting files can be read by Collect in their compressed form.
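Where the gzip command is not convenient, an older uncompressed datafile can be compressed with any tool that writes gzip format. The following Python sketch uses the standard gzip module; the function name and the file paths passed to it are hypothetical, and it has not been verified against Collect itself.

```python
import gzip
import shutil

def gzip_file(source_path, target_path):
    """Write a gzip-compressed copy of source_path to target_path."""
    with open(source_path, "rb") as source, gzip.open(target_path, "wb") as target:
        # Stream in chunks so large datafiles are not loaded into memory at once.
        shutil.copyfileobj(source, target)
```

The original file is left untouched, so the compressed copy can be checked before the uncompressed one is removed.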

Compression during collection should not generate any additional CPU load. Because compression uses buffers and therefore does not write to disk after every sample, it makes fewer system calls and its overall impact is negligible. However, because the output is buffered there is one possible drawback. If Collect terminates abnormally (perhaps due to a system crash) more data samples will be lost than if compression is not used. This should not be an important consideration for most users, as you can specify how often data should be written to disk.

Specifying a time range from a playback file

You can select samples from the total period of the time that data collection ran. Use the -C option to specify a start time, and optionally, an end time. The format is as follows:

[+]Year:Month:Day:Hour:Minute:Second.

The plus sign (+) indicates that the time should be interpreted as relative to the beginning of the collection period. If any fields are omitted from the string, the corresponding values from the start time are used in their place, because the time value is parsed from right to left. Thus, the rightmost field is interpreted as Second, the next field to its left (if there is one) as Minute, and so on. For example, if the collection period is from October 21, 1999, 16:44:03 to October 21, 1999, 16:54:55, all but minutes and seconds can be omitted from the command option: -C46:00,47:00 (from 16:46:00 to 16:47:00). However, if the collection ran overnight, it is necessary to specify the day as well. For example, if the period were Oct 21 16:44 to Oct 22 9:30, to specify a period from 23:00 to 1:00, you must enter the following:

# -C 21:23:00:00,22:1:00:00
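The right-to-left rule can be modeled in a few lines of Python. This sketch is written from the description above and is not Collect's parser: the function names are hypothetical, it assumes a four-digit year when one is given, and it does not handle the relative (+) form.

```python
from datetime import datetime

# Field names in the order they are filled when reading right to left.
FIELDS = ("second", "minute", "hour", "day", "month", "year")

def resolve_time(spec, start):
    """Fill a partial Year:Month:Day:Hour:Minute:Second string from the start time."""
    if spec.startswith("+"):
        raise ValueError("relative (+) times are not handled in this sketch")
    parts = spec.split(":")
    if not 1 <= len(parts) <= len(FIELDS):
        raise ValueError(f"malformed time: {spec!r}")
    values = {name: int(text) for name, text in zip(FIELDS, reversed(parts))}
    # Fields that were not supplied keep the values from the collection start time.
    return start.replace(microsecond=0, **values)

def resolve_range(option, start):
    """Split a -C argument into (begin, end); end is None when it is omitted."""
    begin, _, end = option.partition(",")
    return resolve_time(begin, start), (resolve_time(end, start) if end else None)

start = datetime(1999, 10, 21, 16, 44, 3)
print(resolve_range("46:00,47:00", start))
print(resolve_range("21:23:00:00,22:1:00:00", start))
```

With the start time from the example above, the first call resolves to 16:46:00 through 16:47:00 on October 21, and the second to October 21 23:00:00 through October 22 01:00:00.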

General command options

The following command options are useful:

If you want simultaneous text (ASCII) output to the screen while collecting to a file, use the -a option.

The -t option prefixes each data line with a unique tag. This makes it easier for your scripts to find and extract data. Tags are superfluous if you use the Perl script cfilt.

The -T option shuts off collection for all subsystems except disk, and only displays a total MB/sec across all disks in the system. Use the -s option with the -T option to override this behavior and collect data for other subsystems.

The -R option causes Collect to terminate after a specified amount of time.

All flags that can reasonably be applied to both collection and play-back will work. The -Plist filter option used during collection will collect data only for the processes you specify. During playback it will only display data for the corresponding processes. To save space in the binary data file, you can limit your collection to specific processes, specific disks, or specific subsystems. However, if you want to look at volumes of data and select different chunks at a time, you should collect everything and later use the filter options to select data items during playback.

Disk statistics

Note that under certain circumstances the disk statistics may be only approximate. Provided that you use the latest Collect versions and operating system patches, data is presented for all statistics except %BSY, which is zero. In this release, ACTQ and WTQ are accurate. In older releases of Collect, some data fields were zero and data in some fields could be inaccurate under certain circumstances.

Data conversion and filtering

In this release, Collect automatically reads older datafile versions when playing back files.

You can convert a datafile from an older Collect version to the current version by using the -p collect_datafile option together with the -f file option. During conversion you can use most command options to extract specific data from the input collect_datafile. For example:

  • Use the -s and -e options to select data only from particular subsystems.
  • Use the -nX and -S options to take only X processes and sort them by CPU usage.
  • Use the -D option to select disks and the -L option to select LSM volumes.
  • Use the -P, -PC, -PU, -PP options to select processes based on their identifiers.
  • Use the -C option to extract data according to specified start and stop times.
