msgbuf.err /var/adm/crash/msgbuf.savecore
You can modify the system default for the location of dump files by using the rcmgr command to specify another directory path for the /sbin/savecore utility:
# /usr/sbin/rcmgr set SAVECORE_DIR </newpath>
The /sbin/init.d/savecore script invokes the /sbin/savecore utility.
Crash Dump Files
Crash dump files are either partial (the default) or full. The following sections describe each type of file and explain how to allocate the proper amount of space in the crash dump partition and file system.
Partial Crash Dump Files
Unlike full crash dumps, the size of a partial crash dump file is proportional to the amount of system activity at the time of the crash: the higher the level of system activity and the larger the amount of memory in use at the time of a crash, the larger the partial crash dump files will be. For example, when a system with 96 megabytes (MB) of memory crashes, it creates a vmcore.n file of 10 to 96 MB (depending on system activity) and a vmunix.n file of approximately 6 MB.
If you compress a core dump file from a partial crash dump, you must use care in decompressing it. Using the uncompress command with no options results in a core file equal to the size of memory. To ensure that the decompressed core file remains at its partial dump size, you need to use the uncompress command with the -c option and the dd command with the conv=sparse option. For example, to decompress a core file named vmunix.0.Z, enter the following command:
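The command pattern described above (uncompress -c piped into dd with conv=sparse) can be sketched as a small shell function. This is an illustration only, not the command from the original text: the function name `expand_sparse` is hypothetical, the file names are placeholders, and it assumes that the local dd supports conv=sparse as the paragraph above describes.

```shell
#!/bin/sh
# Hypothetical helper: expand a compressed partial dump while keeping
# the result sparse. Arguments: compressed file, output file.
expand_sparse() {
    # uncompress -c writes the expanded data to standard output;
    # dd with conv=sparse is assumed to skip writing blocks of zeros,
    # so the output file stays near its partial dump size.
    uncompress -c "$1" | dd of="$2" conv=sparse
}

# Example (file names are placeholders):
#   expand_sparse vmcore.0.Z vmcore.0
```

Passing the file names as arguments keeps the sketch independent of any particular dump number.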
Full Crash Dump Files
Full crash dump files can be very large because the vmunix.n file is a copy of the running kernel and the size of the vmcore.n file is slightly larger than the amount of physical memory on the system that crashed. For example, when a system with 96 MB of memory crashes, it creates a vmcore.n file with approximately 96 MB of memory and a vmunix.n file with approximately six MB of memory.
Selecting a Crash Dump Type
The default is to use partial crash dumps. If you want to use full dumps, you can modify the default behavior in the following ways:
(ladebug) a partial_dump = 0
Determining Crash Dump Partition Size
If you intend to save full crash dumps, you need to reserve disk space equal to the size of memory, plus one additional block for the dump header. For example, if your system has 128 MB of memory, you need a crash dump partition of at least 128 MB, plus one block (512 bytes).
If you intend to save partial crash dumps, the size of the disk partition may vary, depending upon system activity. For example, for a system with 128 MB of memory, if peak system activity is low (never using more than 60 MB of memory), the size of the crash dump partition can be 60 MB. If peak system activity is high (using all of memory), 128 MB of disk space is needed.
If full dumps are turned on and there is not enough disk space to create dump files for a full dump, partial dumps are automatically invoked.
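The full-dump sizing rule above is simple arithmetic, sketched here as a shell function. This is an illustration only: the function name `full_dump_partition_bytes` is hypothetical, and it assumes 1 MB is 1,048,576 bytes and one block is 512 bytes, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: minimum crash dump partition size, in bytes,
# for a full dump. Argument: physical memory in MB.
full_dump_partition_bytes() {
    mem_mb=$1
    # Size of memory plus one 512-byte block for the dump header.
    echo $(( mem_mb * 1024 * 1024 + 512 ))
}

full_dump_partition_bytes 128    # prints 134218240
```

For partial dumps, the text sizes the partition from peak memory use rather than total memory, so the estimate depends on how the system is actually used.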
Determining File System Space for Saving Crash Dumps
The size of the file system needed for saving crash dumps depends on the size and the number of crash dumps you want to retain. A general guideline is to reserve, at a minimum, the size of your crash dump partition, plus 10 MB. If necessary, you can increase this amount later.
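The guideline above can be sketched as a pair of shell functions. Both function names are hypothetical and the sketch covers only the stated minimum (partition size plus 10 MB); it does not account for retaining several dumps.

```shell
#!/bin/sh
# Hypothetical helper: minimum file system space, in MB, following the
# guideline "size of the crash dump partition, plus 10 MB".
min_crash_fs_mb() {
    partition_mb=$1
    echo $(( partition_mb + 10 ))
}

# Hypothetical check: succeeds if avail_kb (for example, the available
# space reported by df -k) covers the guideline for the partition size.
crash_fs_has_room() {
    avail_kb=$1
    partition_mb=$2
    [ "$avail_kb" -ge $(( $(min_crash_fs_mb "$partition_mb") * 1024 )) ]
}

min_crash_fs_mb 128    # prints 138
```

A check of this kind could be run before a reboot to see whether the crash dump directory is likely to hold the next dump.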
If your system cannot save a crash dump due to insufficient disk space, it returns to single-user mode. This return to single-user mode prevents system swapping from corrupting the dump file. Space can then be made available in the crash dump directory, or the changed directory, before continuing to multiuser mode. You can override this behavior using the following command:
# /usr/sbin/rcmgr set SAVECORE_FLAGS M
This command causes the system to always boot to multiuser mode even if it cannot save a dump.
Procedures for Creating Dumps of a Hung System
If necessary, you can force the system to create dump files when the system hangs. The method for forcing crash dumps varies according to the hardware platform. The methods are described in the DEC OSF/1 Kernel Debugging manual.
Guidelines for Examining Crash Dump Files
In examining crash dump files, there is no one way to determine the cause of a system crash. However, the following guidelines should assist you in identifying the events that led to the crash:
For more information, and for examples, see the DEC OSF/1 Kernel Debugging manual. This manual contains detailed information on the following topics related to crash dump analysis:
Crash dump analysis is possible only with local, not remote, kernel debugging.
For remote kernel debugging, Ladebug is used in conjunction with the kdebug debugger [1], which is a tool for executing, testing, and debugging test kernels. The kdebug code runs inside the kernel to be debugged on a test system, while Ladebug runs on a remote system and communicates with kdebug over a serial line or a gateway system.
You use Ladebug commands to start and stop kernel execution, examine variable and register values, and perform other debugging tasks, just as you would when debugging user space programs. The kdebug debugger, not Ladebug, performs the actual reads and writes to registers, memory, and the image itself (for example, when breakpoints are set).
The kernel code to be debugged runs on a test system. Ladebug runs on a remote build system and communicates with the kernel code over a serial communication line or through a gateway system.
You use a gateway system when you cannot physically connect the test and build systems. The build system is connected to the gateway system by a network line. The gateway system is connected to the test system by a serial communication line.
The following diagram shows the physical connection of the test and build systems (with no gateway):
Build system          Serial line          Test system
(with Ladebug)  <--------------------->  (kernel code here)
The following diagram shows the connections when you use a gateway system:
Build system        Network      Gateway        Serial line          Test system
(with Ladebug)  <----------->  system   <--------------------->  (kernel code here)
                               with
                               kdebug
                               daemon
The test, build, and (if used) gateway systems must meet the following requirements for kdebug:
Getting Ready to Use the kdebug Debugger
To use the kdebug debugger, first do the following:
Debugger variables:
On the build system, add the following lines to your .dbxinit file if you need to override the default values (and you choose not to use the corresponding options, described below). Alternatively, you can use these lines within the debugger session, at the (ladebug) prompt:
set $kdebug_host="gateway_system"
set $kdebug_line="serial_line"
set $kdebug_dbgtty="tty"
Options:
Instead of using debugger variables, you can specify any of the following options on the ladebug command line:
# sysconfig -r lockmode = 4
Invoking the Debugger
When the setup is complete, start up the debugger as follows:
# ladebug -remote /usr/test/vmunix
. . .
(ladebug) stop in hard_clock
[2] stop in hard_clock
(ladebug) run
Pressing Ctrl/C causes the remote debugger to exit, not interrupt as it does during local debugging.
>>> set boot_osflags k
>>> boot

>>> boot -A k
Once you boot the kernel, it begins executing. The Ladebug debugger halts execution at the breakpoint you specified, and you can begin issuing Ladebug debugging commands. All Ladebug commands are available, except kps, attach, and detach. (See Part 5, Command Reference, for information on Ladebug debugging commands.)
Breakpoint Behavior on SMP Systems
If you set breakpoints in code that is executed on an SMP system, the breakpoints are handled serially. When a breakpoint is encountered on a particular CPU, the state of all the other processors in the system is saved and those processors spin, in much the same way that execution stops when a simple lock is obtained on a particular CPU.
When the breakpoint is dismissed (for example, because you entered a step or cont command to the debugger), processing resumes on all processors.
If you have completed the kdebug setup and it fails to work, refer to the following list for help:
# tip kdebug

# ps agxt00

# ps agx | grep kdebugd

>>> set boot_osflags k
>>> boot

# /usr/bin/tty
/dev/ttyp2

set $kdebug_dbgtty="/dev/ttyp2"

set $kdebug_host="decosf"

set $kdebug_line=""
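One of the commands above pipes ps output through grep to look for kdebugd. A common pitfall with that pattern is that grep can match its own entry in the process list. The following sketch shows one way to avoid that; the function name `kdebugd_listed` is hypothetical, and it reads a process listing from standard input so that it does not depend on any particular ps syntax.

```shell
#!/bin/sh
# Hypothetical helper: reads a process listing on standard input and
# succeeds if any line mentions kdebugd, ignoring grep's own entry.
kdebugd_listed() {
    grep -v 'grep' | grep -q 'kdebugd'
}

# Example:
#   ps agx | kdebugd_listed && echo "kdebugd appears in the process list"
```

Because the function only filters text, a successful result means that kdebugd appears in the listing, not that the daemon is working correctly.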
[1] Used alone, kdebug has its own syntax and commands, and allows local nonsymbolic debugging of a running kernel across a serial line. See the kdebug(8) manpage for information about kdebug local kernel debugging.