Tru64 UNIX
Ladebug Debugger Manual




msgbuf.err                        /var/adm/crash/msgbuf.savecore 

If the msgbuf.err entry is not specified in the /etc/syslog.conf file, the msgbuf dump file is not saved. The msgbuf dump file cannot be forwarded to any system.
When the syslogd daemon is later initialized, it checks for the msgbuf dump file. If a msgbuf dump file is found, syslogd processes the file and then deletes it.

  • Creates the file binlogdumpfile.n in the /var/adm/crash directory. The variable n is determined by the value of the bounds file.

    You can modify the system default for the location of dump files by using the rcmgr command to specify another directory path for the /sbin/savecore utility:


    # /usr/sbin/rcmgr set SAVECORE_DIR </newpath> 
    

    The /sbin/init.d/savecore script invokes the /sbin/savecore utility.

    Crash Dump Files

    Crash dump files are either partial (the default) or full. The following sections describe each type of file and explain how to allocate the proper amount of space in the crash dump partition and file system.

    Partial Crash Dump Files

    Unlike that of a full crash dump, the size of a partial crash dump file is proportional to the amount of system activity at the time of the crash: the higher the level of system activity and the larger the amount of memory in use, the larger the partial crash dump files will be. For example, when a system with 96 megabytes (MB) of memory crashes, it creates a vmcore.n file of 10 to 96 MB (depending upon system activity) and a vmunix.n file of approximately 6 MB.

    Note

    If you compress a core dump file from a partial crash dump, you must use care in decompressing it. Using the uncompress command with no options results in a core file equal to the size of memory. To ensure that the decompressed core file remains at its partial dump size, use the uncompress command with the -c option and the dd command with the conv=sparse option. For example, to decompress a core file named vmcore.0.Z, enter the following command:


    # uncompress -c vmcore.0.Z | dd of=vmcore.0 conv=sparse 
    262144+0 records in 
    262144+0 records out 
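
One way to confirm that the decompressed file stayed sparse is to compare its apparent size with the disk space actually allocated to it. The following is a minimal sketch, not part of the original procedure: the function name sparse_report is hypothetical, and it assumes that the fifth field of ls -l output is the file size in bytes and that du -k reports allocated space in kilobytes.

```shell
# Report the apparent size (bytes) and the allocated space (KB) of a file.
# A sparse file shows an allocated size well below its apparent size.
# sparse_report is a hypothetical helper name used only for illustration.
sparse_report() {
    file=$1
    # Field 5 of "ls -l" is assumed to be the size in bytes.
    apparent=$(ls -l "$file" | awk '{print $5}')
    # "du -k" reports the space actually allocated, in kilobytes.
    allocated_kb=$(du -k "$file" | awk '{print $1}')
    echo "$file: apparent=$apparent bytes, allocated=$allocated_kb KB"
}
```

For example, running sparse_report vmcore.0 after the decompression step should show an allocated size close to the original partial dump size rather than the size of memory.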
    

    Full Crash Dump Files

    Full crash dump files can be very large because the vmunix.n file is a copy of the running kernel and the size of the vmcore.n file is slightly larger than the amount of physical memory on the system that crashed. For example, when a system with 96 MB of memory crashes, it creates a vmcore.n file of approximately 96 MB and a vmunix.n file of approximately 6 MB.

    Selecting a Crash Dump Type

    The default is to use partial crash dumps. If you want to use full dumps, you can modify the default behavior in the following ways:

    Determining Crash Dump Partition Size

    If you intend to save full crash dumps, you need to reserve disk space equal to the size of memory, plus one additional block for the dump header. For example, if your system has 128 MB of memory, you need a crash dump partition of at least 128 MB, plus one block (512 bytes).

    If you intend to save partial crash dumps, the size of the disk partition may vary, depending upon system activity. For example, for a system with 128 MB of memory, if peak system activity is low (never using more than 60 MB of memory), the size of the crash dump partition can be 60 MB. If peak system activity is high (using all of memory), 128 MB of disk space is needed.

    If full dumps are turned on and there is not enough disk space to create dump files for a full dump, partial dumps are automatically invoked.
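
The sizing rules above can be sketched as simple shell arithmetic. This is an illustration only: the function names are hypothetical, and it assumes that 1 MB means 1048576 bytes.

```shell
MB=1048576   # bytes per MB (assumption: binary megabytes)
BLOCK=512    # one block for the dump header, as stated in the text

# Bytes to reserve for a full crash dump: memory size plus one header block.
# $1 = physical memory in MB
full_dump_partition_bytes() {
    echo $(( $1 * MB + BLOCK ))
}

# MB to reserve for a partial crash dump: peak memory use, capped at
# the amount of physical memory.
# $1 = peak memory use in MB, $2 = physical memory in MB
partial_dump_partition_mb() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

full_dump_partition_bytes 128      # prints 134218240
partial_dump_partition_mb 60 128   # prints 60
```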

    Determining File System Space for Saving Crash Dumps

    The size of the file system needed for saving crash dumps depends on the size and the number of crash dumps you want to retain. A general guideline is to reserve, at a minimum, the size of your crash dump partition, plus 10 MB. If necessary, you can increase this amount later.
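
As a quick worked example of this guideline, the sketch below adds the 10 MB margin to a crash dump partition size given in MB. The function name is hypothetical, and the result is only the stated minimum; retaining several dumps requires more space.

```shell
# Minimum file system space (MB) for saving crash dumps:
# the size of the crash dump partition plus 10 MB.
# $1 = crash dump partition size in MB
savecore_fs_min_mb() {
    echo $(( $1 + 10 ))
}

savecore_fs_min_mb 128   # prints 138
```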

    If your system cannot save a crash dump because of insufficient disk space, it returns to single-user mode. Returning to single-user mode prevents system swapping from corrupting the dump file. You can then free space in the crash dump directory, or in the directory you specified with SAVECORE_DIR, before continuing to multiuser mode. You can override this behavior using the following command:


    # /usr/sbin/rcmgr set SAVECORE_FLAGS M 
    

    This command causes the system to always boot to multiuser mode even if it cannot save a dump.
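
To review the savecore settings currently in effect, you can read the variables back. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and it assumes that the rcmgr command accepts a get operation in addition to the set operation shown in this section. The guard lets the function fail cleanly on a system where /usr/sbin/rcmgr is not present.

```shell
# Print the current values of the savecore-related variables.
# show_savecore_settings is a hypothetical helper name.
show_savecore_settings() {
    rcmgr=/usr/sbin/rcmgr
    if [ ! -x "$rcmgr" ]; then
        echo "rcmgr not found at $rcmgr"
        return 1
    fi
    for var in SAVECORE_DIR SAVECORE_FLAGS; do
        # Assumption: "rcmgr get <variable>" prints the variable's value.
        echo "$var=$("$rcmgr" get "$var")"
    done
}
```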

    Procedures for Creating Dumps of a Hung System

    If necessary, you can force the system to create dump files when the system hangs. The method for forcing crash dumps varies according to the hardware platform. The methods are described in the DEC OSF/1 Kernel Debugging manual.

    Guidelines for Examining Crash Dump Files

    In examining crash dump files, there is no one way to determine the cause of a system crash. However, the following guidelines should assist you in identifying the events that led to the crash:

    For more information, and for examples, see the DEC OSF/1 Kernel Debugging manual. This manual contains detailed information on the following topics related to crash dump analysis:

    Note

    Crash dump analysis is possible only with local, not remote, kernel debugging.

    17.3 Remote Kernel Debugging with the kdebug Debugger

    For remote kernel debugging, Ladebug is used in conjunction with the kdebug debugger (see footnote 1), which is a tool for executing, testing, and debugging test kernels. The kdebug code runs inside the kernel to be debugged on a test system, while Ladebug runs on a remote system and communicates with kdebug over a serial line or a gateway system.

    You use Ladebug commands to start and stop kernel execution, examine variable and register values, and perform other debugging tasks, just as you would when debugging user space programs. The kdebug debugger, not Ladebug, performs the actual reads and writes to registers, memory, and the image itself (for example, when breakpoints are set).

    Connections Needed

    The kernel code to be debugged runs on a test system. Ladebug runs on a remote build system and communicates with the kernel code over a serial communication line or through a gateway system.

    You use a gateway system when you cannot physically connect the test and build systems. The build system is connected to the gateway system by a network line. The gateway system is connected to the test system by a serial communication line.

    The following diagram shows the physical connection of the test and build systems (with no gateway):


      Build system          Serial line         Test system 
    (with Ladebug) <---------------------> (kernel code here) 
    

    The following diagram shows the connections when you use a gateway system:


      Build system       Network    Gateway      Serial line         Test system 
    (with Ladebug) <-----------> system <---------------------> (kernel code here) 
                                    with 
                                    kdebug 
                                    daemon 
    

    System Requirements

    The test, build, and (if used) gateway systems must meet the following requirements for kdebug:

    Getting Ready to Use the kdebug Debugger

    To use the kdebug debugger, first do the following:

    1. Connect the test system to the build system or, if you are using a gateway system, to the gateway system. See your hardware documentation for information about connecting systems to serial lines and networks.
    2. Configure the kernel to be debugged with the configuration file option OPTIONS KDEBUG. If you are debugging the installed kernel, you can do this by selecting KERNEL BREAKPOINT DEBUGGING from the kernel options menu.
    3. Recompile kernel files, if necessary. By default, the kernel is compiled with only partial debugging information, occasionally causing Ladebug to display erroneous arguments or mismatched source lines. To correct this, recompile selected source files specifying the CDEBUGOPTS=-g argument.
    4. Copy the kernel to be tested to /vmunix on the test system. Retain an exact copy of this image on the build system.
    5. Install the Product Authorization Key (PAK) for the Developer's kit (OSF-DEV), if it is not already installed. For information about installing PAKs, see the Installation Guide.
    6. Determine the debugger variable settings or command-line options you will use, as follows:

      Debugger variables:

      On the build system, add the following lines to your .dbxinit file if you need to override the default values (and you choose not to use the corresponding options, described below). Alternatively, you can use these lines within the debugger session, at the (ladebug) prompt:


                set $kdebug_host="gateway_system" 
                set $kdebug_line="serial_line" 
                set $kdebug_dbgtty="tty" 
      

      $kdebug_host specifies the node or address of the gateway system. By default, $kdebug_host is set to localhost, for when a gateway system is not used.
      $kdebug_line specifies the serial line to use as defined in the /etc/remote file of the build system (or the gateway system, if one is being used). By default, $kdebug_line is set to kdebug.
      $kdebug_dbgtty sets the terminal on the gateway system to display the communication between the build and test systems, which is useful in debugging your setup. To determine the terminal name to supply to the $kdebug_dbgtty variable, enter the tty command in the desired window on the gateway system. By default, $kdebug_dbgtty is null.

      Options:

      Instead of using debugger variables, you can specify any of the following options on the ladebug command line:

      • The -rn option specifies the node or address of the gateway system, and can be used instead of $kdebug_host.
      • The -line option specifies the serial line, and can be used instead of $kdebug_line.
      • The -tty option specifies the terminal name, and can be used instead of $kdebug_dbgtty.

      The above three options require the -remote option or its alternative, the -rp kdebug option.
      The variables you set in your .dbxinit file will override any options you use on the ladebug command line. In your debugging session, you can still override the .dbxinit variable settings by using the set command at the (ladebug) prompt, prior to issuing the run command.
    7. If you are debugging on an SMP system, set the lockmode system attribute to four, as shown:


      #  sysconfig -r generic lockmode=4
      

      Setting this system attribute makes debugging on an SMP system easier.
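
The command-line options described in step 6 can be combined on one ladebug command line. The sketch below only builds and prints such a command line so that you can review it before running it. The function name and the sample values (gw1, /dev/ttyp2) are hypothetical, and placing the options before the kernel pathname is an assumption based on the invocation form ladebug -remote /usr/test/vmunix. Remember that variables set in your .dbxinit file override these options.

```shell
# Build (but do not run) a ladebug command line for remote kernel debugging.
# $1 = gateway host (-rn), $2 = serial line (-line), $3 = debug tty (-tty),
# $4 = pathname of the kernel image on the build system.
# Empty arguments are skipped so that the debugger defaults apply.
kdebug_cmdline() {
    host=$1
    line=$2
    tty=$3
    kernel=$4
    cmd="ladebug -remote"
    if [ -n "$host" ]; then cmd="$cmd -rn $host"; fi
    if [ -n "$line" ]; then cmd="$cmd -line $line"; fi
    if [ -n "$tty" ]; then cmd="$cmd -tty $tty"; fi
    echo "$cmd $kernel"
}

# Hypothetical values; prints:
# ladebug -remote -rn gw1 -line kdebug -tty /dev/ttyp2 /usr/test/vmunix
kdebug_cmdline gw1 kdebug /dev/ttyp2 /usr/test/vmunix
```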

    Invoking the Debugger

    When the setup is complete, start up the debugger as follows:

    1. Invoke the Ladebug debugger on the build system, supplying the pathname of the copy of the test kernel that resides on the build system. Set a breakpoint and start running Ladebug as follows (assuming that vmunix resides in the /usr/test directory):


      # ladebug -remote /usr/test/vmunix
      


         .
         .
         .
      


      (ladebug)  stop in hard_clock
      [2] stop in hard_clock
      (ladebug)  run
      

      Because Ctrl/C cannot be used as an interrupt, you should set at least one breakpoint if you wish the debugger to gain control of kernel execution. You can set a breakpoint anytime after the execution of the kdebug_bootstrap() routine. Setting a breakpoint prior to the execution of this routine can result in unpredictable behavior.

      Note

      Pressing Ctrl/C causes the remote debugger to exit, not interrupt as it does during local debugging.
    2. Halt the test system and, at the console prompt, set the boot_osflags console variable to contain the k option, and then boot the system. For example:


      >>>  set boot_osflags k
      >>>  boot
      

      Alternatively, you can enter:


      >>>  boot -A k
      

    Once you boot the kernel, it begins executing. The Ladebug debugger halts execution at the breakpoint you specified, and you can begin issuing Ladebug debugging commands. All Ladebug commands are available, except kps, attach, and detach. (See Part 5, Command Reference, for information on Ladebug debugging commands.)

    Breakpoint Behavior on SMP Systems

    If you set breakpoints in code that is executed on an SMP system, the breakpoints are handled serially. When a breakpoint is encountered on a particular CPU, the state of all the other processors in the system is saved and those processors spin, much as execution stops when a simple lock is obtained on a particular CPU.

    When the breakpoint is dismissed (for example, because you entered a step or cont command to the debugger), processing resumes on all processors.

    Troubleshooting Tips

    If you have completed the kdebug setup and it fails to work, refer to the following list for help:

    Note

    1 Used alone, kdebug has its own syntax and commands, and allows local nonsymbolic debugging of a running kernel across a serial line. See the kdebug(8) manpage for information about kdebug local kernel debugging.

