Summary of methods for handling Linux system crashes

  
After a system crash, the usual worry is that the failure will happen again. Often, though, it turns out that the system recorded nothing before or after the crash, so the cause cannot be analyzed and the situation looks hopeless. In fact, Linux has several mechanisms that preserve valuable information after a crash, so that you can analyze the problem and determine whether it was a hardware failure or an application bug. The main ways to collect crash information on Linux are described below.

1.Core dump
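As a starting point for this section, it helps to know whether core dumps are currently enabled in your shell. The sketch below is a minimal check, assuming a shell whose ulimit builtin accepts -c (bash and dash do). core_limit_status is a hypothetical helper name, not part of any tool. A limit of 0 means that no core file is written.

```shell
#!/bin/sh
# core_limit_status is a hypothetical helper: it classifies the value
# printed by `ulimit -c` (a block count, or the word "unlimited").
core_limit_status() {
    case "$1" in
        0)         echo "disabled" ;;
        unlimited) echo "unlimited" ;;
        *)         echo "limited" ;;
    esac
}

# Query the limit of the current shell and report it.
current=$(ulimit -c)
echo "core file size limit: $current ($(core_limit_status "$current"))"
```

If the status is reported as disabled, a crashing program started from this shell leaves no core file behind.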
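A core dump is usually used to debug application errors. When an application behaves abnormally, you can turn on the system's core dump function to capture the program's memory at the moment it crashes, and then analyze the cause of the crash.

Add (or modify) the following line in /etc/profile:

    ulimit -c unlimited

A limit of 0 disables core dumps, so the value must be nonzero (or unlimited) for a core file to be written.

Then run the command:

    sysctl -w "kernel.core_name_format=/coredump/%n.core"

This places core files in the /coredump directory, with the process name followed by .core as the file name. (On current mainline kernels the corresponding setting is kernel.core_pattern, where %e expands to the executable name.)

2.Diskdump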
The diskdump tool creates and collects a vmcore (kernel dump) on a single machine, without using the network. When the kernel itself crashes, the current memory contents, CPU state and related information are saved to a reserved partition on a disk that supports diskdump. At the next boot, the diskdump initialization script reads the saved data from the reserved partition and creates a vmcore file, which is stored under /var/crash/ in a directory whose name begins with 127.0.0.1-.

The following procedure enables diskdump on an HP SCSI device. If the device is not an HP SCSI device (that is, its name has the form /dev/sdX), steps 3 and 4 are unnecessary, but you must run this command before step 1:

    modprobe diskdump

Step 1: Edit the /etc/sysconfig/diskdump file, fill in the device name of an empty partition, then save and exit. For example:

    DEVICE=/dev/cciss/c0d0p2

Step 2: Initialize the dump device:

    # service diskdump initialformat

Warning: all data on this partition will be lost.

Step 3: Replace the current cciss module with the cciss_dump module. Find the following line in /etc/modprobe.conf:

    alias scsi_hostadapter cciss

Change it to:

    alias scsi_hostadapter cciss_dump

Then add another line:

    options cciss_dump dump_drive=1

Note: the value depends on the device name. If diskdump is configured for /dev/cciss/c0d[#a]p[#b], set:

    options cciss_dump dump_drive=[#a]

Step 4: Rebuild the initrd file:

    # mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img.old
    # mkinitrd /boot/initrd-`uname -r`.img `uname -r`

Step 5: Set the diskdump service to start at boot:

    # chkconfig diskdump on
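The note in step 3 ties dump_drive to the drive number in the device name. As a hedged illustration only, the sketch below derives that options line from a DEVICE value using plain shell parameter expansion. cciss_dump_options is a hypothetical helper, not part of diskdump, and it only prints the line; it does not edit /etc/modprobe.conf.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe.conf options line for a
# partition named /dev/cciss/c<ctrl>d<drive>p<part>.
cciss_dump_options() {
    dev=$1
    case "$dev" in
        /dev/cciss/c[0-9]*d[0-9]*p[0-9]*) ;;   # loose shape check only
        *)
            echo "not a cciss partition: $dev" >&2
            return 1
            ;;
    esac
    base=${dev##*/}      # e.g. c0d0p2
    rest=${base#c*d}     # drop the shortest "c...d" prefix, e.g. 0p2
    drive=${rest%%p*}    # drop from the first "p" onward, e.g. 0
    case "$drive" in
        ''|*[!0-9]*)
            echo "cannot parse drive number from: $dev" >&2
            return 1
            ;;
    esac
    echo "options cciss_dump dump_drive=$drive"
}

# The example partition from step 1.
cciss_dump_options /dev/cciss/c0d0p2
```

For a /dev/sdX style name the helper returns a nonzero status, which matches the text: those devices skip steps 3 and 4.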
3.Netdump
If you use a Red Flag DC4.0 or 3.0 system, diskdump is not supported; you can use netdump instead to produce a vmcore. Netdump requires at least one server and any number of clients. The server receives the information sent when a client crashes, and a client is a machine that crashes frequently.

(1) Server configuration:

  (1). Verify that the netdump server is installed:

      rpm -q netdump-server

  If it is not installed, find the package whose name starts with netdump-server in the RedFlag/RPMS/ directory of the CD and install it (x.x.x is the version number):

      rpm -ivh netdump-server-x.x.x.rpm

  (2). After the server package is installed, change the password of the netdump user:

      passwd netdump

  (3). Enable the service:

      chkconfig netdump-server on

  (4). Start the server:

      service netdump-server start

(2) Client configuration:

  (1). Verify that the client is installed:

      rpm -q netdump

  If it is not installed, find the netdump package in the RedFlag/RPMS/ directory of the CD and install it (x.x.x is the version number):

      rpm -ivh netdump-x.x.x.rpm

  (2). Edit the file /etc/sysconfig/netdump and add the following lines:

      DEV=eth0
      NETDUMPADDR=172.16.81.182
      NETDUMPMACADDR=00:0C:29:79:F4:E0

  Here 172.16.81.182 is the address of the netdump server.

  (3). Run the following command and enter the password when prompted:

      service netdump propagate

  (4). Enable the client:

      chkconfig netdump on

  (5). Start the client:

      service netdump start

  (6). Test. To check that the netdump configuration is correct, do the following on the netdump client:

      cp /usr/share/doc/netdump-xxxxxx/crash.c .
      gcc -D__KERNEL__ -DMODULE -I/lib/modules/$(uname -r)/build/include -c crash.c
      insmod ./crash.o

  This crashes the system. A core dump then appears in the /var/crash/<client IP>/ directory of the netdump server. While the client is still sending data to the server, the file is called "vmcore-incomplete". When the dump is finished, it is renamed to "vmcore".
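The file naming described above makes it easy to tell from the server whether a dump has finished. A minimal sketch of such a check follows; dump_status is a hypothetical helper, not a netdump command, and the directory to inspect is passed as an argument (for example the server's /var/crash/<client IP>/ directory).

```shell
#!/bin/sh
# Hypothetical helper: report the state of a netdump in directory $1,
# based only on which of the two file names is present.
dump_status() {
    dir=$1
    if [ -e "$dir/vmcore" ]; then
        echo "complete"
    elif [ -e "$dir/vmcore-incomplete" ]; then
        echo "in-progress"
    else
        echo "none"
    fi
}

# Inspect the directory given on the command line, or the current one.
dump_status "${1:-.}"
```

Checking for vmcore first means a finished dump is reported as complete even if a leftover vmcore-incomplete from an earlier attempt is still present.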
The size of the "vmcore" file varies and can reach several gigabytes. On a system with 512 MB of memory, the test above generates a vmcore file of approximately 510 MB.
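Because a vmcore can be nearly as large as the crashed machine's memory, it can be worth checking free space in the dump directory beforehand. The following is a rough sketch under that assumption, not a netdump feature. dump_fits and free_kb_in are hypothetical helpers, CRASH_DIR is a hypothetical override variable, and 512 is simply the memory size from the example above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if free_kb (kB) can hold a dump from a
# machine with mem_mb (MB) of RAM, assuming vmcore is at most RAM-sized.
dump_fits() {
    mem_mb=$1
    free_kb=$2
    [ "$free_kb" -ge $((mem_mb * 1024)) ]
}

# Hypothetical helper: available space, in kB, on the filesystem
# that holds directory $1 (POSIX output format of df).
free_kb_in() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

crash_dir=${CRASH_DIR:-/var/crash}
if [ -d "$crash_dir" ]; then
    if dump_fits 512 "$(free_kb_in "$crash_dir")"; then
        echo "$crash_dir has room for a 512 MB dump"
    else
        echo "$crash_dir is too small for a 512 MB dump"
    fi
else
    echo "$crash_dir does not exist on this machine"
fi
```

On a netdump server that receives dumps from several clients, the same check would need to be repeated for each client's memory size.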
