Common Server Faults: Causes and Troubleshooting

  
        

1. Main reasons the server will not start

- Mains or power line failure (power outage or poor contact)
- Power supply or power module failure
- Memory failure (usually accompanied by an alarm sound)
- CPU failure (usually accompanied by an alarm sound)
- Motherboard failure
- Interrupt conflicts caused by other add-in cards

2. What to do when the server will not start

- Check that the power cable and all I/O cables are connected properly.
- Check whether the motherboard receives power once the power cord is connected.
- Reduce the server to its minimum configuration (a single CPU, the minimum amount of memory, and only a monitor and keyboard attached), then short the power-switch jumper on the motherboard directly to see whether the server starts.
- Check the power supply: unplug all of its power connectors, then short the green and black wires on the power supply's connector to see whether the power supply turns on.
- If the power supply proves to be normal, troubleshoot by replacement: in the minimum configuration, replace the components that are easiest to swap (memory, CPU, motherboard) in turn.

3. The system restarts frequently

Possible causes:

- Power supply failure (confirm by replacement)
- Memory failure (can be identified from the BIOS error report)
- Excessive data traffic on the network port (the workload is too heavy)
- Software failure (update or reinstall the operating system)

4. Diagnosing and handling server crashes

Server crashes are harder to diagnose. The causes fall into two groups: software failures and hardware failures.

Software failures. First check the operating system's system log, which can often reveal the cause of the crash. A computer virus may be responsible. The crash may also be caused by a bug or vulnerability in the system software; reach that conclusion only after confirming that the hardware is fault-free, and expect to need assistance from the software provider.
If the software is being used improperly or the system is overloaded, ask the customer to reduce the server's workload and see whether that resolves the problem.

Hardware failures:

- Hardware conflicts
- Power supply failure or insufficient power (judge by comparing the power supply's rating with the total power drawn by all loads in the server)
- Hard disk failure (scan the surface of the hard disk to check for bad sectors)
- Memory failure (judge from the error report in the motherboard BIOS and the operating system's error messages)
- Motherboard failure (judge by replacement)
- CPU failure (judge by replacement)
- Add-in card failure (usually the SCSI/RAID card, although other PCI devices may also cause crashes; judge by replacement)

Note: After a crash has been dealt with, run the system under a certain load for a period of time to confirm that the fault is completely resolved.

5. The hard disk cannot be found when installing the operating system

Possible causes:

- No physical hard disk is installed
- The hard disk cable is not connected properly
- The hard disk controller driver is missing or does not match

6. How do I obtain the driver?

Use the CD supplied with the machine to create the corresponding driver.

7. Why does the hard disk controller driver still fail to load even with the correct driver?

Check whether the HostRAID feature is enabled.

8. After a newly purchased hard disk is installed, the machine fails its self-test

If the machine passes the self-test once the new hard disk is removed, check whether the new hard disk's ID number is the same as that of an existing hard disk. If two hard disks share the same ID number, the self-test will not pass.

9. How do I format a SCSI hard disk?
With an operating system installed: format the disk with the disk management tool.

Without an operating system: format the disk in the SCSI management interface. Taking an Adaptec RAID card as an example:

1. Power on and press CTRL+A when the CTRL+A prompt appears.
2. Select channel A.
3. Select SCSI UTILITY; the hard disks will be detected.
4. Select the hard disk you want to work on.
5. Select FORMAT to fully format the hard disk, or select VERIFY to check it for bad sectors.

Do not interrupt the process or power off while the hard disk is being formatted; otherwise the disk will be damaged.

10. On an Aisino series machine with a RAID card, one hard disk stops working and the RAID card sounds an alarm, but the system still runs normally. What should I do?

Replace the failed disk with a new hard disk whose capacity is greater than or equal to that of the failed disk. It is best to use a hard disk of the same type.

Common faults related to RAID cards

Type 1: Problems with the RAID card itself. RAID information is often lost, hard disks often drop out of the array, REBUILD cannot be performed, and hard disks are not detected or take a long time to detect.

Typical fault A: After RAID 1 was configured and the operating system installed, everything was normal, but an alarm sounded when the system was restarted for the second time. A check showed that one hard disk had dropped out. After a REBUILD the array returned to normal, but the disk dropped out again after the next restart. The hard disk was suspected, but verifying it revealed no problem. Replacing the RAID card finally solved the problem.

Typical fault B: The machine often crashed and sometimes started very slowly. The system log showed an error message at system startup: device /devices/scsi/port0 did not respond within the time the transfer waited. After the RAID card was replaced, the machine returned to normal.

Type 2: Problems with the hard disk itself. The hard disk goes offline.
Its status in the RAID array shows DEAD, or a REBUILD stops at a certain point and cannot continue.

Typical fault: After a hard disk went offline, a REBUILD reported an error at 20% and could not continue. After confirming that the offline hard disk, the hard disk box, and the SCSI cable all worked normally, the online hard disk was verified and found to have bad sectors. Once that disk was repaired, the REBUILD was run again and the array returned to normal.

Type 3: Contact problems in the hard disk box or module. With this type of problem the RAID card usually cannot detect the hard disk at all. Such problems are relatively simple, but some care is needed when working on machines that use a hard disk box.

Typical fault: The RAID card did not detect the hard disk. Connecting the SCSI cable to the ULTRA160 interface on the motherboard did not clear the fault. Removing the hard disk box (apart from the bracket behind it) did not clear the fault either, and neither did replacing the hard disk. Finally, the bracket (the non-hot-swappable part) at the back of the hard disk box was removed, and one pin of the 80-pin interface on the rear bracket was found to be bent. Straightening the pin restored normal operation.

11. Why can't the ID number of a SCSI hard disk used in a server be set to 7?


In the SCSI controller, ID 7 is assigned to the hard disk controller by default, so the ID number of a hard disk cannot be set to 7.

12. Why can't the power-on self-test pass?

Solution:

1. Cut power to the machine and open the chassis.
2. Move the jumper cap on "CMOS CLEAR" so that it connects the other two pins of "CMOS CLEAR" (see the motherboard manual for the jumper).
3. Power on the machine and let it run the self-test. When the self-test completes, the CMOS has been cleared.
4. Power off the machine, restore the jumper to its original position, and restart the machine.

13. Physical memory slot error

Solution: Power on, press F2 to enter "SETUP", go to "ADVANCED" --> "MEMORY CONFIGURATION" and press Enter, then select "CLEAR DIMM ERRORS" and press Enter.

14. Why does the processor report an error, or why does the self-test find only one processor?

Solution: Power on and press F2 to enter "SETUP", then:

1. Go to "MAIN" --> "PROCESSOR" --> "CLEAR PROCESSOR ERRORS [ ]" and set the option to "YES".
2. Go to "ADVANCED" --> "RESET CONFIGURATION DATA [ ]" and set the option to "YES".
3. Go to "SERVER" --> "PROCESSOR RESET [ ]" and set the option to "YES".
4. Go to "SERVER" --> "SYSTEM MANAGEMENT", press Enter, then go to "CLEAR EVENTLOG [ ]" and set the option to "YES".
5. Press F10 to save and exit.
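
Three of the checks described in this article are simple enough to sketch in code: comparing the power supply rating with the total load, making sure no hard disk uses the controller's ID 7 or shares an ID with another disk, and confirming that a replacement RAID disk is at least as large as the failed one. The Python below is only a minimal sketch under stated assumptions: the function names, the wattage figures, and the capacities are hypothetical illustrations, not values from any vendor tool or datasheet.

```python
from collections import Counter
from typing import Dict, List

# Per the article, the SCSI controller takes ID 7 by default.
SCSI_CONTROLLER_ID = 7


def power_headroom_watts(psu_rated_watts: float, load_watts: Dict[str, float]) -> float:
    """Rated PSU power minus the summed loads; a negative result means overload."""
    return psu_rated_watts - sum(load_watts.values())


def scsi_id_problems(disk_ids: List[int]) -> List[str]:
    """List ID problems: a disk on the controller's ID, or an ID used more than once."""
    problems = []
    counts = Counter(disk_ids)
    for disk_id in sorted(counts):
        if disk_id == SCSI_CONTROLLER_ID:
            problems.append(f"ID {disk_id} is reserved for the controller")
        if counts[disk_id] > 1:
            problems.append(f"ID {disk_id} is assigned to {counts[disk_id]} disks")
    return problems


def replacement_disk_ok(failed_capacity_gb: float, new_capacity_gb: float) -> bool:
    """A replacement disk must be at least as large as the failed disk."""
    return new_capacity_gb >= failed_capacity_gb


if __name__ == "__main__":
    # Hypothetical component loads in watts, for illustration only.
    loads = {"cpu": 130.0, "memory": 40.0, "disks": 60.0, "cards": 50.0}
    print("PSU headroom (W):", power_headroom_watts(500.0, loads))
    print("SCSI ID problems:", scsi_id_problems([0, 1, 1, 7]))
    print("Replacement disk acceptable:", replacement_disk_ok(73.0, 146.0))
```

A negative headroom suggests insufficient power, and any entry returned by `scsi_id_problems` points to an ID that should be changed before the self-test is run again. Real load figures would have to come from the component specifications.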

Copyright © Windows knowledge All Rights Reserved