Fourteen common server faults and their analysis

  

First, the main reasons a server cannot start:

Mains or power cable failure (power outage or poor contact)

Power supply or power module failure

Memory failure (usually accompanied by an alarm beep)

CPU failure (usually accompanied by an alarm beep)

Motherboard failure

Interrupt conflict caused by another add-in card

Second, what should I do if the server cannot be started?

Check that the power cord and all I/O cables are properly connected.

Check whether the motherboard receives power once the power cord is connected.

Reduce the server to its minimum configuration (a single CPU, the minimum amount of memory, and only the monitor and keyboard connected), then short the power-switch jumper pins on the motherboard directly to see whether it starts.

Check the power supply: unplug all of its power connectors, then short the green and black wires on the power supply's main connector to see whether the supply turns on.

If the power supply proves to be normal, troubleshoot by replacement: in the minimum configuration, swap out the parts one at a time, starting with the easiest to replace (memory, CPU, motherboard).
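
The replacement method above can be pictured as a simple loop. The following is only a minimal sketch: the function name `find_faulty_part` and the callback `boots_after_swapping` are hypothetical, and in practice the "callback" is a technician fitting a known-good part and reporting whether the server starts.

```python
from typing import Callable, Iterable, Optional


def find_faulty_part(
    parts: Iterable[str],
    boots_after_swapping: Callable[[str], bool],
) -> Optional[str]:
    """Return the first part whose replacement lets the server start.

    `parts` should be ordered from easiest to hardest to replace.
    `boots_after_swapping(part)` is a hypothetical callback that reports
    whether the server starts once a known-good `part` has been fitted.
    """
    for part in parts:
        if boots_after_swapping(part):
            return part  # the original of this part is the likely culprit
    return None  # no single swap helped; the fault lies elsewhere


# Simulated bench session in which the original memory module is bad.
suspect = find_faulty_part(
    ["memory", "cpu", "motherboard"],
    lambda part: part == "memory",
)
print(suspect)
```

Swapping one part at a time matters: if several parts are changed together, a successful start no longer says which one was at fault.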

Third, why does the system restart frequently?

Causes of frequent system restarts:

Power supply failure (confirm by replacement)

Memory failure (shown in the BIOS error report)

Excessive data traffic on the network port (the workload is too heavy)

Software failure (resolve by updating or reinstalling the operating system)

Fourth, diagnosing and handling server crashes:

Server crashes are hard to diagnose. The causes generally fall into two groups, software and hardware:

Software failure

Hardware failure

Software failure

Check the operating system's log first; it can often identify the cause of the crash.

Infection by a computer virus.

A crash caused by a bug or vulnerability in the system software. Reach this conclusion only after confirming that the hardware is not at fault; fixing it requires help from the software vendor.

The software is used improperly or the system is under too much load. Ask the customer to reduce the server's workload and see whether that solves the problem.

Hardware failure

Hardware conflict

Power supply failure or insufficient power: judge this by comparing the total power drawn by all of the server's loads with the rating of the power supply.

Hard disk failure (scan the disk surface to check for bad sectors)

Memory failure (judge from the error report in the motherboard BIOS and the operating system's error messages)

Motherboard failure (confirm by replacement)

CPU failure (confirm by replacement)

Add-in card failure (a SCSI/RAID card or other PCI device can also crash the system; confirm by replacement)

Note: After a crash has been fixed, run the machine under load for a period of time to check whether the fault has really been resolved.
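
The power comparison listed under hardware failures comes down to simple arithmetic. The sketch below assumes hypothetical wattage figures and a hypothetical helper name, `power_headroom_watts`; real values should be read from the power supply label and the component datasheets.

```python
from typing import Mapping


def power_headroom_watts(rated_watts: float, loads_watts: Mapping[str, float]) -> float:
    """Return rated capacity minus total load; a negative result means undersupply."""
    return rated_watts - sum(loads_watts.values())


# Hypothetical figures for illustration only.
loads = {"cpu": 130.0, "memory": 40.0, "disks": 60.0, "raid_card": 25.0, "fans": 30.0}
headroom = power_headroom_watts(400.0, loads)
print(f"Headroom: {headroom:.0f} W")
if headroom < 0:
    print("Power supply is undersized for this load")
```

A small positive headroom is still worth a second look, since component draw at start-up or under heavy load can exceed nominal figures.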

Fifth, why does the installer report that no hard disk was found when installing the operating system?

Causes:

No physical hard disk device

Hard disk cable connection problem

Hard disk controller driver not installed, or the driver does not match

Sixth, how do I get the driver?

Use the CD-ROM to create the corresponding driver disk.

Seventh, why does the hard disk controller driver still fail to load even with the correct driver?

Check whether HostRAID is enabled.

Eighth, after a newly purchased hard disk is installed, why does the machine fail its self-test?

Remove the new hard disk and see whether the machine passes the self-test;

Check whether the ID number of the new hard disk is the same as that of an existing hard disk; if two disks share an ID number, the self-test will not pass.
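
The ID check can be illustrated with a short sketch. The function name `conflicting_ids` is hypothetical, and the code assumes the controller keeps its default ID of 7, so a disk set to 7 is reported as a conflict as well.

```python
from collections import Counter
from typing import Iterable, List

CONTROLLER_ID = 7  # assumed default ID of the SCSI controller


def conflicting_ids(disk_ids: Iterable[int], controller_id: int = CONTROLLER_ID) -> List[int]:
    """Return the sorted SCSI IDs that are claimed by more than one device."""
    counts = Counter(disk_ids)
    counts[controller_id] += 1  # the controller occupies an ID too
    return sorted(scsi_id for scsi_id, n in counts.items() if n > 1)


print(conflicting_ids([0, 1, 1]))  # the new disk reuses ID 1
print(conflicting_ids([0, 1, 7]))  # a disk collides with the controller
```

An empty result means every device on the bus has its own ID.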

Ninth, how do I format a SCSI hard disk?

With an operating system installed: format it using a disk management tool;

Without an operating system: format it from the SCSI controller's management interface;

Taking an Adaptec RAID card as an example: power on - when the CTRL+A prompt appears, press CTRL+A to enter the utility - select channel A

- select SCSI UTILITY; the hard disks will be detected - select the detected hard disk

- select FORMAT to fully format the hard disk

- select VERIFY to scan the hard disk for bad sectors

Note: Do not interrupt the format or power off the hard disk while it is being formatted, otherwise the disk will be damaged.

10. On Aisino series machines with a RAID card, one hard disk stops working and the RAID card sounds an alarm, but the system still runs normally. What should I do?

Replace it with a new hard disk whose capacity is greater than or equal to that of the failed disk. It is best to use the same type of hard disk.
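
The replacement rule can be written down as a small check. This is a sketch only: the `Disk` class, its fields, and the sample capacities are hypothetical and do not come from any RAID vendor's tooling.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    disk_type: str       # hypothetical label, e.g. the drive's model string
    capacity_bytes: int


def check_replacement(failed: Disk, new: Disk) -> str:
    """Classify a candidate replacement for a failed RAID member disk."""
    if new.capacity_bytes < failed.capacity_bytes:
        return "too small"
    if new.disk_type != failed.disk_type:
        return "usable, different type"
    return "ideal"


failed = Disk("type-a", 73_000_000_000)
print(check_replacement(failed, Disk("type-a", 73_000_000_000)))
print(check_replacement(failed, Disk("type-b", 146_000_000_000)))
print(check_replacement(failed, Disk("type-a", 36_000_000_000)))
```

Comparing exact byte counts rather than the marketed size avoids surprises, because two disks sold under the same nominal capacity can differ slightly.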

Common faults related to RAID cards

The first type: a problem with the RAID card itself

It often shows up as lost RAID information, hard disks that frequently drop out of the array, a REBUILD that cannot be performed, hard disks that are not detected at boot, or a long boot time.

Typical failure A:

After a RAID 1 array was created and the operating system installed, everything was normal, but the second time the system was restarted an alarm sounded. A check showed that one hard disk had dropped out of the array. After a REBUILD the array returned to normal, but the disk dropped out again after the next restart. The hard disk was suspected, but VERIFY found no problem with it. Replacing the RAID card finally solved the problem.

Typical failure B:

The machine often crashed and sometimes started very slowly. The system log showed an error message at system startup: device /devices/scsi/port0 did not respond within the time the transfer waited. After the RAID card was replaced, the machine returned to normal.
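
Spotting a message like the one in failure B can be automated with a simple filter. The sketch below is an assumption throughout: the pattern reuses the wording quoted above, which real logs may phrase differently, and the function name `find_timeout_events` and the sample lines are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed wording, copied from the log message quoted above; real logs may differ.
TIMEOUT_PATTERN = re.compile(r"did not respond within", re.IGNORECASE)


def find_timeout_events(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that report a device timeout."""
    return [line.strip() for line in log_lines if TIMEOUT_PATTERN.search(line)]


# Made-up sample lines; in practice, read them from an exported system log.
sample = [
    "service started",
    "device /devices/scsi/port0 did not respond within the time the transfer waited",
    "network link up",
]
for event in find_timeout_events(sample):
    print(event)
```

Repeated timeouts against the same port point toward the controller or its cabling rather than toward a single disk.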

The second type: a problem with the hard disk itself

It shows up as a hard disk dropping out, its state in the RAID array being DEAD, or a REBUILD that stops at a certain point and cannot continue.

Typical failure:

After a hard disk went offline, REBUILD reported an error at 20%. After confirming that the offline hard disk, the hard disk enclosure, and the SCSI cable all worked normally, VERIFY was run on the online hard disk and found bad sectors. Once that disk was repaired and REBUILD was run again, the array returned to normal.

The third type: a contact problem in the hard disk enclosure or module

This type of problem often shows up as the RAID card not detecting the hard disk at all. Such problems are relatively simple, but some care is needed when working on machines with a hard disk enclosure.

Typical failure:

The RAID card could not detect the hard disk. The SCSI cable was connected to the ULTRA160 interface on the motherboard instead, and the fault remained. The hard disk enclosure (excluding the bracket behind it) was replaced, and the fault remained. The hard disk was replaced, still with no success. Finally, the bracket (the non-hot-swappable part) at the back of the enclosure was removed, and a pin on its 80-pin interface was found to be bent. After the pin was straightened, everything returned to normal.

11. Why can't a SCSI hard disk used in the server have its ID number set to 7?

The SCSI controller itself uses ID 7 by default, so a hard disk cannot be set to ID 7.

12. Why does the power-on self-test fail?

Workaround:

Cut power to the machine, open the chassis, and move the jumper cap of the "CMOS CLEAR" jumper so that it shorts the other two pins (refer to the motherboard manual for the jumper position)

Power up the machine and let it run the self-test. When the machine reports that the CMOS has been cleared, turn off the power and restore the jumper

Restart the machine

13. Physical memory slot error

Workaround:

Power on - press F2 to enter "SETUP" - "ADVANCED" - "MEMORY CONFIGURATION", press Enter - "CLEAR DIMM ERRORS", press Enter

14. Why does the processor report an error, or why is only one processor reported during the self-test?

Workaround:

Power on -> press F2 to enter "SETUP"

1. Go to "MAIN" -> "PROCESSOR" -> "CLEAR PROCESSOR ERRORS [ ]": set this option to "YES";

2. Go to "ADVANCED" -> "RESET CONFIGURATION DATA [ ]": set this option to "YES";

3. Go to "SERVER" -> "PROCESSOR RESET [ ]": set this option to "YES";

4. Go to "SERVER" -> "SYSTEM MANAGEMENT", press Enter -> "CLEAR EVENTLOG [ ]": set this option to "YES";

5. Press F10 to save and exit.
