A record of troubleshooting a system's Cluster Manager failure caused by an AD domain administrator password change.

  
Environment Description

A company runs one of its systems on two Windows 2003 servers and uses the built-in Cluster Manager to implement dual-system hot backup. Server 10.1.1.1 hosts the middleware application service and server 10.1.1.2 hosts the database service; resources are switched to the other server only when one of the servers fails.

System Name   System Version      IP Address   Remarks
S-EIP-APP     Windows2003ENTSP2   10.1.1.1     Uses Windows 2003's own Cluster Manager to implement dual-system hot backup
S-EIP-DATA    Windows2003ENTSP2   10.1.1.2

Fault Symptoms

One night an SMS alert was received: the system was abnormal and inaccessible. Logging in to 10.1.1.1 and checking the Cluster Manager status showed that server 10.1.1.2 had dropped out of the cluster.

Analysis and Resolution Process

1) Remotely log in to 10.1.1.1 and open Cluster Manager. The "S-EIP-DATA" database server node is faulty and is marked with a red cross. In the active resources, "EAIEIP", "Oracle Services for MSCS", and "OracleOraDb10g_home1TNSListenerFsloracle-vip" are in the "Failed" state. Cluster Manager tries to fail the resources over automatically to the "S-EIP-APP" node, but the fault remains.

2) Restart the 10.1.1.2 server to try to clear the fault. The problem remains.

3) Log in to the 10.1.1.2 server and view the event log. It contains alarm and error messages such as: "The security system detected an authentication error on the server DNS/s-xx1.hq.cxxp.xxx from the authentication protocol Kerberos. The failure code indicates an invalid logon, possibly due to an invalid user name or authentication information" (see the figure).

4) The log above points to an "unknown user name or bad password" error. Because the Windows 2003 cluster relies on the AD domain for centralized management, the AD domain administrator was consulted and confirmed that the administrator password of the AD domain had been changed earlier.
5) At this point, the question is how to update the portal's cluster to use the new AD domain password. After reviewing the official KB and online material, try changing the password used by the Cluster service: log in to 10.1.1.1 and 10.1.1.2 in turn and, under "Services", open "Cluster Service" > Log On and change the password (see the figure).

6) After changing the password on the "Cluster Service" service, the problem remains. Checking the log of the "Oracle Services for MSCS" resource on "S-EIP-DATA" in Cluster Manager shows that it is still a user password problem (see the figure).

7) Solution: under "Services" on 10.1.1.1 and on 10.1.1.2, change the logon password of the service (see the figure).

8) After the change, the active resources on "S-EIP-DATA" in Cluster Manager are still in the failed state (see the figure).

9) The figure above shows that the Oracle Fail Safe failover cluster has an incorrect user password. By the same reasoning as before, Oracle Fail Safe is probably also using the AD domain administrator password, so try changing it: log in to 10.1.1.2, go to Start > Programs > "oracle-ofs34_home1" > "oracle services for MSCS Security Setup", and enter the new AD domain administrator password.

10) Once every password tied to the AD domain administrator account had been made consistent with the AD domain, the problem was solved: the Cluster Manager worked normally and the service was restored.

Cause of the fault: the original deployment used the AD domain administrator account and password directly. The domain administrator password was later changed on the AD domain server, but the system was not updated to match, so the failure occurred.
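The steps above come down to finding every service that logs on with the changed domain account. The following Python sketch shows one way to build that list before a password change. It assumes a CSV export with "Name" and "StartName" columns (the Win32_Service WMI class exposes properties with these names). The function name, the EXAMPLE domain, and the sample rows are hypothetical and are not taken from the original environment. Note that a setting stored by a separate tool, such as the Oracle Fail Safe security setup in step 9, would not necessarily appear in such a list.

```python
import csv
import io


def services_using_account(csv_text, account):
    """Return the names of services whose logon account matches `account`.

    `csv_text` is assumed to be CSV with at least "Name" and "StartName"
    columns. The comparison ignores case and surrounding whitespace.
    """
    wanted = account.strip().lower()
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    matches = []
    for row in reader:
        start_name = (row.get("StartName") or "").strip().lower()
        if start_name == wanted:
            matches.append(row["Name"].strip())
    return matches


# Hypothetical sample data for illustration only.
SAMPLE = """Node,Name,StartName
S-EIP-DATA,ClusSvc,EXAMPLE\\administrator
S-EIP-DATA,OracleMSCSServices,EXAMPLE\\administrator
S-EIP-DATA,Dhcp,NT AUTHORITY\\NetworkService
S-EIP-DATA,Eventlog,LocalSystem
"""

if __name__ == "__main__":
    # List the services that would need the new password.
    for name in services_using_account(SAMPLE, "EXAMPLE\\Administrator"):
        print(name)
```

Running such a check on each cluster node before the domain password is changed would give a checklist of services to update, rather than discovering them one failure at a time as in steps 5 to 9.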