Hmmm… can’t guarantee this is 100% correct, but it looks like there’s a problem with the above software when running against a Windows 2008 SP2 cluster. The storage part of OMSA doesn’t seem to get on very well with Windows Cluster.
Up until recently, our cluster had been behaving fine: reboot one node and the other picked up, and so on. However, yesterday I installed OMSA 18.104.22.168, and this morning our SQL cluster fell over, badly. Not only that, but the nodes were taking forever to return from a reboot.
We spent a good 2 hours looking at this to get it working. The nodes would take ages to return, and even then wouldn’t re-join the cluster properly until we restarted the cluster service, which isn’t ideal.
The faults logged in the Windows event log were:
“FailoverClustering 1146: The cluster resource host subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource vendor.”
“FailoverClustering 1230: Cluster resource ‘Cluster Disk 1’ (resource type ‘’, DLL ‘clusres.dll’) either crashed or deadlocked. The Resource Hosting Subsystem (RHS) process will now attempt to terminate, and the resource will be marked to run in a separate monitor.”
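If you want a quick way to check whether a node is throwing these two events, something like the Python below could do it. It’s only a rough sketch: it assumes you’ve exported the System log to CSV and that the file has `Source` and `Event ID` columns (an assumed layout, so adjust the column names to whatever your export actually contains). The function name `count_cluster_faults` and the sample rows are made up purely for illustration.

```python
import csv
import io
from collections import Counter

# The two event IDs quoted above: 1146 (RHS stopped unexpectedly)
# and 1230 (resource crashed or deadlocked).
WATCHED_IDS = {1146, 1230}


def count_cluster_faults(csv_text, source_hint="FailoverClustering"):
    """Count the watched FailoverClustering events in exported CSV text.

    Assumes a header row containing 'Source' and 'Event ID' columns
    (an assumed export layout, not something the original post states).
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row.get("Source") or ""
        raw_id = (row.get("Event ID") or "").strip()
        # Skip rows from other sources or with a non-numeric ID field.
        if source_hint in source and raw_id.isdigit():
            event_id = int(raw_id)
            if event_id in WATCHED_IDS:
                counts[event_id] += 1
    return counts


# Made-up sample rows, only to show the expected input shape.
sample = """Source,Event ID
FailoverClustering,1146
FailoverClustering,1230
Service Control Manager,7036
"""

if __name__ == "__main__":
    for event_id, seen in sorted(count_cluster_faults(sample).items()):
        print(f"Event {event_id}: seen {seen} time(s)")
```

Running it against an export taken before and after a change (for example, before and after removing the storage component) gives a simple way to see whether the errors have actually stopped.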
There were a few bits and bobs on the web about 3rd party software causing this, so I disabled all the OMSA services on the inactive node and rebooted… which was vastly quicker and didn’t flag these errors. Not having OMSA at all seemed extreme, so I removed just the storage part and rebooted again. Once more the node came back much quicker, didn’t throw these errors, and joined the cluster without needing the cluster service restarted. Just note that we have the full OMSA on a 2012 core cluster and it seems fine, so for now this seems to be pointing at 2008.