Removing a cluster node with a hardware failure

Very simple issue. For an unrecoverable hardware failure, I have found two MSDN articles that say different things.

The first one says to remove the node from MSCS: https://msdn.microsoft.com/en-us/library/ms181075(v=sql.100).aspx

The second one says to remove it using SQL Server Setup, yet it links to the first article, which says to do it from MSCS: https://msdn.microsoft.com/en-us/library/ms189117(v=sql.100).aspx

A third article then contradicts the second one: "To remove a node from an existing SQL Server failover cluster, you must run SQL Server Setup on the node that is to be removed from the SQL Server failover cluster instance." https://msdn.microsoft.com/en-us/library/ms191545(v=sql.100).aspx#Remove. In my case it is a hardware failure and the secondary node is inaccessible, so I cannot run Setup on it.

So what is the proper way to evict a node you cannot access due to a hardware failure?

Note: I don't plan on adding the failed node back after removing it, i.e. I am only interested in the removal part.
  • Edited by Antoine F Monday, July 20, 2015 8:31 PM
July 20th, 2015 8:19pm

Apparently, I am not the only one inquiring about this:

https://connect.microsoft.com/SQLServer/feedback/details/332355/cannot-remove-a-cluster-node-from-the-sql-server-failover-cluster-definition-if-the-node-is-down

Microsoft answer (Alan Hirt): "You can evict the node using Windows Cluster Admin tool. SQL Server does not keep any internal state information about the cluster nodes. When a node is broken, such as due to hardware failure, SQL Server does not care. User can always remove the node by evicting it from Windows Cluster."
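
For anyone who prefers scripting the eviction over using the Cluster Admin GUI, here is a minimal sketch in Python. It only builds the command line and does not run it. It assumes the surviving node has the FailoverClusters PowerShell module with the Remove-ClusterNode cmdlet (Windows Server 2008 R2 or later); the helper name build_evict_command and the node name NODE2 are hypothetical and not taken from the quoted answer.

```python
import re

# Conservative whitelist so the name can be embedded safely in a command string.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def build_evict_command(node_name, cluster_name=None):
    """Build (but do not run) a PowerShell command that evicts a node.

    Assumption: the FailoverClusters module and its Remove-ClusterNode
    cmdlet are available on the machine where the command will be run.
    """
    for value in (node_name, cluster_name):
        if value is not None and not _SAFE_NAME.match(value):
            raise ValueError("unexpected characters in name: %r" % (value,))

    script = (
        "Import-Module FailoverClusters; "
        "Remove-ClusterNode -Name '%s' -Force" % node_name
    )
    if cluster_name is not None:
        script += " -Cluster '%s'" % cluster_name
    return ["powershell.exe", "-NoProfile", "-Command", script]


if __name__ == "__main__":
    # NODE2 is a placeholder for the failed, unreachable node.
    print(build_evict_command("NODE2"))
```

The returned list could be passed to subprocess.run on a surviving cluster node once you have reviewed it. Run it from a node that is still up, since the failed node cannot be reached.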

July 20th, 2015 8:51pm

Hi Antoine,

The first two links describe two different actions.

The first one is about evicting the node from the cluster, which means separating one of the participating nodes from the Windows cluster.

The second one is about removing the SQL Server failover cluster instance from one of the nodes.

There is a difference between removing a node from the cluster and uninstalling the SQL Server failover cluster instance from one of the nodes.

thanks

kumar

July 21st, 2015 1:59am

Thank you Kumar.

Both sections are named: How to recover from a failover cluster failure.

Both sections also apply to a hardware failure, which means the node is not accessible.

So the two articles describe different actions, but both are written to handle the same situation.

I am confused as to why the second article says to uninstall the SQL Server failover cluster instance from the node when the node is not accessible and the uninstallation requires connecting to that node.

July 21st, 2015 11:36am

Hi Antoine,

Hardware failure in one node of a two-node cluster. This hardware failure could be caused by a failure in the SCSI card or in the operating system.

To recover from this failure, remove the failed node from the failover cluster using the SQL Server Setup program, address the hardware failure with the computer offline, bring the machine back up, and then add the repaired node back to the failover cluster instance.

Per my understanding, the above descriptions indicate that the node is still accessible when there is a hardware failure.

In this case, we need to uninstall SQL Server from the node using the SQL Server Setup program, as described in the Remove Node section of this article, then evict Node 1 from Microsoft Cluster Service (MSCS), take the computer offline, and address the hardware failure.
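
As a rough illustration of the Setup step when the node is still reachable, here is a small Python sketch that builds the unattended Remove Node command line. It assumes the /ACTION=RemoveNode and /INSTANCENAME Setup parameters as I understand them from the SQL Server 2008 command-prompt installation documentation; other parameters may be required depending on version and configuration. The helper name build_remove_node_command and the instance name are hypothetical.

```python
import re

# Instance names are restricted here to a conservative character set.
_SAFE_INSTANCE = re.compile(r"^[A-Za-z0-9_$]+$")


def build_remove_node_command(instance_name, setup_path="setup.exe"):
    """Build (but do not run) a SQL Server Setup command for Remove Node.

    Assumption: this is run on the node being removed, which must
    therefore still be accessible.
    """
    if not _SAFE_INSTANCE.match(instance_name):
        raise ValueError("unexpected instance name: %r" % (instance_name,))
    return [
        setup_path,
        "/q",                      # quiet, unattended mode
        "/ACTION=RemoveNode",      # remove this node from the instance
        "/INSTANCENAME=%s" % instance_name,
    ]


if __name__ == "__main__":
    # MSSQLSERVER is the default instance name, used here as a placeholder.
    print(" ".join(build_remove_node_command("MSSQLSERVER")))
```

This only covers the case where Setup can actually run on the node. If the node cannot be reached at all, this step is not possible and eviction from the Windows cluster is what remains.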

Thanks,
Lydia

July 23rd, 2015 3:28am

