SQL Cluster Missing Replication Registry Keys on Some Nodes

Hi All

I have a 4-node SQL 2012 cluster with 16 SQL cluster roles, and each role has its own SQL instance.  The cluster was set up about 3 years ago with only 2 nodes, as the hardware for the 3rd and 4th nodes was being used for another purpose at the time.  Node 3 was added about 18 months ago and node 4 about 3 months ago.  All the hardware is identical: same make, model, processor, NICs, hard disks and external storage.

When I added node 4, I added each SQL cluster instance to it the same way we added SQL to node 2 and node 3, by running setup.exe /Action=AddNode /UpdateSource=<Path to Update Folder>.  All the instances installed with no issue.  Once this was completed I tested failing over some of the roles to the new 4th node, and I started having problems moving some of the roles to node 4 and node 3.

For example, say I have the role "SQLServer1 (InstanceA)" running on node 1 and perform a live migration to node 4.  The live migration fails when trying to start SQL Server on node 4.  When it fails, it tries a different node, normally node 1 or node 2, and starts there with no issue.

I checked the application event log, but it just gave a general error, so I looked at the SQL Server instance log, which did not help either.  Next I generated the cluster log for node 4 and looked through it for any issues.  What I found was that SQL Server was failing to start because it was unable to open the registry key "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.<InstanceName>\Replication".

So I opened regedit and navigated to "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.<InstanceName>", and there was no "Replication" key.  I checked node 1 and node 2 and the keys are there, just not on node 4; some were also missing on node 3.

To confirm this was the cause, I manually added the "Replication" key and all the values under it, using node 1 as the reference server.  Once that was done, I tried again to move the role that had failed to move to node 4, and this time it failed over to node 4 with no issue.

So my problem is: why is this key not being created on some of the nodes when I add a new instance to that node?  It is not a hard job to check for the key once the instance has been installed and add it if missing.  But of the 16 cluster roles I have installed, only node 1 can run all of them because of this problem.  Node 2 needs this key added for 2 instances, node 3 needs it added for 8 instances, and node 4 was the same as node 3, but there I have manually added the registry keys to make sure all the roles fail over.
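
The check for the key after each install could be scripted. Below is a minimal Python sketch, not a tested fix: it only reports instances whose "Replication" key is absent on the node it runs on, and it does not create the key or explain why setup skipped it. The function names are hypothetical, and the use of the 64-bit registry view is an assumption about where the instance keys live.

```python
"""Report which SQL Server instances lack the Replication registry key on this node."""
from typing import Callable, Iterable, List

# Key layout taken from the post; "MSSQL11" is the prefix shown there for SQL 2012.
KEY_TEMPLATE = r"SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.{instance}\Replication"


def replication_key_path(instance: str) -> str:
    """Return the HKLM-relative path of the Replication key for one instance."""
    return KEY_TEMPLATE.format(instance=instance)


def key_exists_local(path: str) -> bool:
    """Check the local registry for an HKLM subkey (Windows only).

    Assumption: the instance keys are in the 64-bit registry view.
    """
    import winreg  # imported lazily so this file still loads on non-Windows hosts

    try:
        handle = winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            path,
            0,
            winreg.KEY_READ | winreg.KEY_WOW64_64KEY,
        )
    except FileNotFoundError:
        return False
    winreg.CloseKey(handle)
    return True


def missing_replication_keys(
    instances: Iterable[str],
    key_exists: Callable[[str], bool] = key_exists_local,
) -> List[str]:
    """Return the instance names whose Replication key is absent on this node."""
    return [name for name in instances if not key_exists(replication_key_path(name))]
```

Run on each node in turn, calling missing_replication_keys(["InstanceA", "InstanceB"]) with the real instance names (the two shown are placeholders) would list the instances that still need the key added on that node. The key_exists parameter exists so the logic can be exercised without touching a real registry.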

Does anyone know why this happens?

I have found an article describing a similar issue: https://sqlcan.wordpress.com/2014/10/27/missing-registery-settings-in-cluster-nodes-for-sql-server/.  But I don't think this is the solution for me.

Richard Moth



July 31st, 2015 9:04am

Hi Richard,

Based on my research, SQL Server is a highly registry-aware application. If any registry information (keys or values) changes for some unknown reason, the change can have a serious impact on SQL Server running on the cluster nodes.

Regarding the issue of Replication registry keys missing on the cluster nodes, the common solution is to enable the missing registry checkpoints for the affected SQL Server instance, which is described in this similar blog. Does the solution work for you?


Thanks,
Lydia Zhang

August 4th, 2015 3:36am

This topic is archived. No further replies will be accepted.
