Exchange server 2010 DAG failover without reason

Dear All,

I have an Exchange 2010 DAG with two members at the primary site and one at the DR site. Every two days the DAG fails over to another node in the primary site. I don't know the reason, and I cannot find any event log entries for it. Can anyone tell me how to find the reason for the DAG switching?

Regards

September 2nd, 2015 3:00am

Hi,

Please check the failover cluster events which you can find in the system event log and more detail in the failover cluster logs. You may find that there are temporary network or server conditions that have caused the failover. 

A common scenario is failover due to a snapshot of a virtual mailbox server that is required by some backup applications e.g. Veeam or vRanger. Other scenarios may include memory issues, servers becoming unresponsive or transient network issues. 

If the cause cannot be resolved but is transient (up to 30s or so), you can increase the time required before failover occurs by configuring the cluster delay and threshold. The threshold is the number of heartbeats that need to be lost before failover and the delay is the time between heartbeats. There are delay and threshold settings for cross subnet clusters and same subnet clusters.

Commands are below:

cluster /cluster:<ClusterName> /prop SameSubnetDelay=<value>

cluster /cluster:<ClusterName> /prop SameSubnetThreshold=<value>

cluster /cluster:<ClusterName> /prop CrossSubnetDelay=<value>

cluster /cluster:<ClusterName> /prop CrossSubnetThreshold=<value>

Maximums for Server 2008/2008 R2 (delays are in milliseconds, thresholds are a count of missed heartbeats):

 

Same subnet:

cluster /prop SameSubnetDelay=2000:DWORD

cluster /prop SameSubnetThreshold=10:DWORD

 

Cross subnet:

cluster /prop CrossSubnetDelay=4000:DWORD

cluster /prop CrossSubnetThreshold=10:DWORD
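As a rough worked example of what these values mean in practice: since the delay is the time between heartbeats and the threshold is the number of missed heartbeats, the tolerated outage is approximately delay multiplied by threshold. The short Python sketch below just does that arithmetic for the maximums listed above; the function name is made up for illustration and the result is an approximation, not an exact cluster timing guarantee.

```python
def failover_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate time (in seconds) heartbeats can be lost before failover.

    delay_ms  -- time between heartbeats, in milliseconds
    threshold -- number of consecutive heartbeats that may be missed
    """
    return delay_ms * threshold / 1000.0


# Maximums quoted above for Server 2008/2008 R2
same_subnet = failover_tolerance_seconds(2000, 10)
cross_subnet = failover_tolerance_seconds(4000, 10)

print(f"Same subnet:  about {same_subnet:.0f} s")
print(f"Cross subnet: about {cross_subnet:.0f} s")
```

So with the maximums, a same-subnet node can be unreachable for roughly 20 seconds, and a cross-subnet node for roughly 40 seconds, before the cluster acts.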

 

Server 2012 has higher thresholds that can be set.

Let me know if this answers your question.

Thanks.

September 2nd, 2015 7:12pm

I'd like to note that this does not resolve the underlying issue; it only glosses over it.

If servers are coming out of the cluster there should be errors in the system event log - 1135 for example.

I wrote the below to help look for, and analyse such errors:

http://blogs.technet.com/b/rmilne/archive/2014/11/19/retrieving-cluster-error-1135-from-servers.aspx
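For a quick check on a single server, separate from the script in that article, something like the Python sketch below could pull recent 1135 entries from the System log using the built-in wevtutil tool. The helper names are hypothetical, it assumes Python is available on the server, and it filters on the event ID only, so confirm that the source of each hit is the failover clustering provider.

```python
import subprocess


def build_event_query(event_id: int, max_events: int = 20) -> list:
    """Build a wevtutil command that reads the newest matching System events."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event ID
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output
    ]


def fetch_events(event_id: int, max_events: int = 20) -> str:
    """Run the query on the local Windows server and return the text output."""
    result = subprocess.run(
        build_event_query(event_id, max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Show the command that would be run for cluster error 1135
print(" ".join(build_event_query(1135)))
```

Calling fetch_events(1135) on each DAG member and comparing the timestamps with the times of the failovers should show whether nodes are actually being dropped from the cluster.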

The problem description is vague; it could also refer to database copies moving around within the DAG. That needs to be confirmed in detail by the OP.

September 3rd, 2015 1:34pm

Hi Abu,

In addition, here is an article about monitoring DAGs, for your reference:
https://technet.microsoft.com/en-us/library/dd351258(v=exchg.150).aspx

September 4th, 2015 5:57am

