Passive Database Healthy but CopyQueue Length at 9223372036854769389

Hi Guys

I have some questions about my Exchange 2013 DAG.

I have two Exchange 2013 servers in two different sites (plus a witness server in Site C):

Site A -- Hex01 -- 10.1.1.10/24
Site B -- Hex02 -- 10.2.1.10/24

These two sites host a database named "DAGDB1".

The Internet connection at Site A sometimes goes down, so the database has to fail over to Site B, and sometimes it fails back because of problems in Site B.

Now, after some days, the database copy on Site B shows "Passive Healthy" and Index: Healthy, but the CopyQueue Length on Site B is "9223372036854769389" and the active database can't be moved to Site B.

Is there something I can tune so the DAG is less sensitive? After a manual reseed (with the option -DeleteExistingFiles) the database works as expected.

Also, the reseed runs at only about 30 Mbit/s, although the connections are 1 Gbit upload on Site A and 100 Mbit download on Site B. Why is replication this slow?
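For reference, a manual reseed like the one described above is usually run roughly as follows. This is only a sketch: it assumes the passive copy on Hex02 is addressed as DAGDB1\Hex02 and that the commands are run in the Exchange Management Shell.

```powershell
# Assumption: the passive copy is identified as <database>\<server>.
$copy = 'DAGDB1\Hex02'

# Stop replication for this copy before reseeding it.
Suspend-MailboxDatabaseCopy -Identity $copy -Confirm:$false

# Reseed from the active copy, discarding the existing database and log files.
Update-MailboxDatabaseCopy -Identity $copy -DeleteExistingFiles

# Check the copy state and queue lengths once seeding has finished.
Get-MailboxDatabaseCopyStatus -Identity $copy |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength
```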

Regards and thanks for some answers,

Michael


  • Edited by Michi-ch Saturday, May 23, 2015 11:16 PM
May 23rd, 2015 11:15pm

Dear Michael,

I would suggest fixing the network link first, since everything else depends on it.

However, from the DAG perspective you can enable DAC mode, which is recommended in any case.

This is what will happen in your scenario after enabling DAC mode:

1) All active copies in Site A will be activated in Site B when the network in Site A goes down.

2) The active copies will then remain in Site B until you manually activate them in Site A, even after Site A is available again.

So this would be the best approach for your scenario until the LAN/WAN link issue is fixed.
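Enabling DAC mode as suggested above could look like the following sketch. The DAG name DAG1 is a placeholder, since the thread does not give the real name; the first command lists the actual name and the current mode.

```powershell
# List the DAGs and their current DAC setting to find the real DAG name.
Get-DatabaseAvailabilityGroup | Format-List Name, DatacenterActivationMode

# Assumption: the DAG is called DAG1. Valid values are Off and DagOnly.
Set-DatabaseAvailabilityGroup -Identity DAG1 -DatacenterActivationMode DagOnly
```

Note that DAC mode mainly controls how DAG members mount databases after a site outage, to avoid split brain, so it is worth verifying the resulting failover behaviour in a test window.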

May 24th, 2015 6:38am

Hi Michael,

Thank you for your question.

You can run the following command to check the status of all database copies on the mailbox server in Site B:

Get-MailboxDatabaseCopyStatus -Server <siteB mailbox server> | Format-List
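A narrower view of the same output can make the queue values easier to compare. This is a sketch, assuming Hex02 is the Site B mailbox server named in the question:

```powershell
# Assumption: Hex02 is the Site B mailbox server.
# Show only the fields relevant to copy health and queue length.
Get-MailboxDatabaseCopyStatus -Server Hex02 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength,
                 LastInspectedLogTime, ContentIndexState -AutoSize
```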

You can then restart the following services and check whether the issue persists:

Microsoft Exchange Information Store

Microsoft Exchange Replication
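From PowerShell, the two services can be restarted by their short service names. This is a minimal sketch to be run in an elevated shell on the affected server, not a tested procedure for this environment:

```powershell
# MSExchangeRepl is the Microsoft Exchange Replication service.
# Restarting it first is usually the less disruptive step.
Restart-Service -Name MSExchangeRepl

# MSExchangeIS is the Microsoft Exchange Information Store service.
# Restarting it dismounts every database mounted on this server, so plan
# for a short outage if the server holds active copies.
# Add -Force if PowerShell reports dependent services.
Restart-Service -Name MSExchangeIS
```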

This error can have the following causes:

  1. The Cluster service on the server hosting the active copy might be having a problem writing updates, even though the node remains in cluster membership.
  2. The Cluster service on the server hosting the passive copy might be having a problem receiving updates, even though the node remains in cluster membership.
  3. The Information Store service and the Exchange Replication service could be stopped on the server hosting the active copy. (Remember that "active" only identifies the node that owns the copy, not the actual mount/dismount state of the database.)
  4. A datacenter switchover is being performed and more than 12 minutes have elapsed between the time when the failed DAG members were stopped and the time when the remote DAG members were activated.
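The first three causes can be narrowed down by checking the state of the services involved on both DAG members. This is a sketch, assuming Hex01 and Hex02 are the two members from the question and that it is run from Windows PowerShell, where Get-Service still accepts -ComputerName:

```powershell
# ClusSvc is the Windows Cluster service; MSExchangeIS and MSExchangeRepl
# are the Exchange Information Store and Replication services.
Get-Service -ComputerName Hex01, Hex02 -Name ClusSvc, MSExchangeIS, MSExchangeRepl |
    Sort-Object MachineName, Name |
    Format-Table MachineName, Name, Status -AutoSize
```

A service that shows Running can still be unable to read or write cluster updates, so this only rules out the simplest case.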

The following post describes how to resolve it:

http://blogs.technet.com/b/timmcmic/archive/2012/05/30/exchange-2010-the-mystery-of-the-9223372036854775766-copy-queue.aspx

Note: although this post is about Exchange 2010, it also applies to Exchange 2013.

If that does not help, check the Application log for errors and post them to ibsexc@microsoft.com for troubleshooting.

If you have any questions regarding this issue, please feel free to let me know.
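A quick calculation shows why such values are unlikely to be a real log backlog: both the number in the post's title and the one reported in this thread sit just below the largest signed 64-bit integer. The sketch below only does the arithmetic; how Exchange derives the value is not stated in this thread.

```powershell
# Both values fit in a signed 64-bit integer, so plain subtraction works.
$max = [long]::MaxValue          # 9223372036854775807

$max - 9223372036854775766       # value in the post's title -> 41
$max - 9223372036854769389       # value in this thread      -> 6418
```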

Best regards,

Jim

May 25th, 2015 1:58am

