Maximum Allowed Copy Queue Length

In a 3-site DAG, I'm trying to activate a mailbox database copy. Because this copy was artificially suspended at one point, quite a few logs were backed up. As soon as the copy was resumed, I attempted a move operation to this database copy (which was still busy copying and replaying logs). It throws an error, which is understandable: the target mailbox server has the default 'GoodAvailability' value for AutoDatabaseMountDial, and that value is used because I'm not specifying -MountDialOverride on Move-ActiveMailboxDatabaseCopy. This should translate to a maximum of 6 log files permitted in the copy queue; anything above that will fail the move. However, the error that comes back refers to a queue of 10 logs (below).
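As a rough illustration of the check I expected, here is a minimal Python sketch. It only models the thresholds quoted in this thread (0, 6 and 12 logs); the names `MOUNT_DIAL_LIMITS` and `move_allowed` are hypothetical and are not part of Exchange, and the real activation logic may well consider more than the copy queue length.

```python
# Copy queue limits per mount dial setting, as quoted in this thread.
MOUNT_DIAL_LIMITS = {
    "Lossless": 0,
    "GoodAvailability": 6,
    "BestAvailability": 12,
}


def move_allowed(copy_queue_length, server_dial="GoodAvailability", override=None):
    """Hypothetical helper: True if the copy queue is within the dial limit.

    `override` stands in for -MountDialOverride; when it is not given,
    the target server's AutoDatabaseMountDial value is used instead.
    """
    dial = override if override is not None else server_dial
    return copy_queue_length <= MOUNT_DIAL_LIMITS[dial]


if __name__ == "__main__":
    # A copy with 7 queued logs against the default GoodAvailability dial.
    print(move_allowed(7))
    # The same copy with a BestAvailability override.
    print(move_allowed(7, override="BestAvailability"))
```

Under this model, nothing in the dial settings would produce a limit of 10, which is what makes the error message surprising.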

There's no mention of 10 among the values used for the mount dial override behavior. There is one, however, for the DataMoveReplicationConstraint attribute that's set on a mailbox database. The functionality behind it is described here. In this case, each of the database copies is hosted by a mailbox server in a different AD site; the mounted copy and a second copy are healthy, and only the 3rd one is still copying/replicating. The DataMoveReplicationConstraint attribute on the database is set to 'SecondCopy', which means at least one of the passive copies has to:

- Be healthy.
- Have a replay queue within 10 minutes of the replay lag time.
- Have a copy queue length less than 10 logs.
- Have an average copy queue length less than 10 logs. The average copy queue length is computed based on the number of times the application has queried the database status. 
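The four conditions above can be sketched as a simple predicate. This is only a model of the list as written, not how Exchange implements it: the `PassiveCopy` type and its field names are hypothetical, and reading "within 10 minutes of the replay lag time" as "replay delay no more than lag time plus 10 minutes" is my assumption.

```python
from dataclasses import dataclass


@dataclass
class PassiveCopy:
    # All field names are hypothetical, chosen to mirror the list above.
    healthy: bool
    replay_delay_minutes: float   # how far replay is behind (assumed meaning)
    replay_lag_minutes: float     # configured replay lag time
    copy_queue_length: int
    avg_copy_queue_length: float


def copy_meets_constraint(copy):
    """True if a single passive copy satisfies all four listed conditions."""
    return (
        copy.healthy
        and copy.replay_delay_minutes <= copy.replay_lag_minutes + 10
        and copy.copy_queue_length < 10
        and copy.avg_copy_queue_length < 10
    )


def second_copy_satisfied(passive_copies):
    """'SecondCopy' as described here: at least one passive copy qualifies."""
    return any(copy_meets_constraint(c) for c in passive_copies)


if __name__ == "__main__":
    healthy = PassiveCopy(True, 0, 0, 0, 0.0)
    catching_up = PassiveCopy(False, 45, 0, 50, 50.0)
    print(second_copy_satisfied([healthy, catching_up]))
```

In my setup the healthy passive copy alone would satisfy this check, regardless of the state of the copy that is still catching up.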

Yet even though the healthy additional copy meets these conditions, I get the error below when trying to fail over to the copy that's busy copying logs. The state of the mailbox copies after the move operation failed is in the 2nd picture.



July 13th, 2015 6:01pm

Ed, Oleg - that copy was artificially suspended, then resumed, specifically in order to get some logs in the copy queue and test the behavior of the failover based on the AutoDatabaseMountDial.

Oleg - thank you for the Test-ReplicationHealth, I didn't actually know this one existed and it checks quite a few things. Below is the result, run from the mailbox server hosting the copy with queues. As for Test-MRSHealth, this would test the Microsoft Exchange Mailbox Replication service, but to my understanding this is only used for mailbox moves, right?

Just to restate - the DAG isn't broken; all 3 copies are Healthy with 0 copy/replay queues under normal conditions. I was only artificially suspending one of them in order to test the implications of the failover and the various parameters involved. Oleg, the 2 articles you've pasted were actually used as a study guide for my tests. The only open question is why there is a reference to a copy queue limit of 10, when the only numeric values allowed for the dial override behavior are 0 (lossless), 6 (good availability) and 12 (best availability).

July 14th, 2015 8:02am

This topic is archived. No further replies will be accepted.
