Exchange 2013 CU7 - Frequent Outlook 2013 Disconnections and Delays

We currently have the following environment:

Windows 2012 VM on Hyper-V with 24GB of RAM and 4 virtual procs (1 socket)
Cisco UCS, B200 M3 Blades
Exchange 2013 CU7
Exchange 2007 is in our environment, but public folders have been removed and only a few stale, unused accounts exist on it. 
Two dual-role CAS/Mailbox servers
Mailbox servers are in a DAG
MAPI/HTTP is our connection protocol
.NET Framework 4.5.2
Clients are Outlook 2013 SP1 on Windows 7 and 8.1
(Currently round robin DNS, but we have a hardware load balancer we've temporarily taken out of the mix to solve this problem)

What we're seeing is frequent disconnections from Exchange, regardless of the CAS server. That is, both CAS servers will disconnect clients, but not at the same time. When we look at the Exchange boxes, we notice that CPU usage is at 99%-100% each time the disconnections occur. When we hunt down the IIS worker process (w3wp.exe) causing the sudden CPU spike, it traces back to the following app pools:

MSExchangeMAPIFrontEndAppPool
MSExchangeMAPIMailboxAppPool

(We were running RPC/HTTP, but switched to MAPI/HTTP to resolve the problem.  We had the same issue with equivalent RPC app pools at the time.)
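
For anyone wanting to repeat the "which worker process belongs to which app pool" step, here is a minimal sketch. It assumes the text comes from `appcmd list wp` and that each line has the shape `WP "<pid>" (applicationPool:<name>)`. The PIDs in the sample are made up for illustration; they are not from our servers.

```python
import re

# Assumed line shape of `appcmd list wp` output, for example:
#   WP "4312" (applicationPool:MSExchangeMAPIFrontEndAppPool)
WP_LINE = re.compile(r'WP\s+"(\d+)"\s+\(applicationPool:([^)]+)\)')


def map_worker_pids(appcmd_output):
    """Return {pid: app_pool_name} parsed from `appcmd list wp` text."""
    pools = {}
    for line in appcmd_output.splitlines():
        match = WP_LINE.search(line)
        if match:
            pools[int(match.group(1))] = match.group(2)
    return pools


# Hypothetical sample; the PIDs are illustrative only.
SAMPLE = '''WP "4312" (applicationPool:MSExchangeMAPIFrontEndAppPool)
WP "5120" (applicationPool:MSExchangeMAPIMailboxAppPool)
'''

if __name__ == "__main__":
    for pid, pool in sorted(map_worker_pids(SAMPLE).items()):
        print(pid, pool)
```

With the PID-to-pool map in hand, the w3wp.exe PID shown at the top of Task Manager during a spike can be matched to its app pool.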

Eventually, within a minute or two, the CPU load will decrease and the clients connected via that CAS will regain their connection. As you can imagine, the pain is felt more frequently by uncached clients, although everyone, regardless of caching, will see the disconnections.
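
To line up the CPU spikes with user reports, something like the following rough sketch could help. It assumes you already have (timestamp, CPU percent) samples, for example exported from a performance log; the function name, the 99% threshold, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta


def find_cpu_spikes(samples, threshold=99.0):
    """Group consecutive samples at or above threshold into (start, end) windows."""
    spikes = []
    start = last = None
    for ts, cpu in samples:
        if cpu >= threshold:
            if start is None:
                start = ts  # first sample of a new spike
            last = ts
        elif start is not None:
            spikes.append((start, last))  # spike ended on the previous sample
            start = None
    if start is not None:
        spikes.append((start, last))  # spike still running at end of data
    return spikes


# Hypothetical samples taken every 30 seconds.
BASE = datetime(2015, 3, 9, 10, 0, 0)
VALUES = [40.0, 99.5, 100.0, 99.0, 35.0, 100.0]
SAMPLES = [(BASE + timedelta(seconds=30 * i), v) for i, v in enumerate(VALUES)]

if __name__ == "__main__":
    for begin, end in find_cpu_spikes(SAMPLES):
        print(begin.time(), "to", end.time())
```

Comparing these windows per server against the times clients report disconnecting would show whether every disconnection really coincides with a spike on the CAS that client was using.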

Basically, the problem looks exactly like the one described in this KB:  http://support.microsoft.com/kb/2995145  Unfortunately, we already had .NET 4.5.2 installed, and we have already set the environment variable and registry value as shown in the article.  We're in contact with Microsoft Support, but thus far they're scratching their heads.

I'm clearing up all small errors to decrease the noise in the application error logs, but the most persistent one is Event ID 106 (MSExchange Common):

"Performance counter updating error. Counter name is Time in Resource per second, category name is MSExchange Activity Context Resources. Optional code: 2. Exception: The exception thrown is : System.InvalidOperationException: Instance 'ad-powershell-defaultdomain' already exists with a lifetime of Process.  It cannot be recreated or reused until it has been removed or until the process using it has exited."

Reloading and recreating the performance counters, whether done by Microsoft Support or by us, does not fix the error.  Of course, this error existed before we started having issues; I'd just like to clean it up in case it's a contributing factor.
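
To see whether it is always the same counter and instance behind Event ID 106, the exported event messages could be tallied with a small sketch like this. It assumes every message follows the wording quoted above; the function and variable names are mine, not anything from Exchange.

```python
import re
from collections import Counter

# Patterns based on the Event ID 106 wording quoted in this post.
COUNTER_RE = re.compile(
    r"Counter name is (?P<counter>.+?), category name is (?P<category>.+?)\. "
    r"Optional code: (?P<code>\d+)\."
)
INSTANCE_RE = re.compile(r"Instance '(?P<instance>[^']+)'")


def parse_event_106(message):
    """Return counter, category, code and instance from a message, or None."""
    head = COUNTER_RE.search(message)
    if head is None:
        return None
    fields = head.groupdict()
    inst = INSTANCE_RE.search(message)
    fields["instance"] = inst.group("instance") if inst else None
    return fields


def tally_instances(messages):
    """Count how often each (category, counter, instance) combination occurs."""
    counts = Counter()
    for message in messages:
        fields = parse_event_106(message)
        if fields:
            counts[(fields["category"], fields["counter"], fields["instance"])] += 1
    return counts


EXAMPLE = (
    "Performance counter updating error. Counter name is Time in Resource per "
    "second, category name is MSExchange Activity Context Resources. Optional "
    "code: 2. Exception: The exception thrown is : "
    "System.InvalidOperationException: Instance 'ad-powershell-defaultdomain' "
    "already exists with a lifetime of Process."
)

if __name__ == "__main__":
    print(tally_instances([EXAMPLE, EXAMPLE]).most_common())
```

If one instance name dominates the tally, that narrows down which process is holding the counter instance.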

Has anyone seen anything similar? 


  • Edited by MAMP2 Monday, March 09, 2015 10:39 PM Misspelling/Added Link
March 9th, 2015 10:27pm

Hello! We are seeing a similar problem. Were you ever able to resolve this?

What type of load balancer (if any) are you using?

April 2nd, 2015 12:52pm

We're using a Barracuda load balancer (641), but we've eliminated that as a source of the problem.  The CPU spikes and disconnections happen whether or not the load balancer is in use.

This problem is still ongoing, and we're still working with Microsoft Support.  Right now they're focused on our environment, but having combed through it numerous times, there's nothing that stands out. I suspect this is a bug in CU7 (yet I would readily accept our environment being at fault if it means we can identify and resolve the problem), but that's not the road Support is going down.

We've looked at storage I/O on the VM hosts (disk response times are 10 ms or less), doubled the virtual CPUs on each mail server (to 8 each), disabled TCP chimney offload, run numerous Experfwiz log collections, etc.  Now the tech is focused on a particular mailbox database being the culprit, because it also uses more CPU when the MAPI app pools are running high.  I suspect that it is a symptom and not the cause, but I have no choice but to follow his lead.

What is your environment like? Are you also CU7?  And did the problem show itself after a cumulative update was applied?

April 2nd, 2015 1:40pm

I started to worry about mailbox-to-database ratio as well, but our largest database has 400 mailboxes. (Admittedly, one mailbox ballooned to 35GB, but that's since been reduced to 6GB, and even that was an outlier.) We've been pretty conservative in our mailbox/database planning, with nowhere near 1,200 mailboxes per database, much less 10,000.  Our load, frankly, is not that exacting. Unless there's an oddity with network drivers that only made an appearance after CU7, I still suspect a bug in CU7.

I did remove Exchange 2007 from our environment this weekend, so we are no longer mixed.  That has reduced the weird event log spam (client access server couldn't reach a 2010 Exchange server [an Exchange server version we never had], and such), but I won't be able to determine whether this had an effect on disconnections until users start piling into the office on Monday and the load increases. (For the record, I doubt it will.  That being said, I do appreciate a lighter event log.)

I'd love to blame it on the load balancer, but our problems exist in round robin DNS as well.  So, our journey with MS support marches on.


  • Edited by MAMP2 18 hours 40 minutes ago
April 5th, 2015 8:46am


This topic is archived. No further replies will be accepted.
