High disk utilization every 24 hours

Hi all,

While troubleshooting an user's complain about slow access to his mailbox on a Windows 2012/Exchange 2013 environment, I came across this interesting behavior:

Disk utilization on the server will increase from 100KBps to 3000KBps (on average) and stay high for a good 10 hours or so. Then disk activity will go down for exactly 24 hours from the time it was up and increase again for another cycle.


Exchange 2013

Basically, no matter how long it takes to go down, it will increase again 24 hours later. Looking at Perfmon and Resource Manager all I could see during both periods is that store.worker.exe is always on top, and READ has higher values. The issue does not appear to be related to business hours / backup window / client access to exchange, as it occurs at different times of the day and changes everyday.

I believe it has something to do with an internal maintenance/reporting/monitoring feature of Exchange 2013, as it happens on 3 Exchange environments we manage, completely apart from each other.

Microsoft Exchange Health Manager service has already been disabled as well as scheduled tasks under Microsoft/Windows/PLA and they all proved not to make a difference.

Any ideas out there?

December 6th, 2013 8:29am

Hi,

Do you have any backup program running at the same time

Considering your original issue, I would like to ask whether the performance issue occur to all the users or only a single user. If it occurs to a single user, then troubleshooting the database issue may not be the right direction.

Thanks,

Simon

Exchange Forum Support

Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

Free Windows Admin Tool Kit Click here and download it now
December 9th, 2013 12:04pm

Hi Simon,

The issue was first noticed by a user, but at the moment I am more interested in understanding why Exchange server's disk utilization increases that much every 24 hours or so. We do have backup software running, but if that was the case, issue would happen at the same time everyday, which is not the case.

As per the chart, you sometimes see high values during the day and sometimes during the night. It will go up after 24 hours it went down last time.

December 10th, 2013 2:27am

Hi,

Please check other counters and try to find the ones which has the same table as this disk I/O table.

Thanks,

Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Subscriber Support, contact tnsfl@microsoft.com

Free Windows Admin Tool Kit Click here and download it now
December 14th, 2013 10:59am

Disk read is the only counter that behaves this way. Narrowing it down, I see the disk which has the .edb files is having the long spikes. CPU / RAM / Network are all good with low values during all monitored period.

It seems to me that Exchange runs some kind of a maintenance task which reoccurs after 24 hours it last finished. The process it self (if any) does not hammer the server, but instead Store.worker is the one affected.

December 16th, 2013 3:35am

Any updates to this? Experiencing the same issues.
Free Windows Admin Tool Kit Click here and download it now
April 24th, 2014 2:41pm

We have the same issue. Exchange 2013 Service Pack 1 RU6 running on Windows Server 2012 R2. Every 24 hours we see a huge spike in disk queuing (going from < 1 to 50 or 60. More reads than writes usually. It lasts about an hour to two hours. It corresponds with exactly 24 hours from the last time a server was restarted. We see it on both nodes of our DAG. It is happening in 3 separate DAG's. We have disabled Search/Noderunner and the spike still happens. The process that appears to be generating the queue is Microsoft.Exchange.Store.Worker.exe.

Also - we have eliminated backups, virus scan software, etc all 3rd party stuff is out.

Has anyone found anything to explain the spikes?

 
February 4th, 2015 3:16pm

We have the same issue. Exchange 2013 Service Pack 1 RU6 running on Windows Server 2012 R2. Every 24 hours we see a huge spike in disk queuing (going from < 1 to 50 or 60. More reads than writes usually. It lasts about an hour to two hours. It corresponds with exactly 24 hours from the last time a server was restarted. We see it on both nodes of our DAG. It is happening in 3 separate DAG's. We have disabled Search/Noderunner and the spike still happens. The process that appears to be generating the queue is Microsoft.Exchange.Store.Worker.exe.

Also - we have eliminated backups, virus scan software, etc all 3rd party stuff is out.

Has anyone found anything to explain the spikes?

 

What do you see in the event logs when its spiked?

Im guessing its related to some process running under the "MSExchange Assistants" 

Free Windows Admin Tool Kit Click here and download it now
February 4th, 2015 3:44pm

We have the same issue. Exchange 2013 Service Pack 1 RU6 running on Windows Server 2012 R2. Every 24 hours we see a huge spike in disk queuing (going from < 1 to 50 or 60. More reads than writes usually. It lasts about an hour to two hours. It corresponds with exactly 24 hours from the last time a server was restarted. We see it on both nodes of our DAG. It is happening in 3 separate DAG's. We have disabled Search/Noderunner and the spike still happens. The process that appears to be generating the queue is Microsoft.Exchange.Store.Worker.exe.

Also - we have eliminated backups, virus scan software, etc all 3rd party stuff is out.

Has anyone found anything to explain the spikes?

 
February 4th, 2015 11:11pm

Good Call. I just spent the last few hours going over 550,000 events. 

I reviewed the application log for a 24 hour period and here is what I noticed:

During the high disk queue period there are lots of entries for MsExchangeMailboxAssistants. For each of the 15 mailbox databases. 

These only run during the High Disk Queues window.

TopN Words
Store DS Maintenance
Sharing Policy
SharePoint Signal Store
Search Index Repair Time Based 
People-centric triage 
People Relevance
Managed Folder Mailbox 
Mailbox Processor
Mailbox Association Replication
Calendar Repair
People Relevance Feeder
Orphaned Mailbox Purge

Here are the examples of the log entries for Event ID 9017

Service MSExchangeMailboxAssistants. TopN Words Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Store DS Maintenance Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Sharing Policy Assistant. for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. SharePoint Signal Store Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Search Index Repair Time Based Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. People-centric triage for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. People Relevance Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Managed Folder Mailbox Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Mailbox Processor Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Mailbox Association Replication Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Calendar Repair Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

There are 3039 of these messages as well:

The people relevance feeder for mailbox *** picked 16 sent items to process. Number of sent items successfully processed 16 and number of sent items failed to be processed 0. Total time taken to process this mailbox 4539.8548.

 

It appears the purge of orphaned mailboxes happens only at this time as well .

 

 

These run throughout the day and not just 24 hours after the last restart/switchover.

Calendar Sync 
Junk Email Options 
Probe Time Based 
Store Maintenance 

Examples of the Event ID 9017

Service MSExchangeMailboxAssistants. Calendar Sync Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Junk Email Options Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Probe Time Based Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Store Maintenance Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

So how do I get the first group of maintenance tasks to run throughout the day, or at different times, or at least not all at the same time for all 15 databases?

Thanks!! 

Free Windows Admin Tool Kit Click here and download it now
February 6th, 2015 1:11pm

Good Call. I just spent the last few hours going over 550,000 events. 

I reviewed the application log for a 24 hour period and here is what I noticed:

During the high disk queue period there are lots of entries for MsExchangeMailboxAssistants. For each of the 15 mailbox databases. 

These only run during the High Disk Queues window.

TopN Words
Store DS Maintenance
Sharing Policy
SharePoint Signal Store
Search Index Repair Time Based 
People-centric triage 
People Relevance
Managed Folder Mailbox 
Mailbox Processor
Mailbox Association Replication
Calendar Repair
People Relevance Feeder
Orphaned Mailbox Purge

Here are the examples of the log entries for Event ID 9017

Service MSExchangeMailboxAssistants. TopN Words Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Store DS Maintenance Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Sharing Policy Assistant. for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. SharePoint Signal Store Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Search Index Repair Time Based Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. People-centric triage for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. People Relevance Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Managed Folder Mailbox Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Mailbox Processor Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Mailbox Association Replication Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Calendar Repair Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

There are 3039 of these messages as well:

The people relevance feeder for mailbox *** picked 16 sent items to process. Number of sent items successfully processed 16 and number of sent items failed to be processed 0. Total time taken to process this mailbox 4539.8548.

 

It appears the purge of orphaned mailboxes happens only at this time as well .

 

 

These run throughout the day and not just 24 hours after the last restart/switchover.

Calendar Sync 
Junk Email Options 
Probe Time Based 
Store Maintenance 

Examples of the Event ID 9017

Service MSExchangeMailboxAssistants. Calendar Sync Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Junk Email Options Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Probe Time Based Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

Service MSExchangeMailboxAssistants. Store Maintenance Assistant for database *** is entering a work cycle. There are *** mailboxes on this database.

So how do I get the first group of maintenance tasks to run throughout the day, or at different times, or at least not all at the same time for all 15 databases?

Thanks!! 

Well, its all workload based, so I don't think you can readily adjust that.

I'm more inclined to look at the specs of the mailbox servers. Are they up to the task and have they been vetted with the Mailbox Calculator?

February 8th, 2015 8:47am

I think we are more than over spec'd.

We were running fine on just 2 Exchange servers in a DAG when we were on Exchange 2010. We were then running on 2 Dell Power Edge 520's that are almost 4 years old now. The Exchange Servers were Virtuals on these hosts.

We installed fresh on 2 new Dell Power Edge R720xd servers.  Dual E5-2670 CPU's, 196 GB of Memory. Dedicated drives for the OS of the Hyper-V host and 12 SAS 3 TB hard drives in a Raid 6 for the disk array for the virtual.

The Exchange Servers are installed as Generation 2 Virtual Hosts with 128 GB of memory.

We experienced this slowness issue and thought we may be underspec'd somehow - and so we added 2 more Power Edge R720xd servers - and still see the same problem. The virtuals run great 23 hours out of the day. Just the one hour when all the maintenance tasks run at the same time for all the databases that we run into crazy high disk queues.

I guess I am confused by your statement - that it is workload based? What does that mean? I am wondering why I can't schedule the maintenance to happen at specific times - not just 24 hours after the last restart. Also , why can't I schedule certain databases to run maintenance at separate times - instead of all 15 at once?

I have been working with Microsoft Support on this for the past 2 months - they don't have an answer yet. We have gone over specs and there only recommendation is to separate all the databases out on to separate disk. I am confused by this. 

Free Windows Admin Tool Kit Click here and download it now
February 9th, 2015 3:28pm

Hi,

same Problem here (Exchange 2013 CU6 / DAG Environment). Every 24 hours (depands on Servers boot time) i have really high CPU Load and IO for about 30 minutes!

I'm really sure that it has something to do with the Service "MSExchangeMailboxAssistants.exe".

If you check the Logfiles for this Service "at Load time" you see massive growth of actions: 

C:\Program Files\Microsoft\Exchange Server\V15\Logging\MailboxAssistantsLog

The Configuration seems to be the on "Mailbox Server" => "Work Cycle" Basis: 

get-mailboxserver SERVERNAME |select *work*

I tried to set a dedicated Maintanence Window (in the night) for this MSExchangeMailboxAssistants:

set-mailboxserver SERVERNAME -ManagedFolderAssistantSchedule

=> But the Problem still occurs. (I did not have tried to reset all services after reconfiguration ManagedFolderAssistantSchedule yet)

Any help would be appreciated. 

Thanks, 

Martin 


March 22nd, 2015 7:17am

So we opened a case with Microsoft and you cannot set the work cycles to a specific time. You can just set to run at set intervals. We played with all the different work cycles and found that it was the ManagedFolderWorkCycle that was generating the high IO for us. It would run for all the databases at the same time every 24 hours.

Microsoft said we couldn't schedule this, nor could we set it to only run for one database at a time. It is a feature that it runs on all databases at the same time at whatever interval you set to have it run after the latest reboot.

The default is to run every 24 hours. I found that if I ran it every 1 hour - the load was divided up across the 24 times that it ran throughout the day and the IO wasn't as bad. Not a good solution - but the only one available to me.

So here was the script I ran:

Set-MailboxServer 'servername' -ManagedFolderWorkCycle 0.01:00:00 -ManagedFolderWorkCycleCheckpoint 1.00:00:00

It may be one of the other work cycles that is causing you grief - so you can adjust the schedules for each and see when it spikes.

Here is a script to see what each is set to

Get-MailboxServer "servername" | fl *cycle*

Good luck!

Free Windows Admin Tool Kit Click here and download it now
March 23rd, 2015 10:34am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics