W3WP.exe consuming 91 - 99% CPU every 5 minutes
Exchange 2007 (8.1) running on a Windows Server 2003 SP2 CAS and Hub Transport server. I've already tried restarting the W3WP.EXE process, but that is only a temporary fix: the high CPU consumption returns within 5 minutes. We use Exchange ActiveSync, and whenever W3WP.EXE is left at high CPU consumption for a long period we receive reports that EAS devices (Apple and Android) have trouble sending and receiving their corporate email. Any advice on diagnosing the cause of this issue would be greatly appreciated. -Mon
December 5th, 2011 7:39pm

You should have Exchange 2007 SP3 plus the latest rollups installed.
December 5th, 2011 10:02pm

On Mon, 5 Dec 2011 19:02:31 +0000, A_D_ wrote:
>You should have Exchange 2007 SP3 plus the latest rollups installed.

...and updated firmware on the mobile devices!
---
Rich Matheisen MCSE+I, Exchange MVP
December 6th, 2011 4:53am

Upgrading to Exchange 2007 SP3 is something I'm going to work on, but I cannot do it anytime soon. I'm using Exchange User Monitor to try to identify user names with high CPU% so that I can verify they have the latest firmware. Our environment has hundreds of users on ActiveSync. Is there a reasonably easy way to find out the iOS / Android versions of the connected users? -Mon
December 6th, 2011 3:13pm

On Tue, 6 Dec 2011 20:05:01 +0000, Monarlais wrote:
>Upgrading to Exch 2007 SP3 is something I'm going to work on but cannot perform anytime soon. I'm using Exchange User Monitor to attempt to identify user names with high CPU% so that I can verify they have the latest firmware.
>
>Our environment has hundreds of users on ActiveSync. Is there a reasonably easy way to find out the iOS / Android versions of the connected users?

This probably hasn't been updated with new versions of iOS, but it should give you an idea of what's going on. It does, of course, assume that you have the IIS logs to parse:
http://gsexdev.blogspot.com/2010/08/parsing-iis-log-activesync-traffic-for.html
---
Rich Matheisen MCSE+I, Exchange MVP
December 6th, 2011 5:59pm
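
As a rough illustration of the log-parsing approach suggested above (this is not the script from the linked blog post), the Python sketch below groups ActiveSync devices by the User-Agent string they send. It assumes W3C-format IIS logs whose "#Fields:" header includes cs-uri-stem, cs-uri-query and cs(User-Agent), and that the ActiveSync query string carries User and DeviceId parameters. The function name devices_by_user_agent and every line of the sample log are made up for the example.

```python
from collections import defaultdict
from urllib.parse import parse_qs

EAS_STEM = "/microsoft-server-activesync"

def devices_by_user_agent(lines):
    """Map each User-Agent string to the set of (user, device id) pairs seen."""
    fields = []
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the data lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split(" ")
        if len(values) != len(fields):
            continue  # skip lines that do not match the header
        row = dict(zip(fields, values))
        if row.get("cs-uri-stem", "").lower() != EAS_STEM:
            continue  # not an ActiveSync request
        query = parse_qs(row.get("cs-uri-query", ""))
        user = query.get("User", ["?"])[0].lower()
        device = query.get("DeviceId", ["?"])[0]
        seen[row.get("cs(User-Agent)", "-")].add((user, device))
    return seen

# Hypothetical sample data, shaped like a W3C IIS log (assumption).
SAMPLE_LOG = """\
#Fields: date time cs-method cs-uri-stem cs-uri-query cs-username cs(User-Agent) sc-status
2011-12-06 20:01:10 POST /Microsoft-Server-ActiveSync User=alice&DeviceId=ApplDEVICE1&DeviceType=iPad&Cmd=Ping alice Apple-iPad2C1/901.405 200
2011-12-06 20:01:12 GET /owa/ - carol Mozilla/4.0 200
2011-12-06 20:01:15 POST /Microsoft-Server-ActiveSync User=bob&DeviceId=androidc123&DeviceType=Android&Cmd=Sync bob Android/2.3 200
2011-12-06 20:02:40 POST /Microsoft-Server-ActiveSync User=alice&DeviceId=ApplDEVICE1&DeviceType=iPad&Cmd=Sync alice Apple-iPad2C1/901.405 200
""".splitlines()

if __name__ == "__main__":
    for agent, devices in sorted(devices_by_user_agent(SAMPLE_LOG).items()):
        print(agent, len(devices))
```

Against real logs the same function could be fed an open log file instead of SAMPLE_LOG. Translating a User-Agent string into an OS version is left out on purpose, since that mapping is not given in this thread.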

Upgraded the environment to SP3 and the latest rollup (5); however, the issue is still occurring. Using PowerShell I don't see any clients using iOS 4.0, which I understand is known to cause problems.

I analyzed a memory dump collected while W3WP.exe was spiked. I did this using MS Debugger Tool version 1.2.

The result was very odd. It shows that several thousand client connections have been executing requests for more than 90 seconds. However, we only have about 450 unique devices on ActiveSync in our environment. Why would there be several thousand connections executing requests for longer than 90 seconds?

Looking at the list of user names and device IDs of the client connections, I see multiple instances of "repeat offenders", although I'm not sure what this means in relation to my problem.

Furthermore, it shows that 14% of w3wp threads were blocked because they were waiting for .NET garbage collection to finish. Using the .NET memory counters I've gathered the following:

# Gen 0 Collections - 70,000
# Gen 1 Collections - 17,500
# Gen 2 Collections - 3,000
# Induced GC - 2,700
% Time in GC - varies between 0.4 and 1.3

I'm not sure whether garbage collection is a cause of this issue. I'm equally concerned about the high number of client connections executing requests.
December 10th, 2011 12:26pm
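
To put the counter values quoted above into proportion, here is a small Python sketch of the arithmetic. The helper name gc_ratios is made up. The last figure rests on an assumption: that the induced collections were full (gen 2) collections, which is what a parameterless GC.Collect() call requests. The counters alone do not confirm that.

```python
def gc_ratios(gen0, gen1, gen2, induced):
    """Return simple ratios between the .NET GC counters quoted in the post."""
    return {
        "gen0_per_gen1": round(gen0 / gen1, 2),
        "gen1_per_gen2": round(gen1 / gen2, 2),
        # Assumption: every induced GC was a gen 2 collection.
        "induced_share_of_gen2": round(induced / gen2, 2),
    }

if __name__ == "__main__":
    for name, value in gc_ratios(70_000, 17_500, 3_000, 2_700).items():
        print(name, value)
```

The ratios come out at roughly 4 gen 0 collections per gen 1 and just under 6 gen 1 collections per gen 2, both lower than the ten-to-one spacing that is often quoted as a rule of thumb. Under the stated assumption, as many as 90% of the gen 2 collections could have been explicitly induced. On its own that does not show GC is the cause of the CPU load, particularly with % Time in GC reported at only 0.4 to 1.3.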

On Sat, 10 Dec 2011 17:21:05 +0000, Monarlais wrote:
>Upgraded the environment to SP3 and the latest roll-up (5) however the issue is still occurring. Using powershell I don't see any clients using iOS 4.0 which I understand is known to cause problems.
>
>I did perform an analysis of a memory dump collected while W3WP.exe was spiked. I did this using MS Debugger Tool version 1.2
>
>The result was very odd. It shows several thousand client connections have been executing requests for more than 90 seconds. However, we only have about 450 unique devices on ActiveSync in our environment. Why would there be several thousand connections executing requests longer than 90 seconds??

If the devices are using the schedule "As items arrive" they maintain a constant connection to the server. If those connections are broken the device will initiate a new connection. The "abandoned" connection will eventually be reset.

>Looking at the list of user names and device IDs of the client connections I see multiple instances of "repeat offenders" although I'm not too sure what this means in relation to my problem.

It may mean that your firewall is terminating "inactive" connections before ActiveSync terminates them.
http://support.microsoft.com/kb/905013

>Furthermore, it shows that 14% of w3wp threads were blocked because they were waiting for .NET garbage collection to finish. Using .NET Memory counters I've gathered the following:
>
># Gen 0 Collections - 70,000
># Gen 1 Collections - 17,500
># Gen 2 Collections - 3,000
>#Induced GC - 2,700
>% Time in GC - varies between 0.4 to 1.3
>
>I'm not sure if garbage collection is a cause of this issue or not.

I don't think so, but it may be causing other issues.

>Also I'm equally concerned about the high number of client connections executing requests.

---
Rich Matheisen MCSE+I, Exchange MVP
December 10th, 2011 1:15pm

>>Looking at the list of user names and device IDs of the client connections I see multiple instances of "repeat offenders" although I'm not too sure what this means in relation to my problem.
>
>It may mean that your firewall is terminating "inactive" connections before ActiveSync terminates them.
>http://support.microsoft.com/kb/905013
>---
>Rich Matheisen MCSE+I, Exchange MVP

So, to make sure I understand this correctly, let's use the following example:

1) My firewall time-out value for HTTP(S) requests to the Exchange Server Microsoft-Server-ActiveSync virtual directory is set to 3 minutes.

2) The ActiveSync heartbeat intervals are:
Min: 60 (1 minute)
Max: 1750 (roughly 30 minutes)

A user makes a client connection request and opens a session. The firewall terminates this connection after 3 minutes. If the same user makes another HTTP(S) request to ActiveSync at 4 minutes, will it open a new session even though the other one still exists? Does the old session remain open until the maximum heartbeat interval closes it? Over time, would this lead to a large number of open sessions?

Please let me know if I am not understanding this correctly.
December 10th, 2011 5:06pm

On Sat, 10 Dec 2011 22:00:56 +0000, Monarlais wrote:
>>>Looking at the list of user names and device IDs of the client connections I see multiple instances of "repeat offenders" although I'm not too sure what this means in relation to my problem.
>>
>>It may mean that your firewall is terminating "inactive" connections before ActiveSync terminates them.
>>http://support.microsoft.com/kb/905013
>
>So to make sure I understand this correctly let's use the following example:
>
>1) My firewall time-out value for HTTP(S) requests to the Exchange Server Microsoft-Server-ActiveSync virtual directory is set to 3 minutes
>
>2) ActiveSync heartbeat intervals are:
>
>Min: 60 (1 minute)
>
>Max: 1750 (roughly 30 minutes)
>
>A user makes a client connection request and opens a session. The firewall terminates this connection after 3 minutes. At 4 minutes if the same user attempts to make an HTTP(S) request to ActiveSync then it will open a new session even though the other one exists?

"Another one" isn't the same one as the "new one". You'll find the same sort of thing with Outlook users that use WiFi and move around from one Access Point to another. The "old" connection is abandoned (not terminated) and then a new one is created from a new Access Point. That new connection isn't going to be able to use the same socket so there's no way to say "Hey -- I'm really reusing this old session!"

>The old session remains open until the maximum heartbeat interval closes it? Over time this would lead to numerous amounts of open sessions?

Pretty sure. Why not increase the HTTP/S timeout and see what happens?
---
Rich Matheisen MCSE+I, Exchange MVP
December 10th, 2011 5:43pm
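
The buildup described in the exchange above can be put into numbers with a deliberately simple model, sketched here in Python. It assumes every device reconnects the moment the firewall drops its idle connection, and that the server holds each abandoned request for a fixed lifetime (taken here to be the maximum heartbeat interval). Neither assumption has been verified against Exchange, and the function name steady_state_connections is made up for the example.

```python
def steady_state_connections(devices, request_lifetime_s, firewall_timeout_s):
    """Estimate server-side requests once abandoned ones start expiring.

    Toy model: each device opens a new request every firewall_timeout_s
    seconds, and the server keeps every request for request_lifetime_s.
    """
    if firewall_timeout_s >= request_lifetime_s:
        # The firewall never cuts a request short: one request per device.
        return float(devices)
    return devices * request_lifetime_s / firewall_timeout_s

if __name__ == "__main__":
    # Values from the thread: 450 devices, 1750 s max heartbeat,
    # 3 minute (180 s) firewall time-out.
    print(steady_state_connections(450, 1750, 180))  # 4375.0
```

Under these assumptions, 450 devices would account for a little over four thousand pending requests, which is at least the same order of magnitude as the "several thousand" connections seen in the memory dump. If the firewall time-out is longer than the request lifetime, the model falls back to one request per device.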

After adjusting the heartbeat intervals, the problem appears unchanged.

The W3WP.exe CPU usage rises just as fast as it did before. Taking more memory dumps of the process along the way, I see that not all of the connections are dropped by the time the MaxHeartbeatInterval period has elapsed.

The number of client connections just keeps growing after each restart of W3WP. I even tried adding the "executionTimeout" attribute to the web.config to see whether that would stabilize the number of client connections. This attempt was unsuccessful.

With a min heartbeat of 120 seconds, a max heartbeat of 300 seconds and an executionTimeout of 90 seconds, I'm still seeing client connections whose "Time Alive" field in the W3WP.exe memory dump contains values of 30+ minutes. In addition, these clients keep creating duplicate connections, so the number of client connections grows without bound until the W3WP process crashes.
December 11th, 2011 3:46am

On Sun, 11 Dec 2011 08:40:30 +0000, Monarlais wrote:
>After adjusting the heartbeat intervals it appears the problem remains unchanged.
>
>The W3WP.exe process rises just as fast as it did before. Performing more memory dumps of the service along the way I see that all of the connections are not dropping before the MaxHeartbeatInterval time period.
>
>The number of client connections only continues to grow with each restart of the W3WP service. I've even tried adding the "executionTimeout" variable to the web.config to see if that would stabilize the number of client connections. This attempt was unsuccessful.
>
>With a min heartbeat of 120 seconds, max heartbeat of 300 and an executionTimeout of 90 I'm still seeing client connections whose "Time Alive" field within the W3WP.exe memory dump contain values of 30+ minutes. In addition, these clients continue to create duplicate connections over time resulting in an infinitely increasing number of client connections until the W3WP process just crashes.

Have you increased the timeout for your firewall, too?
---
Rich Matheisen MCSE+I, Exchange MVP
December 11th, 2011 1:52pm

Rich, Thanks for your assistance. After much troubleshooting and investigation I've traced this problem down to a single iPad-2 device. I'm not sure what is special about this iPad but I am 100% certain it is the device. The iPad-2 is on the latest build of iOS 5 (5.0.1) which is not a unique setup on this network. It's very odd and will definitely require more investigation but that's all the news I have.
December 14th, 2011 10:28am

>Rich, Thanks for your assistance. After much troubleshooting and investigation I've traced this problem down to a single iPad-2 device. I'm not sure what is special about this iPad but I am 100% certain it is the device. The iPad-2 is on the latest build of iOS 5 (5.0.1) which is not a unique setup on this network. It's very odd and will definitely require more investigation but that's all the news I have.

Monarlais,

How did you find out which device was the offending one? We have the same issue here and I would like to find out the same.

Thanks in advance for your help.

Koenraad.
December 19th, 2011 11:20am

>>Rich, Thanks for your assistance. After much troubleshooting and investigation I've traced this problem down to a single iPad-2 device. I'm not sure what is special about this iPad but I am 100% certain it is the device. The iPad-2 is on the latest build of iOS 5 (5.0.1) which is not a unique setup on this network. It's very odd and will definitely require more investigation but that's all the news I have.
>
>Monarlais,
>
>How did you find out which device was the offending one? We have the same issue here and I would like to find out the same.
>
>Thanks in advance for your help.
>
>Koenraad.

We have been able to find the offending user by checking the IIS logs and monitoring the MBX servers with ExMon. In our case, it was an iPhone 4S with the latest iOS 5.0.1, which is also not a unique setup on our network. It seems that Microsoft and/or Apple will need to come up with a fix.
December 21st, 2011 1:09pm
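
For the IIS-log half of that approach, one possible starting point is to count ActiveSync requests per user and device and look at the busiest entries. The Python sketch below is an assumption about how that could be done, not the method actually used in this thread. It simply looks for the log token that contains "DeviceId=", so it does not depend on the column order of the log. The function name requests_per_device and all sample lines are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs

def requests_per_device(lines):
    """Count log lines per (user, device id), based on the query-string token."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # W3C header and comment lines
        for token in line.split():
            if "DeviceId=" in token:
                query = parse_qs(token)
                user = query.get("User", ["?"])[0].lower()
                device = query.get("DeviceId", ["?"])[0]
                counts[(user, device)] += 1
                break  # at most one query string per log line
    return counts

# Hypothetical sample data: one device issuing far more requests than another.
SAMPLE_LINES = (
    ["#Fields: date time cs-method cs-uri-stem cs-uri-query cs-username sc-status"]
    + [
        f"2011-12-21 10:00:0{i} POST /Microsoft-Server-ActiveSync "
        "User=dave&DeviceId=ApplPHONE1&DeviceType=iPhone&Cmd=Sync dave 200"
        for i in range(4)
    ]
    + [
        "2011-12-21 10:00:07 POST /Microsoft-Server-ActiveSync "
        "User=erin&DeviceId=androidc456&DeviceType=Android&Cmd=Ping erin 200"
    ]
)

if __name__ == "__main__":
    for (user, device), hits in requests_per_device(SAMPLE_LINES).most_common(5):
        print(user, device, hits)
```

A device that tops this list by a wide margin is only a candidate. Cross-checking the user against ExMon, as described above, would still be needed before blaming a specific device.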

On Wed, 21 Dec 2011 18:02:45 +0000, connebeest wrote:
>Rich,
>
>Thanks for your assistance. After much troubleshooting and investigation I've traced this problem down to a single iPad-2 device.
>
>I'm not sure what is special about this iPad but I am 100% certain it is the device. The iPad-2 is on the latest build of iOS 5 (5.0.1) which is not a unique setup on this network. It's very odd and will definitely require more investigation but that's all the news I have.
>
>Monarlais,
>
>How did you find out which device was the offending one?
>
>We have the same issue here and I would like to find out the same.
>
>Thanks in advance for your help.
>
>Koenraad.
>
>We have been able to find the offending user by checking the IIS logs and monitoring the MBX servers with ExMon. In our case, it was an iPhone 4S with the latest iOS 5.0.1. Also not a unique setup on our network. Seems like Microsoft and/or Apple will need to come up with a fix.

A fix for what?
---
Rich Matheisen MCSE+I, Exchange MVP
December 21st, 2011 5:26pm

>On Wed, 21 Dec 2011 18:02:45 +0000, connebeest wrote:
>>Rich,
>>
>>Thanks for your assistance. After much troubleshooting and investigation I've traced this problem down to a single iPad-2 device.
>>
>>I'm not sure what is special about this iPad but I am 100% certain it is the device. The iPad-2 is on the latest build of iOS 5 (5.0.1) which is not a unique setup on this network. It's very odd and will definitely require more investigation but that's all the news I have.
>>
>>Monarlais,
>>
>>How did you find out which device was the offending one?
>>
>>We have the same issue here and I would like to find out the same.
>>
>>Thanks in advance for your help.
>>
>>Koenraad.
>>
>>We have been able to find the offending user by checking the IIS logs and monitoring the MBX servers with ExMon. In our case, it was an iPhone 4S with the latest iOS 5.0.1. Also not a unique setup on our network. Seems like Microsoft and/or Apple will need to come up with a fix.
>
>A fix for what?
>---
>Rich Matheisen MCSE+I, Exchange MVP

I meant a fix for the high CPU usage caused by Apple devices. In our case, though, it was solved by recreating the synchronisation on the offending iPhone. I would have thought it was a more profound issue.
December 23rd, 2011 11:01am

This topic is archived. No further replies will be accepted.
