Agent Processor Time Alert (Network Steve Forum)

Hello,

After installing the latest Operations Manager 2007 R2 management packs, I started getting an alert intermittently. It only appears on low-activity servers (servers that don't do much, or that just sit idle all day). Has anyone else seen this? Is it really an issue, or is there some tuning I should do?

Alert: The Operations manager agent processes are using too much processor time
Description: The total processor utilization of all agent processes has exceeded the threshold over multiple samples.

I am not sure exactly how the monitor works, but my thought is that, since everything else on the box is doing nothing (or almost nothing), the ratio for the OpsMgr agent may be falsely shown as too high even though it really isn't a problem. Thoughts? Thanks!

March 12th, 2010 12:53am

Hi,

The monitor configuration information is built into the product knowledge. It says that the default threshold is 25% of CPU time for 6 consecutive samples (a sample is taken every 5 minutes).

Check out the recommendations in the text below; there is a lot there. Also consider enabling the diagnostic task to get more information on CPU utilisation in the alert.

Matt

Here is the product knowledge detail:
===============================================================================
Summary

This monitor calculates the total CPU utilization of the Operations Manager agent and its related processes, and generates an alert when CPU utilization exceeds a specified threshold for a specified number of consecutive samples. The monitor’s underlying script works by locating and sampling the CPU utilization for the Operations Manager agent process (HealthService.exe), its child monitoring host process (MonitoringHost.exe) and the child processes of those monitoring host processes (cscript.exe, PowerShell.exe, etc.). The script runs the calculation three times and outputs the average of the three consecutive samples, which is then used by this monitor to determine critical or healthy state.

Configuration

You can use overrides to customize the following parameters to alter the default behavior of this monitor:
• Frequency (seconds). This is the frequency at which the monitor samples agent processor utilization. By default, the monitor evaluates the agent processor utilization every 300 seconds (5 minutes).
• Number of consecutive samples for critical state. By default, this monitor reports a critical state when 6 consecutive samples exceed the specified threshold.
• Number of consecutive samples for healthy state. By default, this monitor returns a healthy state when 3 consecutive samples are under the specified threshold.
• Threshold. By default, the threshold for CPU utilization is 25%.

This monitor is disabled by default for all management servers.
Causes

Excessive CPU utilization of the various Operations Manager agent processes may indicate that the agent or one of its underlying dependencies is not operating properly. If the agent and its underlying dependencies are updated properly, then the agent is being over-utilized on the system being monitored. This may be short-lived, due to a recent update in the management group, such as the deployment of a new management pack, or this may be due to the agent truly being under excessive load, in which case tuning may be required.

Resolutions

To ensure that the agent and its underlying dependencies are operating properly, check the following:
• Verify that the most recent version of the Operations Manager agent is installed on the system.
• Verify that the patch for MSXML 6.0 provided in Knowledge Base article 968967 (http://go.microsoft.com/fwlink/?LinkId=181885) is installed.
• If the system's operating system is Windows XP, Windows 2000 Server or Windows Server 2003, ensure that the system is running Windows Script Host 5.7 or later. The following link provides the download locations for Windows Script Host 5.7: http://go.microsoft.com/fwlink/?LinkId=181884

If the condition persists after those configurations are verified, then deeper investigation is required to understand what is driving CPU utilization. Investigate further using any combination of the following steps:
• Review the recent history of agent processor utilization, workflow count, and module counts using the following view: Agent Performance View. The agent processor utilization data will give insight into whether the issue is recent or has been occurring for a longer period of time. The workflow and module count data will give an indication of the workload that the various rules, monitors, and discoveries are putting on the agent. This data should also be compared against healthy agents to use as a contrast.
• Use a tool such as the Effective Configuration Viewer (http://go.microsoft.com/fwlink/?LinkId=182300) to understand the number of class instances discovered on the agent. More class instances can lead to higher workflow and module counts, which can result in more workload.
• Using Performance Monitor, collect more detailed % Processor Time measurements from the Process object. This will give insight as to which processes are contributing the most significantly to overall processor utilization.
• Review any recent management pack updates or changes to see if they correspond with the increase in CPU utilization.

When the cause or causes are identified, any one of the following steps may be taken to address the issue:
• If a management pack change was made recently or a new management pack was deployed, monitor the situation to see if the problem continues.
• Reduce the frequency of discoveries via overrides to spread out their CPU utilization across the day. Doing this comes at the trade-off of discovery potentially taking longer to occur.
• Reduce the frequency of rules or monitors that are run on a schedule to spread their CPU utilization across the day. Doing this comes at the trade-off of monitoring.
• If the agent is managed by multiple management groups (a configuration referred to as “multi-homed”), that will contribute to higher processor utilization as well. Consider reducing the number of management groups that the agent is managed by.

If all of the steps above do not produce a solution, contact Microsoft Customer Service and Support (http://support.microsoft.com/).

Additional Information

This monitor has a related diagnostic task, “Collect agent processor utilization diagnostic”, which reruns the sampling of CPU utilization. The diagnostic task is disabled by default. There is also a task in the Operations console, “Get the agent processor utilization”, which reruns the sampling of CPU utilization.
When you run the “Get the agent processor utilization” task, you can set the time-out and number of samples parameters. The task returns a table of results.

Run the Get the 'agent processor utilization' task

Matt White ( http://systemcenterblog.hardac.co.uk/ )
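
The default behaviour described in the product knowledge can be pictured as a small state machine. What follows is a minimal Python sketch, not the management pack's actual implementation: the class name `ConsecutiveSampleMonitor`, its field names, and the choice to treat a sample exactly equal to the threshold as "under" are assumptions made for illustration. Only the default values (25%, 6 samples, 3 samples, 300 seconds) come from the text above.

```python
from dataclasses import dataclass

FREQUENCY_SECONDS = 300  # default sampling interval from the product knowledge


@dataclass
class ConsecutiveSampleMonitor:
    """Hypothetical model of a consecutive-samples threshold monitor."""
    threshold: float = 25.0    # percent CPU
    critical_samples: int = 6  # consecutive samples over threshold -> critical
    healthy_samples: int = 3   # consecutive samples under threshold -> healthy
    state: str = "healthy"
    over: int = 0              # current run of samples over the threshold
    under: int = 0             # current run of samples at or under the threshold

    def observe(self, cpu_percent: float) -> str:
        """Record one sample and return the resulting state."""
        if cpu_percent > self.threshold:
            self.over += 1
            self.under = 0
        else:
            # Assumption: a sample equal to the threshold counts as "under".
            self.under += 1
            self.over = 0
        if self.state == "healthy" and self.over >= self.critical_samples:
            self.state = "critical"
        elif self.state == "critical" and self.under >= self.healthy_samples:
            self.state = "healthy"
        return self.state


monitor = ConsecutiveSampleMonitor()
samples = [30.0] * 6 + [1.0] * 3  # six high samples, then three low ones
states = [monitor.observe(s) for s in samples]
print(states)
```

Under these assumptions, the monitor needs at least 6 x 300 s = 30 minutes above the threshold before it turns critical, and at least 3 x 300 s = 15 minutes below it before it returns to healthy.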

March 12th, 2010 11:57am

Thanks for the reply! I have looked over these items; the only one that might apply is MSXML 6.0. This is happening on a lot of servers running SQL 2008 with Windows Server 2008 R2 (not sure if that hotfix applies to those versions?). However, when I log into the server, CPU usage is at 0-1%, MonitoringHost is using 0% of CPU, and System Idle is at 98-99%. Even in this state, the alert is still open and usually stays open for at least an hour or so, and I have never seen MonitoringHost leave 0%. Thoughts?

March 15th, 2010 8:54pm

Hi,

Firstly, I would definitely make sure that all the recommended patches are installed.

Next, I would perhaps set up perfmon to log the utilisation of MonitoringHost.exe.

I would also consider enabling the diagnostic task associated with this monitor. You can do this by opening the monitor and going to the 'Diagnostic and Recovery' tab. Select the diagnostic and click Edit, then go to the Overrides tab. Set the Enabled parameter to True.

When you next get the alert, there will be some extra information on the utilisation of the processor.

Matt
Matt White ( http://systemcenterblog.hardac.co.uk/ )

March 16th, 2010 12:15am

Hey, thanks for the reply!

I confirmed this hotfix does not apply to the boxes in question.

I set up perfmon to log the utilization of MonitoringHost.exe; it sits at 0% almost all the time (occasionally increasing to 2-3%).

I also enabled the diagnostic task. The results are: Average: 0; Maximum: 0; Minimum: 0.

The monitor still frequently generates alerts, and they usually stay open for an hour. Thoughts?

Thanks!

March 16th, 2010 5:40pm

Hello,

Have you applied the CU1 update to your environment? If this still happens, then I would suggest opening a call with Microsoft. If all the prerequisites have been met, it sounds like a bug.

Cheers,
Matt
Matt White ( http://systemcenterblog.hardac.co.uk/ )

March 17th, 2010 10:44pm

Yes, I have applied CU1 to my environment. Yeah, it does seem like a bug. Is anyone from MSFT able to confirm a possible bug here?

Thanks for your help!

March 18th, 2010 10:01pm

Are these VMs? If so, it might be due to scaling of the CPU. Raising the minimum CPU power might solve it then.

Rob Korving http://jama00.wordpress.com/

March 18th, 2010 11:05pm

Yes, they are VMs. They all have a significant amount of CPU power assigned. One thought I had: what if the monitor compares the agent's percentage of usage with that of everything else on the server? Because the servers are doing so little (most of them are standby or low-usage servers), the 1-2% used by MonitoringHost.exe is probably larger than anything else on the box. Could that cause this monitor to flag? Thanks.

March 21st, 2010 8:44pm

VMs scale their CPU power. So if the CPU is at something like 100 MHz and MonitoringHost needs CPU time, it is more likely to pass the % usage threshold than when you have 3000 MHz. I've already proven this for % Interrupt Time on VMs, and I wouldn't be surprised if the same thing happened here. I'm sure you assigned a maximum, but did you assign a minimum CPU limit? If not, raise it to something like 400; that will probably do the trick.

Rob Korving http://jama00.wordpress.com/
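
The scaling argument above comes down to simple arithmetic: a fixed amount of work is a larger share of a smaller CPU allocation. The Python sketch below illustrates that hypothesis only; it does not confirm that this is what triggers the alert. The function name `percent_cpu` and the 30 MHz agent workload are hypothetical; the 100, 400 and 3000 MHz figures are the ones mentioned in the post, and 25% is the monitor's default threshold.

```python
def percent_cpu(work_mhz: float, available_mhz: float) -> float:
    """Share of the available CPU capacity used by a fixed workload, in percent."""
    return min(100.0, 100.0 * work_mhz / available_mhz)


AGENT_WORK_MHZ = 30.0  # hypothetical: cycles the agent needs, expressed in MHz
THRESHOLD = 25.0       # default monitor threshold, in percent

for available in (100.0, 400.0, 3000.0):
    pct = percent_cpu(AGENT_WORK_MHZ, available)
    verdict = "over" if pct > THRESHOLD else "under"
    print(f"{available:>6.0f} MHz available: {pct:.1f}% ({verdict} threshold)")
```

With these illustrative numbers, the same workload reads as 30% on a VM scaled down to 100 MHz, 7.5% at 400 MHz, and 1% at 3000 MHz.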

March 22nd, 2010 1:57am

Marked as Answered due to inactivity. Feel free to re-open.

Cheers,
Arie de Haan

This posting is provided "AS IS" with no guarantees, warranties, rights etc. Please remember to click Mark as Answer on the post that helps you, and to click Unmark as Answer if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.

July 14th, 2010 11:50pm

We have exactly the same issue on Windows Server 2008 R2. Can anyone help us?

Certifications: MCSA 2003|MCSE 2003|MCTS| MCTIP:SA

October 18th, 2010 8:04pm

I'm seeing this as well on 2008 R2 servers. I've been collecting the CPU usage of MonitoringHost.exe, and it spiked only once, to about 28%, and stayed there for only about 15 minutes. I do not see spikes at the other times the monitor reports. Has anyone gotten anywhere with this?
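
One rough way to check collected readings against the monitor's defaults is to look for the longest run of consecutive samples above the threshold. The Python sketch below assumes the readings were taken at the monitor's 5-minute interval; the helper `longest_run_over` and the sample values are hypothetical. It is only an approximation, because the monitor totals all agent processes and averages three measurements per sample, whereas these readings cover MonitoringHost.exe alone.

```python
from itertools import groupby


def longest_run_over(samples, threshold=25.0):
    """Length of the longest run of consecutive samples strictly above threshold."""
    runs = (
        sum(1 for _ in group)
        for is_over, group in groupby(samples, key=lambda s: s > threshold)
        if is_over
    )
    return max(runs, default=0)


# Hypothetical 5-minute readings: idle, a 15-minute spike near 28%, idle again.
readings = [1.0, 2.0, 28.0, 28.0, 28.0, 1.0, 0.0]
run = longest_run_over(readings)
print(run, run >= 6)  # 6 consecutive samples is the default for a critical state
```

A 15-minute spike covers only three 5-minute samples, which is short of the six consecutive samples the default configuration requires.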

January 17th, 2011 3:25pm

I know how it works. Why is it happening on my 2008 R2 servers and how can I stop it?

January 18th, 2011 5:08pm

Read the product knowledge. It explains why the monitor remains unhealthy for 3 x 5 = 15 minutes.

Rob Korving http://jama00.wordpress.com/

January 18th, 2011 5:13pm
