SCOM 2007 R2 Recalculate health - View live data for a monitor, update monitors within health explorer
I have an issue with SCOM: I cannot figure out the best way to view live health for a monitor. I will use logical disk free space as an example. Say I get an alert that the C drive on a server has gone too low, and the alert was generated by Logical Disk Free Space for Windows Server 2003 Logical Disk. If I go into Health Explorer and look at the state change events, I can see the alert, and it shows me pctfree and mbfree, among other properties. Now let's say pctfree was 10 percent and mbfree was 1500 MB. Over the next ten minutes more files are added to this drive, before I have had a chance to look into it. By the time I log in to the server and look at the disk in My Computer, free space is down to 5 percent and 750 MB. I now go to the SCOM console, and it still shows the last alert, stating 10 percent and 1500 MB free.
Recalculate Health is doing nothing for me in this situation. I recalculate the health of the monitor, but SCOM still only shows the data from the alert (10 percent free, 1500 MB free). I want SCOM to actually update this value when I click Recalculate Health, or else what is the point of even having this option?
Other threads mentioned placing the server into maintenance mode and then taking it out, as this will force it to recalculate all monitors. I want to find out if there is an alternative to this. The reason I don't like it is that it resets the monitors for everything on the server, and I gather some monitors' schedules cannot be edited easily, for example disk fragmentation levels. I read that this monitor is hard coded to run its check once every 7 days.
If I place the server into maintenance mode, it clears out the status for this monitor and states: "This monitor has been initialized for the first time or it has exited maintenance mode". The way I understand this, the monitor is now set to success and will not recalculate itself again until its schedule comes around, which in the case of some monitors is at the discretion of the creator of the management pack. All in all, I want the most control over getting the most current data, and I do not want to use maintenance mode if I don't have to, for the reasons listed above: it will clear out the warning statuses for some of the monitors. I have been told about the Visio add-in, which allows you to refresh current data, but I think this has to be an ability built into SCOM. If anyone can assist, or explain how they handle these issues, I would greatly appreciate it.
November 9th, 2010 4:27pm
There is a relationship between the monitor health and the underlying elements of the workflow. If the health of the monitor is recalculable (e.g. it has a non-time-point-oriented data source AND the author has marked the monitor as supporting recalculate), then it is possible via Recalculate Health to actually update the state based on the underlying counter data. BUT even with a recalculate, the text of the original alert will NOT be redone, and a new alert will not be generated with new values in the alert fields. You may want to reset the monitor health and then choose Recalculate Health. This should (for counter data where it is possible) cause the workflow to evaluate its condition right away. If there is an alert from the old health state still open, you will not see the new alert until you close the alert and reset the health state of the monitor.
Microsoft Corporation
November 9th, 2010 10:28pm
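The behavior described in the reply above can be illustrated with a toy model. This is a minimal Python sketch, not SCOM code: the class names, the 15 percent threshold, and the assumption that resetting the monitor also closes its alert are all hypothetical, chosen only to mirror the description (an open alert keeps its original text; a new alert with current values appears only after the old alert is closed and the health state is reset).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyAlert:
    text: str
    is_open: bool = True


@dataclass
class ToyMonitor:
    # Hypothetical stand-in for a monitor; not a real SCOM object.
    threshold_pct: int
    supports_recalculate: bool = True
    state: str = "Healthy"
    alert: Optional[ToyAlert] = None

    def _evaluate(self, pct_free: int) -> None:
        self.state = "Critical" if pct_free < self.threshold_pct else "Healthy"
        no_open_alert = self.alert is None or not self.alert.is_open
        # A new alert is raised only when no alert is already open,
        # so the text of an open alert is never rewritten.
        if self.state == "Critical" and no_open_alert:
            self.alert = ToyAlert(text=f"pct free = {pct_free}")

    def scheduled_check(self, pct_free: int) -> None:
        self._evaluate(pct_free)

    def recalculate(self, pct_free: int) -> bool:
        # Does nothing unless the (hypothetical) author enabled recalculate.
        if not self.supports_recalculate:
            return False
        self._evaluate(pct_free)
        return True

    def reset(self) -> None:
        # Assumption: resetting health also closes the existing alert.
        self.state = "Healthy"
        if self.alert is not None:
            self.alert.is_open = False


monitor = ToyMonitor(threshold_pct=15)   # threshold value is an assumption
monitor.scheduled_check(10)              # original alert raised at 10 percent
monitor.recalculate(5)                   # state re-evaluated, alert text untouched
text_after_recalculate = monitor.alert.text

monitor.reset()                          # close the old alert, reset health
monitor.recalculate(5)                   # a fresh alert carries the current value
text_after_reset = monitor.alert.text
```

In this model the first recalculate leaves the alert text at the 10 percent value, and only the reset followed by a recalculate produces an alert with the 5 percent value.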
Thank you so much for your reply, Dan! What you said does make sense. The only thing I will add is that if the monitor is recalculable, you also need to be aware of the interval value for the monitor. In my test case, I tested against logical disk free space, which I had configured with an interval of 1800 seconds, or half an hour. Following your instructions, I reset the monitor, which in turn closed the alert. I then recalculated health, but it did not update. I then went to the overrides option for this particular disk and changed the interval value to 10 seconds. After about 30 seconds, when I recalculated health, I actually got the updated value! While this is a little more complex than I would like it to be, it does provide me with what I am looking for. Thanks again for your help!
Ralph Kyttle
November 11th, 2010 11:59am
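The effect of the interval can be sketched with a small Python model. It assumes, as a simplification consistent with the observation above but not taken from SCOM documentation, that the monitor only ever sees the value from the most recent scheduled sample. The function names and the linear disk-usage curve are hypothetical; the numbers (1500 MB dropping to 750 MB over ten minutes, intervals of 1800 and 10 seconds) come from the thread.

```python
def last_sample_time(now_s: int, interval_s: int, start_s: int = 0) -> int:
    """Time of the most recent scheduled sample at or before now_s."""
    return start_s + ((now_s - start_s) // interval_s) * interval_s


def free_mb(t_s: int) -> float:
    """Hypothetical disk: 1500 MB free at t=0, falling linearly to 750 MB at t=600."""
    return 1500 - 1.25 * t_s


def value_seen(now_s: int, interval_s: int) -> float:
    """Free space as of the last sample, i.e. what the monitor would report."""
    return free_mb(last_sample_time(now_s, interval_s))


ten_minutes = 600
stale = value_seen(ten_minutes, interval_s=1800)   # last sample was at t=0
fresh = value_seen(ten_minutes, interval_s=10)     # last sample was at t=600
```

With the 1800 second interval the value seen after ten minutes is still the one sampled at the start, while the 10 second interval reflects the current free space, which matches the difference observed before and after the override.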