Monitor All Windows Services on multiple Servers and Recover
I am looking for a way to create a monitor that watches all servers for a stopped service whose startup type is Automatic. I would also like it to restart the stopped service automatically. Surely SCOM can do this; we are moving to SCOM 2007 from Argent Guardian, and Argent has this ability. I know about the Basic Service Monitor, but that would require me to create 100 monitors, which is ludicrous. I did have a script that worked in SCOM 2007 SP1 and monitored all services on all servers, but it could not handle the auto-recovery requirement. The script is below; it does not work in R2 for unknown reasons. Any help would be greatly appreciated!
OLD SCRIPT (No Longer working in R2):
' Monitor if automatic process is not running
Dim oAPI, oBag
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()
Set objFSO = CreateObject("Scripting.FileSystemObject")
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colListOfServices = objWMIService.ExecQuery _
("Select * from Win32_Service Where StartMode = 'Auto'")
For Each objService in colListOfServices
If Not objService.Started = True Then
strDesc = "Service " & objService.Caption & " is not running!"
Call oBag.AddValue("Description", strDesc)
Call oBag.AddValue("State", "BAD")
Else
Call oBag.AddValue("State", "GOOD")
End If
September 8th, 2009 8:32pm
Hi,
Have you seen the management pack template for service monitoring, at the top of the navigation tree in the Authoring workspace? If you create service monitoring with that template, it will automatically discover all machines running the service and start monitoring them. But of course, if you want to monitor all services on all machines and restart them too, a script might be easier.
Does your script work outside of OpsMgr?
--
Anders Bengtsson
Microsoft MVP - Operations Manager
www.contoso.se
September 9th, 2009 8:02am
I'm really looking for a set-it-and-forget-it scripted solution that will monitor all services on all machines and restart them. This script took care of the "monitor all services on all machines" part before we upgraded to R2. As for whether it works outside of OpsMgr, I have no idea how to test that. I didn't write this script; it was given to me by another source.
-Dan
September 9th, 2009 12:59pm
Hi Dan,
To test the script, copy it to a server, save it as test.vbs, then from a command prompt type: cscript c:\test.vbs (or wherever you saved it).
If there are no syntax errors, it should run, and in your case it should also provide some property bag output, courtesy of this bit:
Call oBag.AddValue("Description", strDesc)
Call oBag.AddValue("State", "BAD")
Else
Call oBag.AddValue("State", "GOOD")
If you want to post the script up here in full, I'll test it for you.
You're right, it is a nuisance that there isn't a catch-all "Automatic" service check, but I'm sure we can get yours working.
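For a quick check outside OpsMgr, a minimal sketch like the one below may help. It assumes the goal is only to see which automatic services WMI reports as not started, and it prints with WScript.Echo instead of using MOM.ScriptAPI, which is normally only registered on machines that have the OpsMgr agent installed. The file name test.vbs is just the example name used above.

```vbscript
' Standalone test sketch: list automatic services that are not started.
' Run with: cscript //nologo c:\test.vbs
Option Explicit
Dim objWMIService, colServices, objService

' Connect to WMI on the local machine
Set objWMIService = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\cimv2")

' Let WMI do the filtering: only automatic services that are not started
Set colServices = objWMIService.ExecQuery( _
    "SELECT * FROM Win32_Service WHERE StartMode = 'Auto' AND Started = False")

If colServices.Count = 0 Then
    WScript.Echo "All automatic services are running."
Else
    For Each objService In colServices
        ' Name is the technical (short) name, Caption is the display name
        WScript.Echo "Not running: " & objService.Name & " (" & objService.Caption & ")"
    Next
End If
```

If this prints the expected services, the WMI part is fine and any remaining problem is more likely in how the script hands its result back to OpsMgr.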
Cheers
September 9th, 2009 1:17pm
This might be possible. I would look at the event path instead of a script path. The Service Control Manager writes a Windows event to the System log when a service stops unexpectedly. That event _may_ have sufficient parameter output (the service's technical name is what you would need in an event parameter), such that you can create an alert from the event and pick up the name of the service to restart from an event parameter.
To check this, open the new (Crimson) Event Viewer, look at the logs, and then try to recreate the event by killing a service in Task Manager. Some are tricky because of the shared service host process; those probably won't respond well unless you can get the service's technical name to pass to a net start command.
You can see whether the event is parameterized by looking at the XML tab in the Event Viewer. There is a parameters block that should have some data in it.
If you see the service's technical name, you can author a rule that captures it in the alert context, and from there link a task that restarts the service.
The last trick will be linking your monitor to the appropriate class. You could use Microsoft.Windows.Server.Computer, since this will get placed on all of your agent-managed servers. Once you have the class figured out (and any discovery you need), just add a task to your custom MP that targets the same class. Give the task an overridable parameter that picks up the alert context data and references the appropriate event parameter to pass to a net start command.
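As a rough illustration of the restart step, the sketch below takes a service's technical name as its first argument and starts it through WMI rather than net start (a swapped-in technique, not the command suggested above). How the name gets from the event parameter onto the task's command line depends on how the rule and task are authored, so that part is assumed here. The script name StartService.vbs is hypothetical.

```vbscript
' Hypothetical task script: start one service by its technical name.
' Usage: cscript //nologo StartService.vbs <ServiceName>
Option Explicit
Dim strServiceName, objWMIService, objService, intResult

If WScript.Arguments.Count < 1 Then
    WScript.Echo "Usage: cscript StartService.vbs <ServiceName>"
    WScript.Quit 1
End If
strServiceName = WScript.Arguments(0)

Set objWMIService = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\cimv2")

' Get raises an error if no service with that name exists
On Error Resume Next
Set objService = objWMIService.Get("Win32_Service.Name='" & strServiceName & "'")
If Err.Number <> 0 Then
    WScript.Echo "Service not found: " & strServiceName
    WScript.Quit 2
End If
On Error GoTo 0

' Win32_Service.StartService returns 0 on success, 10 if already running
intResult = objService.StartService()
WScript.Echo "StartService returned " & intResult & " for " & strServiceName
```

Passing the technical name (for example Spooler) rather than the display name matters here, because Win32_Service is keyed on Name, not Caption.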
September 9th, 2009 3:56pm
Below is the script pulled directly from the monitor on SCOM 2007 SP1 when it was working...
' Monitor if automatic process is not running
Dim oAPI, oBag
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()
Set objFSO = CreateObject("Scripting.FileSystemObject")
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colListOfServices = objWMIService.ExecQuery _
("Select * from Win32_Service Where StartMode = 'Auto'")
For Each objService in colListOfServices
If Not objService.Started = True Then
strDesc = "Service " & objService.Caption & " is not running!"
Call oBag.AddValue("Description", strDesc)
Call oBag.AddValue("State", "BAD")
Else
Call oBag.AddValue("State", "GOOD")
End If
Next
September 9th, 2009 9:07pm
Try this:
Dim oAPI, oBag
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colListOfServices = objWMIService.ExecQuery _
("SELECT * FROM Win32_Service WHERE StartMode = 'Auto' AND Started = false")
If colListOfServices.Count > 0 Then
For Each objService in colListOfServices
If strDesc = "" Then
strDesc = objService.Caption
Else
StrDesc = StrDesc & ", " & objService.Caption
End if
Next
Call oBag.AddValue("Description", strDesc)
Call oBag.AddValue("State", "BAD")
Else
Call oBag.AddValue("State", "OK")
End if
Call oAPI.Return(oBag)
September 10th, 2009 3:31am
That worked GREAT!!! Thank You!!!
Would there be a way to add to this script the ability to "exclude" or "ignore" certain services? For example, we have a service on some of our servers that is set to Auto but does not run constantly. Is there a way to have the script ignore this service and not raise an alert?
Also (I know I'm asking a lot and I really appreciate it), how hard would it be to make this script take the stopped service and run a net start against it?
-Dan
September 10th, 2009 1:25pm
>Would there be a way to add on to this script the ability to "exclude" or "ignore" certain Services.
Yes.
Dim oAPI, oBag
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colListOfServices = objWMIService.ExecQuery _
("SELECT * FROM Win32_Service WHERE StartMode = 'Auto' AND Started = false")
If colListOfServices.Count > 0 Then
For Each objService in colListOfServices
If objService.Caption <> "Performance Logs and Alerts" AND objService.Caption <> "TypeYourService2NameHere" Then
If strDesc = "" Then
strDesc = objService.Caption
Else
StrDesc = StrDesc & ", " & objService.Caption
End if
End if
Next
If strDesc <> "" Then
Call oBag.AddValue("Description", strDesc)
Call oBag.AddValue("State", "BAD")
Else
Call oBag.AddValue("State", "OK")
End if
Else
Call oBag.AddValue("State", "OK")
End if
Call oAPI.Return(oBag)
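If the exclusion list grows beyond a couple of names, chaining AND conditions gets awkward. A possible variation, sketched below, keeps the excluded captions in a Scripting.Dictionary and looks each service up in it. The two captions are the same examples used above; everything else follows the script in this post, so treat it as an untested rearrangement rather than a drop-in replacement.

```vbscript
' Sketch: same monitor script, with the exclusions held in a dictionary.
Option Explicit
Dim oAPI, oBag, dictExclude, objWMIService, colListOfServices, objService, strDesc

Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()

' Captions to ignore. CompareMode must be set before any key is added.
Set dictExclude = CreateObject("Scripting.Dictionary")
dictExclude.CompareMode = vbTextCompare   ' case-insensitive lookups
dictExclude.Add "Performance Logs and Alerts", True
dictExclude.Add "TypeYourService2NameHere", True

Set objWMIService = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\cimv2")
Set colListOfServices = objWMIService.ExecQuery( _
    "SELECT * FROM Win32_Service WHERE StartMode = 'Auto' AND Started = False")

' Build a comma-separated list of stopped services that are not excluded
strDesc = ""
For Each objService In colListOfServices
    If Not dictExclude.Exists(objService.Caption) Then
        If strDesc <> "" Then strDesc = strDesc & ", "
        strDesc = strDesc & objService.Caption
    End If
Next

If strDesc <> "" Then
    Call oBag.AddValue("Description", strDesc)
    Call oBag.AddValue("State", "BAD")
Else
    Call oBag.AddValue("State", "OK")
End If
Call oAPI.Return(oBag)
```

Adding another exclusion is then a single dictExclude.Add line instead of another clause in the If condition.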
>Also (I know I'm asking a lot and I really appreciate it) how hard would it be to add the ability to this script to take the "Stopped" Service and run a Net Start against it?
You can call objService.StartService() on each stopped service.
For example:
If colListOfServices.Count > 0 Then
For Each objService in colListOfServices
If objService.Caption <> "Performance Logs and Alerts" AND objService.Caption <> "TypeYourService2NameHere" Then
If strDesc = "" Then
strDesc = objService.Caption
Else
StrDesc = StrDesc & ", " & objService.Caption
End if
objService.StartService()
End if
September 11th, 2009 2:55am
BTW,
I do not recommend using objService.StartService() in this script, because it changes the state (it starts the service) yet still alerts you that you have stopped services. You should write a separate script and run it as a recovery task.
September 11th, 2009 4:25am
Works great! Thanks! I did separate them and put the start-service part in the recovery... works like a charm... thanks again!
September 16th, 2009 6:46pm
Alexey,
This script works well but I cannot get the alerts to show me the services that fail.
Do you know what parameters I need to configure on the "Alert Properties" to get the alert to show me the services that failed?
Thank you
December 28th, 2009 9:17pm
$Data/Property[@Name='Description']$
December 29th, 2009 4:14am
Thank you Alexey that was exactly what I needed.
I am trying to understand how to pass the failed services to the recovery script.
Can you send me the complete vbscript that I could run for the recovery of the services?
Thank You Again.
December 29th, 2009 10:43pm
You can use the same script in recovery:
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colListOfServices = objWMIService.ExecQuery _
("SELECT * FROM Win32_Service WHERE StartMode = 'Auto' AND Started = false")
For Each objService in colListOfServices
If objService.Caption <> "Performance Logs and Alerts" AND objService.Caption <> "TypeYourService2NameHere" Then
ObjService.StartService()
End if
Next
December 31st, 2009 7:03am
Thank you very much Alexey. That worked perfectly.
January 6th, 2010 8:43pm
This posting is really useful, thanks guys. I have a question about it though.
I have implemented this unit monitor in our environment. It works perfectly and starts any stopped automatic services it finds. But it only works once. I have set it to run every 5 minutes and set the recovery to recalculate the monitor state when it finishes, but it only ever appears to run once.
If I delete and recreate the unit monitor, it works as soon as the recreated monitor is enabled, but then never works again.
March 16th, 2010 4:39pm
Also, for the alert description I'd like to include the agent name (I've managed to get it to return the service name using $Data/Property[@Name='Description']$)
I'm using SCOM 2007 R2 and I'm attempting to use -
$Target/Property[Type="Windows!Microsoft.Windows.Computer"]/DNSName$
But it errors out. Does anyone know what I should be using?
March 16th, 2010 4:48pm
Hi Kaiser1,
I was looking for the same solution that you found here (how to monitor all services on all servers and how to restart them automatically if they stop).
Although I have read the questions and answers on this page, I am not sure how to perform all the specific steps needed to implement these solutions (starting with creating a monitor that includes a recovery task, then creating all the necessary scripts, and how and where to apply them).
Now that you have deployed everything needed to get this working, could you please send me detailed step-by-step instructions for each phase of the process?
Thank you in advance.
March 16th, 2010 8:46pm
Hi Kaiser, Dan, Alexey, Sanitross, et al.,
I am a newbie to SCOM 2007 R2 and am also looking for "detailed step-by-step instructions on how to set up this particular script to monitor all Windows Server services (automatic) and alert when they are stopped."
Can someone please help?
Thanks a lot,
OB
October 19th, 2010 11:59am
October 19th, 2010 12:21pm
Alexey,
Thanks for the quick reply. I went to that link and it brought me to "How to create a Script-based 2-State Monitor."
I looked at that, and it shows how to make a basic monitor. Do you or anyone else have a complete script that I can use to
monitor all Windows services on a server (set to Automatic) that have stopped?
I know it's a lot to ask; I'm just new to this IT thing, trying to learn, and I guess I need a lot of help.
Thanks,
OB
October 19th, 2010 3:24pm
Complete script is posted here above in thread - Friday, September 11, 2009 2:55 AM.
October 19th, 2010 3:45pm
Thanks Alexey,
It looks like I am having the same issue as poster Saintross above.
It works, but it only works once. I have set it to run every 5 minutes and set the recovery to recalculate the monitor state when it finishes, but it only appears to run once.
If I delete and recreate the unit monitor, it works as soon as the recreated monitor is enabled, but then never works again.
Any ideas on how to keep this script running more than once?
Thanks
OB
October 19th, 2010 7:35pm
>but then never works again.
How many such services do you have (stopped and start type = Auto)?
>set the recovery to recalcuate the monitor state when it finishes.
If I recall correctly, script-based monitors created in the UI do not contain any on-demand modules. If so - recalculation will not
October 20th, 2010 8:15am
The monitor only working the first time fooled me for ages but then I remembered about the monitor state. Open the Health Explorer for the computer and under 'availability' is the 2-state monitor.
I found that the monitor switches to 'warning' state on first run and doesn't generate any further alerts until the state is changed back to normal. To do this, either start the service that generated the initial alert or select 'reset health' in the Health Explorer.
Hope this stops someone spending all day on this puzzle like I did. On to the next issue.....
February 23rd, 2011 5:15pm
OK, can we flip this around, so that I give you a list of services I want checked on a given server, and if one is not up, it will notify and continue down the list?
There might be two that are down, and each should generate a separate alert.
May 17th, 2011 8:19pm
Hi,
This script (and the monitor) was created a long time ago, before OpsMgr 2007 R2 (and its service monitoring template) had been released. Why don't you use the service monitoring template?
>There might be two that are down and both should generate a seperate alert .....
You must use more advanced techniques for that, such as custom class creation and discovery.
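For the simpler half of that request, checking a fixed list of services rather than everything set to Automatic, a sketch along the following lines might be a starting point. It still returns a single property bag, so it raises one alert naming every listed service that is down; separate alerts per service would need the custom class and discovery mentioned above. The service names Spooler and W32Time are only examples of technical names.

```vbscript
' Sketch: check a fixed list of services and report every one that is down.
Option Explicit
Dim oAPI, oBag, arrWatch, objWMIService, colMatch, objService, strName, strDown

Set oAPI = CreateObject("MOM.ScriptAPI")
Set oBag = oAPI.CreatePropertyBag()

' Technical (short) service names to watch; replace with your own list
arrWatch = Array("Spooler", "W32Time")

Set objWMIService = GetObject("winmgmts:{impersonationLevel=impersonate}!\\.\root\cimv2")

strDown = ""
For Each strName In arrWatch
    Set colMatch = objWMIService.ExecQuery( _
        "SELECT * FROM Win32_Service WHERE Name = '" & strName & "'")
    If colMatch.Count = 0 Then
        ' The service is not installed on this machine
        strDown = strDown & strName & " (not installed), "
    Else
        For Each objService In colMatch
            If Not objService.Started Then strDown = strDown & strName & ", "
        Next
    End If
Next

If strDown <> "" Then
    ' Trim the trailing ", " before returning the list
    Call oBag.AddValue("Description", Left(strDown, Len(strDown) - 2))
    Call oBag.AddValue("State", "BAD")
Else
    Call oBag.AddValue("State", "OK")
End If
Call oAPI.Return(oBag)
```

The loop does not stop at the first failure, so two stopped services both show up in the Description value of the same alert.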
May 18th, 2011 3:44am
Alexey,
Is there a list of the parameters I can use for services set to Automatic?
I am trying to manipulate the "Service or Driver failed" alert so that it only keys off automatic services.
I found an old document for 2005 (http://technet.microsoft.com/en-us/library/cc180325.aspx) saying to use Parameter 10 equal to 2.
That document gives the source as MOM, but what if the alert comes from the application log? Will this parameter still work? Also, how do I find these out on my own? What is the best way for me to figure them out in the future?
December 23rd, 2011 6:42pm
Hi,
>I am trying to manipulate the Service or Driver failed alert to only key off of automatic services.
Can you explain in more detail what exactly you want to achieve? If you want to 'tune' this rule to 'disable' alerts from services that are not set to start automatically, I do not believe that is possible with this rule. The MOM 2005 link is not relevant here; this rule works in a different way.
If you want to disable this rule and write your own, you could start by analysing the events that this rule uses for alert generation (you can find them in the knowledge base article for this rule). Again, I do not believe you can do it with one rule, because many of these events do not contain any information about a service's start type.
>what is the best way for me to figure these out in the future?
Hmm... the best way (IMO) is to learn how to develop management packs (or at least how to unseal and read them, and understand how a particular rule\monitor\etc. works)...
December 26th, 2011 5:11am
Hi, I have used this script in the recovery, but the services are not started.
March 26th, 2015 1:27am