SCOM 2007 R2 Agents all showing not monitored
I'm brand new to SCOM, and I can't get monitoring started on any of my agents. I'm using SCOM 2007 R2. I've been able to discover agents and push an install. I've also been able to do a manual install and approve it. My RMS shows as being monitored and healthy, but all my other agents show as not monitored. When I open the "State View" for a computer and then try to run any health task, I get this error: "Health Service 'Hostname' in which the location monitoring object is contained is not available. Make sure that the computer hosting the Health Service is available and verify that the Health Service is running."
I've logged into the computer I was trying to monitor and verified that the health service is running (running as local system account).
What do I need to do to start monitoring?
June 15th, 2010 8:21pm
Hi
This might be an SPN issue - do you have any alerts in the console about SPN errors? This will affect authentication:
http://blogs.technet.com/b/kevinholman/archive/2007/12/13/system-center-operations-manager-sdk-service-failed-to-register-an-spn.aspx
On one of the agents that is not monitored, take a look at the Operations Manager event log on that server and see what the errors are. The fact that it is every agent suggests a configuration issue rather than an individual agent problem. Other things to look for are Kerberos errors in the System log on the RMS.
Good Luck
Graham
View OpsMgr tips and tricks at
http://systemcentersolutions.wordpress.com/
June 15th, 2010 8:45pm
No alerts in the console at all
June 15th, 2010 9:09pm
What errors are in the operations manager event log on the servers \ agents that are listed as not monitored?
View OpsMgr tips and tricks at
http://systemcentersolutions.wordpress.com/
June 15th, 2010 9:26pm
Firewall issue? Is port 5723 reachable? Try a telnet connection: "telnet mycomputer.mydomain.local 5723"
http://social.technet.microsoft.com/Forums/en/systemcenterrom/thread/11e3bd77-f04d-41cc-a5c5-a18cd617baae
Christopher Keyaert - My OpsMgr/SCOM blog : http://www.vnext.be
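For anyone without a telnet client handy, the same reachability test can be sketched in Python. This is a minimal sketch, not an official SCOM tool: `can_connect` is a hypothetical helper name, and it only reports whether a TCP connection to the given port can be opened. As far as I know, the agent is the side that opens the connection to the management server on 5723, so the check is most meaningful when run on the agent against the server's name.

```python
import socket


def can_connect(host: str, port: int = 5723, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; 5723 is the port discussed in this thread.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("mycomputer.mydomain.local")` with the management server's name in place of the placeholder gives a simple yes/no answer. A `False` result does not say whether a firewall, DNS, or a stopped service is the cause.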
June 15th, 2010 9:32pm
In the event log on the RMS, there are several errors similar to this:
"Rule/Monitor "Microsoft.Windows.Client.Vista.ComputerGroup.DiskTrendsComputer" running for instance "Microsoft System Center Data Warehouse" with id:"{16781F33-F72D-033C-1DF4-65A2AFF32CA3}" cannot be initialized and will not be loaded. Management group "ServersGroup""
On the client, I see these two errors:
"The OpsMgr Connector connected to Jupiter.hw.local, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect."
"OpsMgr was unable to set up a communications channel to Jupiter.hw.local and there are no failover hosts. Communication will resume when Jupiter.hw.local is available and communication from this computer is allowed."
June 15th, 2010 9:33pm
Did you check Pending Management in the Administration pane to see whether you still need to approve the agent installation?
Christopher Keyaert - My OpsMgr/SCOM blog : http://www.vnext.be
June 15th, 2010 9:42pm
Yes, everything is approved (nothing under pending management at all)
June 15th, 2010 9:59pm
I have the firewall disabled on the client, but when I try and telnet to it (telnet <clientname> 5723) I get "Could not open connection to the host, on port 5723: Connection Failed"
June 15th, 2010 10:06pm
UPDATE: I have disabled the firewall, and I can telnet to the client on just about any port except 5723.
I ran netstat and don't see anything using that port.
I stopped the health service on the client and tried telnetting to it again, but still no luck.
CORRECTION: After restarting the health service and then running NETSTAT on the client, there was an active connection to the RMS on port 5723. The agent is still showing as not monitored, with unknown version and unknown action account.
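The netstat check described above can also be scripted. The sketch below assumes the usual Windows `netstat -an` column layout (Proto, Local Address, Foreign Address, State); `connections_to_port` is a hypothetical helper name, and the function only parses text that you pass in.

```python
from typing import List


def connections_to_port(netstat_output: str, port: int = 5723) -> List[str]:
    """Return the remote endpoints of ESTABLISHED TCP rows whose remote port matches.

    Assumes rows shaped like: TCP  <local addr:port>  <remote addr:port>  <state>
    """
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # header, blank, or UDP rows without a state column
        proto, _local, remote, state = fields[:4]
        if proto.upper() != "TCP" or state.upper() != "ESTABLISHED":
            continue
        # The port is whatever follows the last colon of the remote endpoint.
        if remote.rsplit(":", 1)[-1] == str(port):
            matches.append(remote)
    return matches
```

Feed it the captured output of `netstat -an` from the agent. An empty list means no established outbound session to that port was found at the moment of capture. A non-empty list only confirms the TCP session exists, not that the agent is authorized or receiving configuration.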
June 15th, 2010 10:21pm
Do you still have the same error in the event viewer?
Christopher Keyaert - My OpsMgr/SCOM blog : http://www.vnext.be
June 15th, 2010 10:41pm
On the client, there are no errors.
On the RMS, there are 28 errors like this one, all appearing at the same time: "Rule/Monitor "Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData" running for instance "<RMS_SERVER_NAME>" with id:"{86751F4E-A9D3-A4A3-852F-56678C6C8B4F}" cannot be initialized and will not be loaded. Management group "ServersGroup""
Also, on the RMS, there are these errors:
"Local health service is not healthy. Alert flow is stalled with pending acknowledgement.
Management Group: ServersGroup
Management Group ID: addee9e3-97b7-c533-1118-4fb8ef036ad1"
"Local health service is not healthy. Entity state change flow is stalled with pending acknowledgement.
Management Group: ServersGroup
Management Group ID: addee9e3-97b7-c533-1118-4fb8ef036ad1"
"The health service {F84DB47E-5454-9F56-F6A7-C08AA0A43CFE} running on host <CLIENT FQDN> and serving management group ServersGroup with id {ADDEE9E3-97B7-C533-1118-4FB8EF036AD1} is not heartbeating."
June 15th, 2010 10:48pm
Regarding the errors, please try the following methods:
OpsMgr 2007: The Health of the Root Management Server is in a Gray “Not Monitored” State
http://blogs.technet.com/b/smsandmom/archive/2008/08/28/opsmgr-2007-the-health-of-the-root-management-server-is-in-a-gray-not-monitored-state.aspx
A computer agent unexpectedly generates heartbeat alerts after you put it into Maintenance mode in System Center Operations Manager 2007
http://support.microsoft.com/kb/942866
Health Service Heartbeat Failure, Diagnostics and Recoveries
http://blogs.technet.com/b/jonathanalmquist/archive/2010/01/11/health-service-heartbeat-failure-diagnostics-and-recoveries.aspx
Meanwhile, I would like to share the following with you for your reference:
Agent discovery and push troubleshooting in OpsMgr 2007
http://blogs.technet.com/b/kevinholman/archive/2007/12/12/agent-discovery-and-push-troubleshooting-in-opsmgr-2007.aspx
Getting headaches trying to figure out why you are seeing the 'Not Monitored' state for Management Servers or Agents?
http://blogs.technet.com/b/momteam/archive/2008/03/10/getting-headaches-trying-to-figure-out-why-you-are-seeing-the-not-monitored-state-for-management-servers-or-agents.aspx
Hope they are helpful.
Thanks.
Nicholas Li - MSFT
June 16th, 2010 12:01pm
Your core issue is that your RMS is not healthy. This isn't related to a client access issue. This is related to something wrong with the RMS.
Your health service is not healthy.
Start by doing this:
Stop the RMS Health service (System Center Management) and then rename the \Health Service State folder to \Health Service State.OLD
Start the Health service on the RMS.
This will force the RMS to generate new config. Look in the new \Health Service State\Connector Configuration Cache\<MGNAME>\ directory for the config XML file. A new one should be generated within a few minutes (longer on very large management groups).
If the file does not get created, you have an issue with your RMS not generating new config.
If the file does get created, go back to the event log on the RMS, after the HS restart, and filter on all warning, critical, and error events. You should see a clue as to what's going on.
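The rename-and-wait part of the steps above can be sketched as a small script. Assumptions are labeled loudly here: the full path to the Health Service State folder is not given in this thread, so it is passed in as a parameter; the config file is matched as any `*.xml` in the cache directory; and both function names are hypothetical. Stopping and starting the Health service is left to you (for example through the Services console) and is not done by this code.

```python
import time
from pathlib import Path
from typing import Optional


def set_aside_state_folder(state_dir: Path) -> Path:
    """Rename '<...>/Health Service State' to '<...>/Health Service State.OLD'.

    Run this only while the Health service is stopped.
    """
    backup = state_dir.with_name(state_dir.name + ".OLD")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; move it away first")
    state_dir.rename(backup)
    return backup


def wait_for_config(state_dir: Path, mg_name: str,
                    timeout_s: float = 600.0, poll_s: float = 10.0) -> Optional[Path]:
    """Poll the connector cache for a newly generated config XML file.

    Returns the first matching file, or None if nothing appears before the timeout.
    Matching on '*.xml' is an assumption; the thread does not name the file.
    """
    cache = state_dir / "Connector Configuration Cache" / mg_name
    deadline = time.monotonic() + timeout_s
    while True:
        found = sorted(cache.glob("*.xml")) if cache.is_dir() else []
        if found:
            return found[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

The intended order is: stop the service, call `set_aside_state_folder`, start the service, then call `wait_for_config` with your management group name. A `None` result corresponds to the "file does not get created" case above; the default ten-minute timeout is a guess and may be too short on large management groups.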
June 16th, 2010 10:00pm
Kevin,
Thanks, this looks like it solved my problem on my MS server:
Start by doing this:
Stop the RMS Health service (System Center Management) and then rename the \Health Service State folder to \Health Service State.OLD
Start the Health service on the RMS.
This will force the RMS to generate new config. Look in the new \Health Service State\Connector Configuration Cache\<MGNAME>\ directory for the config XML file. A new one should be generated within a few minutes (longer on very large management groups).
July 12th, 2010 1:28pm
Please check the connectivity. What kind of logs are appearing on your agent machine? Please paste those logs, along with the ones appearing on the RMS, from the SCOM event viewer. Please paste the whole description of the error event that keeps recurring.
Omkar umarani SCOM STUDENT
November 22nd, 2010 6:54pm