Agent connectivity problems
Having a heck of a time trying to resolve an agent connectivity problem. We're getting the error below on several agent machines, and the machines go into a grey state in the console. What's especially troubling is that the machines go in and out of the grey state: one minute they're green, the next they're grey and throwing the error below on the MS.

"A device at IP 10.216.128.71:3813 attempted to connect but could not be authenticated, and was rejected"

More details:
- These agents are in a domain trusted by the SCOM server domain via a forest trust. There is no MS in the remote domain, but the forest trust should allow for Kerberos authentication.
- We have dozens of other agents in this remote domain that are NOT experiencing this issue.
- The agents are guest machines in a VMware environment (but we have many other VMware guest machines without this issue).
- We've tried pointing the troublesome agents at different DNS servers (the same DNS servers the good agents use) in an attempt to bypass any name resolution issues.
- We've tried removing the agents completely and pointing them at a different management server.
- Whenever a server is in a grey state, all name resolution lookup tests are successful, both from the agent to the MS and from the MS to the agent, including the RMS server.
- We've tried hardcoding hosts file entries in an attempt to bypass DNS altogether.
- TCP port 5723 access tests are successful.
- NetBIOS tests are successful in both directions.

Several of the forum postings suggest a possible name resolution issue, but we've tried every DNS scenario we can think of. We were thinking of deploying a management server in this trusted domain and pointing the agents at it to see if that helps, but would rather not have to do this. We're really at a loss. Any help would be greatly appreciated.
October 12th, 2010 7:58pm
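
Since the agents flip between green and grey, a one-off name resolution or port test can easily land in a good moment. A minimal Python sketch of a repeated check is below: it resolves the management server name and attempts a TCP connection to port 5723 at a fixed interval, timestamping each result so failures can be lined up against the grey periods. The function names, interval, and timeout are illustrative assumptions, not anything from OpsMgr, and the check covers only name resolution and TCP reachability; it does not exercise the Kerberos authentication that the error message refers to.

```python
import socket
import time
from datetime import datetime


def check_once(host, port=5723, timeout=3.0):
    """Resolve host and attempt one TCP connection; return a result dict.

    check_once is a hypothetical helper for this sketch. 5723 is the
    agent-to-MS port mentioned in the post above.
    """
    result = {
        "time": datetime.now().isoformat(timespec="seconds"),
        "host": host,
        "port": port,
        "resolved": None,
        "connected": False,
        "error": None,
    }
    try:
        # Name resolution failures (socket.gaierror) are OSError subclasses.
        result["resolved"] = socket.gethostbyname(host)
        # Connection refusals and timeouts are OSError subclasses too.
        with socket.create_connection((result["resolved"], port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:
        result["error"] = str(exc)
    return result


def probe(host, port=5723, interval=30, count=10):
    """Run check_once `count` times, sleeping `interval` seconds between runs."""
    results = []
    for i in range(count):
        results.append(check_once(host, port))
        print(results[-1])
        if i < count - 1:
            time.sleep(interval)
    return results
```

Run from an affected agent during business hours, something like `probe("ops-pr-mgt-02.CR.LOCAL", interval=30, count=120)` would cover an hour. If every check succeeds while the agent is grey, that points away from DNS and basic reachability.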

Hi Scott, Have you tried this fix? http://support.microsoft.com/kb/981263 Any other errors in the event logs, on the agent or the MS?
http://OpsMgr.ru/
October 12th, 2010 8:24pm

When the agents get into this state, they log event 20070:

"The OpsMgr Connector connected to ops-pr-mgt-02.CR.LOCAL, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect."

Question on the hotfix: it looks like it has to be installed on both the MSs and the agents. If I install it on the MS, will it automatically get deployed to agents that were pushed? It's not clear in the hotfix doc.
October 12th, 2010 9:09pm

> If I install it on the MS, will it automatically get deployed to agents that were pushed?

No, it will not be deployed automatically. It's not an OpsMgr hotfix; it's an operating system hotfix for ESE.
http://OpsMgr.ru/
October 12th, 2010 9:14pm

Well, I thought I had some initial success. I recreated the Health Service State folder, as mentioned in the hotfix, on all the management servers and on the agents experiencing this problem. They all turned green immediately after doing so, but within 15 minutes or so I started receiving the connection errors again in the event logs on the MSs and agents, and they randomly started dropping to grey again.

I also installed the hotfix on all the MSs and on one of the agents with the problem. It didn't fix the problem on that agent, so I didn't bother installing it on any other servers. (It requires a reboot, which isn't easily scheduled on all these agent machines.)

We've removed and re-installed these agents so many times that I'm wondering if there is some type of orphaned agent reference in the database that's causing an issue?
October 13th, 2010 12:05am

Eek! Might be time to raise a ticket with MSFT on this one. I'd be very interested in the outcome. But before that, here's a stab in the dark (see bottom of post): http://www.systemcentercentral.com/tabid/60/indexId/50170/tag/Forums/Default.aspx
Cheers, John Bradshaw
October 13th, 2010 7:32pm

Thanks John. I've used the hslockdown tool before, but as your reference indicates, it's more applicable to DCs, and our machines are not DCs.

I mentioned early on that these agent machines are virtualized on VMware. These servers have had connectivity problems before under heavy load (they are SAP servers), and the VMware administrator has worked quite extensively with VMware to tweak the virtual NIC settings and to load new virtual NIC drivers (in some cases beta drivers). So we are thinking it's related to these drivers and heavy load. The machines stay in a healthy green state at night, and then start dropping into grey states during normal business hours, when utilization of the servers is at its peak.

We're expecting new drivers from VMware in a couple of weeks, and hopefully that will address the issue. I will post a follow-up when I have more information.
October 14th, 2010 11:23pm
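
The green-at-night, grey-during-the-day pattern described above can be checked against the logged events rather than by eye. A rough Python sketch, assuming the timestamps of the 20070 events have already been exported from the agent event log as text (the export step, the timestamp format, and the 08:00 to 18:00 business-hours window are all assumptions for illustration):

```python
from collections import Counter
from datetime import datetime


def events_per_hour(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count events per hour of day (0-23) from timestamp strings."""
    hours = Counter(datetime.strptime(ts, fmt).hour for ts in timestamps)
    return {h: hours.get(h, 0) for h in range(24)}


def share_in_window(per_hour, start=8, end=18):
    """Fraction of all events that fall in the hours start <= h < end."""
    total = sum(per_hour.values())
    if total == 0:
        return 0.0
    inside = sum(n for h, n in per_hour.items() if start <= h < end)
    return inside / total


# Made-up sample timestamps, for illustration only.
sample = [
    "2010-10-14 09:15:00",
    "2010-10-14 09:45:00",
    "2010-10-14 14:05:00",
    "2010-10-14 23:30:00",
]
per_hour = events_per_hour(sample)
print(share_in_window(per_hour))  # 0.75 for the sample above
```

A share close to 1.0 over several days of real events would support the load theory; an even spread across the day would suggest looking elsewhere.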

Hello Scott, What is the latest status? We'd love to hear your feedback about the solution. By sharing your experience you can help other community members facing similar problems.
Thanks,
Yog Li
-- Please remember to click Mark as Answer on the post that helps you, and to click Unmark as Answer if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
October 19th, 2010 2:05pm

Hi, no activity for a long time. Will mark as answer now. Feel free to re-open. Thanks
Anders Bengtsson | Microsoft PFE | blog at http://www.contoso.se
December 26th, 2010 12:20pm

This topic is archived. No further replies will be accepted.
