Ping in Heartbeat

Hi,

I have a question regarding the SCOM (2012/2007) heartbeat mechanism.

From what I know, the SCOM Management Server gets an "I'm alive" signal from the monitored agent.

If the MS doesn't get the signal 4 times in a row, it raises the "health service down" alert and runs a ping check against the monitored agent. It then sends a message accordingly (a message if the agent is down, no message if it is up).

My question is: the one who sends the ping check is the MS, right?

If so, I have another question regarding a specific architecture (I tried to upload a photo but couldn't because my account needs to be verified):

2 Management servers

3 GW servers (GW1-GW3) that communicate with MS1

X agents under GW1

What happens when the health service of Agent1 (A1) is down and its managing server is GW1? Who sends the ping to A1: GW1 or MS1?

The way I see it - the one who sends the ping to A1 is MS1. If so, can I change it so the one who sends the ping will be GW1?

Thanks,

Yakir.

February 18th, 2015 12:23pm

Hi Yakir,

It is the agent that sends heartbeats to the MS. Only if the MS has not heard from an agent for a while does it initiate a ping test. Here is a diagram (in French, sorry, but I think it's easy to understand) :)

Julien

February 18th, 2015 1:17pm

Hi,

How Heartbeats Work in Operations Manager

https://technet.microsoft.com/en-us/library/hh212798.aspx

February 18th, 2015 1:34pm

Hi,

I know that; I've read this page. But my question still stands.

My question was regarding the situation with a gateway server in the middle. 

What happens then? Does the MS issue the ping, or the GW that manages the agent?

And can I change it?

Yakir.




  • Edited by yakirLLC Wednesday, February 18, 2015 11:39 AM
February 18th, 2015 2:36pm

It is the MS that runs the ping diagnostic regardless of whether the agent is managed by an MS or GW server.

Specifically, it is the MS in the HealthServiceWatchersGroup that does this.

You can see this in the Discovered Inventory view (target: Health Service Watcher Group) or by querying the OperationsManager database: 'Select * from MTV_HealthServiceWatchersGroup'.

February 18th, 2015 5:39pm

Thanks for the reply.

I thought so...

So I can't change it, right? I ask because my MS will not necessarily be able to ping A1.

Yakir.

February 18th, 2015 5:45pm

Operations Manager uses heartbeats to monitor communication channels between an agent and the agent's primary management server. As a result, GW1 will ping A1, whose primary management server is GW1, when A1's health service is down.
Roger
  • Proposed as answer by Patrick_Seidl Wednesday, February 18, 2015 6:39 PM
February 18th, 2015 6:21pm

Thanks again for the reply.

So I got 2 different answers - one that says the MS will ping A1 and one that says the GW1 will ping A1.

Who should I go with? Can you provide me with some documentation regarding the matter?

I'm a bit confused.

Yakir.

February 18th, 2015 8:09pm

Hi,

microsoft hopeless guy is right. A1 communicates with the Gateway server GW1 and not directly with the Management Server MS1

February 18th, 2015 8:14pm

"It is the MS that runs the ping diagnostic regardless of whether the agent is managed by an MS or GW server."


Eddie, that is not the correct information, see the others below.
February 18th, 2015 9:39pm

Hi,

So I have an answer. I've found a server (A2) which is managed by GW1 while:

1. MS1 can't ping A2

2. GW1 can ping A2

I got the "Failed to connect" message in the console about A2. 

That means that the one who pings A2 is MS1 and not GW1.

I guess that I can't change it so GW1 will ping A2.

Hope I helped. Thanks a lot for your help guys! :)

Yakir.

February 19th, 2015 11:07am

For anyone who has managed multiple gateway environments the answer is pretty obvious.

I'm surprised that certain people didn't know that, Patrick ;-)

Well you learn something every day....

February 19th, 2015 12:30pm

wooooot?! :-)

Can't imagine that I haven't seen that already. And I (my customers) do manage multiple GW-managed environments.

Will try, but I'm (nearly) convinced.

Yep... everyday something new. That makes it special ;-)

Sorry & thanks,
Patrick

February 19th, 2015 12:54pm

Update: following lab scenario: 3 MS, 1 GW; same domain, no firewall.

Agt1 managed by GW1 (Failover is MS2). Stopped Health Service, wait...

And I'm really surprised that Agt1 will be pinged by MS2 (failover MS) and not GW1 (primary MS).

:-(

Sorry, Eddie... and thanks for challenging!

/patrick

February 19th, 2015 5:04pm

Hi, this job (the ping) is run by a member of the All Management Servers Resource Pool (as there is no RMS now) via the diagnostic task Ping Computer on Heartbeat Failure, which is targeted at the Health Service Heartbeat Failure monitor.

Then

1. You should not add a gateway server to the All Management Servers Resource Pool.

2. A member of the All Management Servers Resource Pool executes a WMI query to ping the target server:

http://blogs.technet.com/b/jonathanalmquist/archive/2010/01/11/health-service-heartbeat-failure-diagnostics-and-recoveries.aspx

3. You can override Source Server in the diagnostic task Ping Computer on Heartbeat Failure to any server, but it will then be a remote call from the management server currently processing the job to the specified server, so you need the DCOM ports open and the MS action account must have administrative rights on the remote server. The default is (.), that is, localhost, which avoids a remote call.

4. If you manually run the diagnostic task Ping Computer on Heartbeat Failure, the job can even be run by a gateway server (you can see this by enabling logging in the diagnostic task), despite the fact that it is not in the All Management Servers Resource Pool.

5. The location of the RMS emulator is not taken into account.
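
To make the behaviour described in this thread concrete, here is a rough Python sketch of the decision flow. It is only a toy model, not SCOM code: the function name, the `reachable` set, the `threshold` parameter and the alert strings are all made up for illustration, and the real Source Server override additionally needs DCOM and admin rights, which the sketch ignores.

```python
from typing import List, Optional, Set, Tuple


def heartbeat_failure_alerts(
    reachable: Set[Tuple[str, str]],
    pool_member: str,
    agent: str,
    missed: int,
    threshold: int,
    source_override: Optional[str] = None,
) -> List[str]:
    """Toy model of the heartbeat-failure flow discussed in this thread.

    reachable       -- hypothetical set of (source, target) pairs that can ping
    pool_member     -- the management server from the resource pool running the job
    missed          -- consecutive heartbeats not received from the agent
    threshold       -- missed heartbeats that trigger the alert (an assumption,
                       configurable in a real deployment)
    source_override -- stands in for the Source Server override (item 3);
                       None means the default "." (the pool member itself)
    """
    if missed < threshold:
        # Heartbeats are still arriving often enough: nothing is raised.
        return []

    alerts = ["health service down"]

    # The ping is sent from the pool member unless a source is overridden.
    ping_source = source_override if source_override is not None else pool_member
    if (ping_source, agent) not in reachable:
        alerts.append("Failed to connect")
    return alerts


# Yakir's test case: GW1 can ping A2, MS1 cannot.
reachable = {("GW1", "A2")}
print(heartbeat_failure_alerts(reachable, "MS1", "A2", missed=4, threshold=4))
print(heartbeat_failure_alerts(reachable, "MS1", "A2", missed=4, threshold=4,
                               source_override="GW1"))
```

With the default source, the model reproduces what Yakir saw: the ping comes from MS1, so A2 gets the "Failed to connect" message even though GW1 can reach it. With the source overridden to GW1, only the heartbeat alert remains.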

February 20th, 2015 5:29pm

