Subscriptions Stop Working When OpsDB Cluster Is Failed Over
Hello again. Several times now we have had an issue where, after the OpsDB cluster is failed over, subscriptions will not work until the RMS cluster is also failed over.
It just happened again yesterday following patching. The OpsDB cluster was failed over and back, and console access was fine: alerts were coming and going, reports could be run, etc. However, NO subscriptions were working, neither SMTP nor command channel.
Restarting the Health Service on the RMS did not help. Then I remembered that I'd had the issue before, and I failed the RMS cluster over to the other node. Sure enough, within a few minutes of doing that, all the notifications that were supposed to have been sent during the day fired off.
There are no errors or warnings in the event log on the RMS. There is no indication that subscriptions aren't working, other than people complaining about not getting emails and alerts not being forwarded to TEC (kind of important).
I suppose a failover is not necessary and maybe just a restart of the SDK service would work, but I would like to know if anyone else has seen this or can duplicate it. Why would this happen when all other console functions seem to be working following a failover?
Thank you, Layne
March 17th, 2010 12:26pm

I have seen this many times. When maintenance is performed on the OpsDB, including a cluster failover, it creates an outage (however small) for SQL. Most of the time the RMS recovers and this isn't an issue. However, I have seen many times where the RMS services need to be bounced (in a cluster, taken offline and back online) in order to re-establish full connectivity for all workflows. When you plan maintenance for the OpsDB, you should also plan to bounce the services of the applications that depend on the DB.
April 7th, 2010 12:54am

Agreed. Unfortunately it doesn't help if the cluster happens to fail over due to an unplanned incident, but at least I now know of a workaround if/when the subscriptions stop working. Thanks.
Layne
April 7th, 2010 11:17am

Does anyone know if this issue has been resolved in SCOM 2012?
July 24th, 2012 9:27am

It has, just as it was for OM 2007 R2 with CU4: http://blogs.technet.com/b/kevinholman/archive/2011/02/07/a-new-feature-in-r2-cu4-reconnecting-to-sql-server-after-a-sql-outage.aspx On either version, you must create the registry settings and values to change the default behavior of connecting/reconnecting to the database.
Kevin Holman http://blogs.technet.com/b/kevinholman
July 24th, 2012 9:32am
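
For anyone landing here later, a rough sketch of the registry change the reply above refers to. The thread itself does not name the key or values; the key path and the two value names below (DALInitiateClearPool, DALInitiateClearPoolSeconds) are what I recall from the linked post, so treat them as assumptions and check them against that post before applying anything. The helper name dal_reconnect_commands is made up for illustration. It only builds the reg.exe command lines and does not touch the registry.

```python
# Sketch only. The key path and value names are ASSUMPTIONS recalled from the
# linked blog post; they are not stated in this thread. Verify before use.
DAL_KEY = r"HKLM\SOFTWARE\Microsoft\System Center\2010\Common\DAL"


def dal_reconnect_commands(clear_pool_seconds=60):
    """Build reg.exe command lines for the two assumed DWORD values."""
    values = {
        # 1 is assumed to turn the reconnect behavior on
        "DALInitiateClearPool": 1,
        # assumed to be the interval, in seconds, between reconnect attempts
        "DALInitiateClearPoolSeconds": clear_pool_seconds,
    }
    return [
        f'reg add "{DAL_KEY}" /v {name} /t REG_DWORD /d {data} /f'
        for name, data in values.items()
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for command in dal_reconnect_commands():
        print(command)
```

The printed lines would be run from an elevated prompt on the RMS (or on each management server in 2012). Printing rather than executing is deliberate, so the values can be compared with the blog post first.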
