RMS and DB server maintenance
Our RMS and DB are on separate clustered servers. The cluster team wants to implement some hardware changes on the shared SAN storage that would require the servers to be down for 4-8 hours. They want to schedule separate implementation windows on weekends, which I've nixed for now while I "evaluate".

I wanted to know what the implications are of these components being down for so long, and any workarounds I should consider. Should I recommend doing both at once to reduce the overall downtime, or will that make things worse? With SCOM being such a black box (how it actually works is rarely, if ever, explained in full detail), it's not easy to work out all the implications.

While the RMS is down:
* The SCOM console will be unavailable.
* Management servers will still insert data into the DB and alerts will still be generated.
* No notifications will go out, since this is an RMS function. Will notifications for missed alerts go out once the RMS is back online? I would think not, since they're not new.

While the DB is down:
* The SCOM console is unavailable.
* Management servers will not insert data into the DB and no alerts will be generated. However, data will queue up and be inserted when the DB is available again. How much is queued, and for how long? Is there any way I can prioritize the queues, e.g. queue alerts but drop events and perf data?
* No notifications will go out, since no new alerts will be inserted into the DB.

Anything else I'm missing, or any inaccuracies? Since the RMS is clustered and the MS are not, I can't promote an MS to RMS.

I won't be able to push this off forever unless it's "OMG don't do that", and really I don't want to. I just want to understand all the implications and see if there's anything more I can do other than back everything up and be prepared to restore/rebuild.
April 24th, 2012 2:36pm

You pretty much got it. Once you get back online you could get a "storm" of queued data; combined with config updates etc., it could make the console really slow for a few hours. There's not much you can do about the queues; all the prioritizing is done automatically. You could put all agent-managed servers into maintenance mode so all workflows stop before you take the RMS and DB offline.
Rob Korving http://jama00.wordpress.com/
April 24th, 2012 6:29pm
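
If you go the maintenance mode route suggested above, the window has to be sized before the outage starts. Below is a minimal Python sketch of that arithmetic, assuming you want the window to open a little before shutdown and to end only after the worst-case 8 hours have passed. The function name, the lead and buffer values, and the example date are all hypothetical and not from this thread; the script does not talk to SCOM at all, it only computes the two timestamps you would then enter by whatever method you normally use.

```python
from datetime import datetime, timedelta, timezone


def plan_maintenance_window(outage_start, worst_case_hours=8,
                            lead_minutes=30, buffer_minutes=60):
    """Return (start, end) for a maintenance window around an outage.

    worst_case_hours comes from the 4-8 hour estimate in the question.
    lead_minutes and buffer_minutes are illustrative assumptions.
    """
    if outage_start.tzinfo is None:
        raise ValueError("outage_start must be timezone-aware")
    if worst_case_hours <= 0:
        raise ValueError("worst_case_hours must be positive")

    # Open the window shortly before shutdown so workflows have stopped.
    start = outage_start - timedelta(minutes=lead_minutes)
    # Close it only after the worst-case outage plus some slack.
    end = outage_start + timedelta(hours=worst_case_hours,
                                   minutes=buffer_minutes)
    return start, end


if __name__ == "__main__":
    # Hypothetical outage start, used only for illustration.
    outage_start = datetime(2012, 5, 12, 8, 0, tzinfo=timezone.utc)
    start, end = plan_maintenance_window(outage_start)
    print("Maintenance window start:", start.isoformat())
    print("Maintenance window end:  ", end.isoformat())
    print("Total window length:     ", end - start)
```

Sizing the window from the worst-case estimate rather than the expected one is a deliberate choice here: a window that expires while the RMS and DB are still offline would defeat the purpose of setting it.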

Also consider that the following components will not function while the RMS is unavailable:
- All consoles (Operations, Web, Shell)
- 3rd party connectors
Regards, Mazen Ahmed
April 24th, 2012 9:23pm

An additional consideration: we will be bringing down the passive node, then the active node. Is there any reason to stop the SCOM services first? I don't really see any point.
May 10th, 2012 10:59am

So it went well. I just paused both nodes prior to shutting down. 6 hours down and everything came back up with no issues.
May 14th, 2012 1:44pm
