Alright everyone,
I think I'm finally starting to figure out what's happening with SCCM. At this point I have no reason to think this is a bug in SCOM; it looks like strange behavior in SCCM, probably caused by a bug in that product.
While I was working on the SCOM management pack, I also started tracking down the "Props" parameter for each of the affected servers. After searching for a way to query it, I finally got it done with the "wbemtest" tool. Once I started "wbemtest", I connected it to the "root\sms\site_s01" namespace (replace s01 with your own site code). Then I hit the "Query" button and typed in the following:
select * from sms_sci_sysresuse where itemname like '%distribution%'
I chose "distribution" because that's primarily where I have problems, so this listed all the DP and WDS servers in my environment. When I double-clicked a particular item and scrolled down to the "Props" parameter in the new window, it was indeed set to "NULL" instead of "<array of embedded objects>".
My next step was to run this query via PowerShell, so I used the "Get-WmiObject" command with a slightly more filtered query (replace "SCCMServer", "DPSERVER" and the other parameters with your own):
Get-WmiObject -computername "SCCMServer" -Namespace "root\sms\site_s01" -query "Select * from SMS_SCI_SysResUse where SiteCode='S01' and NetworkOSPath like '%DPSERVER%'" |fl NetworkOSPath,RoleName,Props
(Sorry, my programming skills aren't great, but I hope you'll understand the approach.)
This lists the "Props" assignments for each RoleName on the server you select via the NetworkOSPath parameter.
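To check every role at once instead of one server at a time, the same query can be piped through a small filter that keeps only the entries where "Props" is empty. This is just a rough sketch: Select-EmptyProps is a helper name I made up, and the server name and site code in the commented usage are the same placeholders as above.

```powershell
# Hypothetical helper (not part of SCCM): passes through only those role
# entries whose Props value is NULL or an empty array.
function Select-EmptyProps {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Role
    )
    process {
        # $null is kept on the left so an array on the right is not filtered.
        if ($null -eq $Role.Props -or $Role.Props.Count -eq 0) {
            $Role
        }
    }
}

# Example usage (placeholders: SCCMServer, site code S01):
# Get-WmiObject -ComputerName "SCCMServer" -Namespace "root\sms\site_s01" `
#     -Query "Select * from SMS_SCI_SysResUse where SiteCode='S01'" |
#     Select-EmptyProps | Format-List NetworkOSPath, RoleName
```

The filter only looks at whatever objects are piped in, so it can be tried against hand-made test objects before pointing it at a real site server.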
Anyway, I started wondering what could have left the "Props" parameters empty. Eventually I made a small modification to one of my WDS instances: I ticked the "Specify an FQDN for this site system for use on the internet" option in its site role and then un-ticked it again. This definitely triggered some changes in the SCCM database, because the next WMI query showed the "Props" parameter properly filled with the "{Server Remote Name, Server Remote Public Name, BITS download}" etc. strings, which I guess is what SCOM is looking for.
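To see which properties actually came back, the embedded objects in "Props" can be flattened into one row per property. Again only a sketch: Expand-RoleProps is a made-up helper name, and it assumes each entry in "Props" exposes PropertyName, Value, Value1 and Value2 (which, as far as I know, is how SMS_EmbeddedProperty looks).

```powershell
# Hypothetical helper (not part of SCCM): turns the Props array of each role
# entry into one output row per embedded property.
function Expand-RoleProps {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Role
    )
    process {
        # Nothing to expand when Props is NULL; move on to the next entry.
        if ($null -eq $Role.Props) { return }
        foreach ($prop in $Role.Props) {
            [pscustomobject]@{
                NetworkOSPath = $Role.NetworkOSPath
                RoleName      = $Role.RoleName
                PropertyName  = $prop.PropertyName
                Value         = $prop.Value
                Value1        = $prop.Value1
                Value2        = $prop.Value2
            }
        }
    }
}

# Example usage (placeholders: SCCMServer, S01, DPSERVER):
# Get-WmiObject -ComputerName "SCCMServer" -Namespace "root\sms\site_s01" `
#     -Query "Select * from SMS_SCI_SysResUse where NetworkOSPath like '%DPSERVER%'" |
#     Expand-RoleProps | Format-Table RoleName, PropertyName, Value, Value1, Value2
```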
After making these modifications to all my DP and WDS servers I was finally able to get SCOM querying the servers, but the adventure isn't over yet. My WDS servers lost their "Props" again, and while investigating further I figured out that this happens right after I assign a non-self-signed certificate to them. The moment I import the certificates, the "Props" parameter is reset back to "NULL".
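One way to pin down exactly which roles lose their "Props" is to take a snapshot before importing the certificate and another one after, then compare the two. The following is only a sketch of that idea: Get-PropsState and Get-LostProps are helper names I invented, and the snapshot is nothing more than the NetworkOSPath/RoleName pair plus a flag.

```powershell
# Hypothetical helper: reduces a role entry to a key and a flag that says
# whether Props is populated.
function Get-PropsState {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Role
    )
    process {
        [pscustomobject]@{
            Key      = "$($Role.NetworkOSPath)|$($Role.RoleName)"
            HasProps = ($null -ne $Role.Props -and $Role.Props.Count -gt 0)
        }
    }
}

# Hypothetical helper: returns the keys that had Props in the first snapshot
# but no longer have them in the second one.
function Get-LostProps {
    param($Before, $After)
    $afterByKey = @{}
    foreach ($state in $After) { $afterByKey[$state.Key] = $state.HasProps }
    foreach ($state in $Before) {
        if ($state.HasProps -and $afterByKey.ContainsKey($state.Key) -and -not $afterByKey[$state.Key]) {
            $state.Key
        }
    }
}

# Example usage (placeholders: SCCMServer, S01):
# $query  = "Select * from SMS_SCI_SysResUse where SiteCode='S01'"
# $before = Get-WmiObject -ComputerName "SCCMServer" -Namespace "root\sms\site_s01" -Query $query | Get-PropsState
# ... import the certificate ...
# $after  = Get-WmiObject -ComputerName "SCCMServer" -Namespace "root\sms\site_s01" -Query $query | Get-PropsState
# Get-LostProps -Before $before -After $after
```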
So all in all, SCOM is rightfully looking for a set of parameters in order to identify the individual entities of a distributed application, but SCCM keeps resetting those parameters and prevents it from doing its job. I've tested the identification with the unmodified Management Pack in SCOM, and I can confirm that it works out of the box when the "Props" parameters are not "NULL".
At the same time, I'm still not comfortable with the fact that SCOM displays nothing in the "Display Name" column for my affected servers, while it does for some others such as "WSUS" and "Reporting". The "Name" column for all affected servers shows "Microsoft.SystemCenter2012.ConfigurationManager.SiteSystemServer" regardless of whether the "Props" parameter is filled out properly or is "NULL", but I guess this is another thing to troubleshoot. Frankly, I'm not even sure what exactly I should see in those columns, but logic says I should see the server names, just as with the "WSUS" instance for example.
Anyway... For me, using SCCM with self-signed certificates is not an option, and as I mentioned, the external certificates reset "Props" immediately, which ultimately breaks SCOM monitoring for those instances.
Caution:
If you try my approach of modifying anything in the "Site System" role of a particular server, please keep in mind that it fully reset all my PXE-related settings, so I had to reconfigure those from scratch. I'm not sure why this happened, but I'd say it might be related to some sort of bug in SCCM; at least I don't think it's normal behavior. Also, when I modified my DP servers the same way, they automatically switched from HTTPS-based traffic to HTTP only. This just doesn't seem right, and I would appreciate it if someone could confirm this behavior.
Any additional thoughts / responses from Microsoft Developers would be highly appreciated.
Also, if anybody else has additional ideas for troubleshooting this whole matter further, feel free to reply.
FYI,
I'm testing this whole approach on a W2K12R2 based environment with both SCCM & SCOM patched to the latest available versions (SP1 R2 for SCCM and UR6 for SCOM). Also I'm using a Clustered SQL 2014 SP1 instance as a backend DB.
-
Proposed as answer by
RR Med
Wednesday, July 29, 2015 12:36 PM