How to deal with multiple SUPs?

Hello guys! I know that the question in the thread title is a bit vague, so let me explain my issue :)

First of all, grab a pencil and a sheet of paper: I am going to introduce some concepts of our infrastructure that are necessary to understand my problem.

So let's say I have 4 different network zones (1, 2, 3, 4) where different Windows domains are hosted (A, B, C, etc.).

Without giving too much information about our infrastructure, the final picture is as follows:

Primary site (with SUP) in Network zone 1 Domain A

And I have

1 MP/DP/SUP in the network zone 2, Domain A, managing untrusted domains in zone 2

1 MP/DP/SUP in the network zone 3, Domain A, managing untrusted domains in zone 3

1 MP/DP/SUP in the network zone 4, Domain A, managing untrusted domains in zone 4

So, in summary, all SUPs are in the same Windows domain and share the same DB, while managing different untrusted Windows domains.

Everyone still here ? :)

So, with all SUP roles installed, all my clients in the untrusted domains receive the list of 4 SUPs (LocationServices.log). I've read a lot of documentation and topics, and I understood that since 2012 SP1 (my version is 2012 R2 CU3, if my memory serves), SCCM supports multiple SUPs.

My problem is that, for example, a client in a domain in network zone 2 picks the SUP in network zone 1, which is not allowed by our security policy; it should use the SUP in its own network zone (zone 2). As I understood it, that shouldn't be a big deal: after 4 unsuccessful attempts at 30-minute intervals (in other words, 2 hours), the client should rotate to another SUP. Except that after a few days it still hasn't, and the client is still trying to reach my primary site in network zone 1...

Then I came across the following article, which describes exactly my issue: http://blogs.technet.com/b/umairkhan/archive/2014/10/03/configmgr-2012-r2-multiple-sup-scenario-clients-not-failing-over-to-the-other-sup.aspx

Error 0x80072ee2 in WindowsUpdate.log etc.

So I applied the workaround by adding the error code to the "WSUS Scan Retry Error Codes" list, but unfortunately it doesn't do the trick... My client keeps trying to contact the primary site and not the SUP it is supposed to use.
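To make the expected behaviour concrete, here is a toy model of the fail-over logic as it is described in this thread and, as far as I understand it, in the linked article. It is only a sketch: the function, the SUP names and the assumption that each attempt costs one 30-minute interval are all hypothetical, and this is not the real client code.

```python
# Toy model of the SUP fail-over behaviour described in this thread.
# NOT actual ConfigMgr client code; every name here is hypothetical.

RETRY_THRESHOLD = 4          # failed scans before switching SUP (per the thread)
RETRY_INTERVAL_MINUTES = 30  # wait between scan attempts (per the thread)


def simulate_scans(sups, scan_results, retry_codes):
    """Replay scan_results and return (current SUP, minutes elapsed).

    scan_results holds one error code per scan attempt (0 means success).
    Only codes present in retry_codes count toward the fail-over threshold.
    """
    index = 0      # position of the SUP currently in use
    failures = 0   # consecutive retryable failures against that SUP
    minutes = 0
    for code in scan_results:
        # Assumption: each attempt costs one retry interval.
        minutes += RETRY_INTERVAL_MINUTES
        if code == 0:
            failures = 0
        elif code in retry_codes:
            failures += 1
            if failures >= RETRY_THRESHOLD:
                # Rotate to the next SUP in the list and start counting again.
                index = (index + 1) % len(sups)
                failures = 0
        # Codes outside retry_codes never trigger a switch in this model.
    return sups[index], minutes


if __name__ == "__main__":
    timeout = 0x80072EE2
    sups = ["sup-zone1", "sup-zone2"]  # hypothetical names
    print(simulate_scans(sups, [timeout] * 4, {timeout}))
    print(simulate_scans(sups, [timeout] * 4, set()))
```

In this model the client only leaves the first SUP when the timeout code is in the retry list, which is why the workaround matters.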

Is it clear enough ?

So my questions are quite simple ...

1. Am I doing it right ?

2. Is there a way to force the SUP through a registry hack such as for the MP (AllowedMPs) ?

Any other suggestion is welcome !

Thank you :)

May 28th, 2015 6:27am

To my knowledge, everything you've described above is accurate. The problem is simply that this is not a scenario the product was designed for.

Are you sure that the clients are receiving the error code that you've added to the error code list when they try to access an incorrect WSUS instance?

Have you verified that the clients actually get the update error code list in their policy?

Can you manipulate DNS? If so, using a bogus CName record to force an error code in the list should work. Alternatively, a CName redirecting clients to the proper WSUS instance should also work. Of course, you need to be able to control DNS separately for each of your zones to be able to make either of these work.

May 28th, 2015 9:08am

Hello Jason,

How can I be sure of the error code they receive? Looking at WindowsUpdate.log, I can find 0x80072ee2 on almost every line; is there another log file I should look into?

I just re-ran the "To verify the codes on the Client Machine" section documented in the previous link, and I can indeed see the 2147954402 error code on my client.

I have full access rights to DNS as well, but that won't work: the SUPs are in another domain and we use DNS forwarders, so if I do that I'll impact all servers, including the primary site... What do you mean by "using a bogus CName record to force an error code in the list should work"? How could I push an error code via DNS?

Since I am testing from one specific server, I'll add a few more servers to the collection to see if they encounter the issue as well. I'll tell you tomorrow whether the newly added servers rotate their SUP or not...

In the meantime, any other idea is welcome!
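For anyone comparing the two notations: 2147954402 is simply the decimal form of 0x80072ee2. A minimal sketch of the conversion, which also splits the HRESULT into its standard fields (the function name is mine, not from any tool):

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its failure flag, facility and code."""
    return {
        "hex": f"0x{value:08X}",
        "decimal": value,
        "failure": bool(value >> 31),        # top bit set means failure
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility field
        "code": value & 0xFFFF,              # low 16 bits: the original error
    }


if __name__ == "__main__":
    # The decimal value seen in the client policy and the hex value seen
    # in WindowsUpdate.log decode to the same thing.
    print(decode_hresult(2147954402))
    print(decode_hresult(int("0x80072ee2", 16)))
```

Facility 7 is FACILITY_WIN32 and code 12002 is the WinHTTP/WinINet timeout error, so this is an "operation timed out" result wrapped in an HRESULT.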

May 28th, 2015 9:28am

LocationServices.log should give details of what ConfigMgr sees when choosing a SUP.

If the value returned from DNS is bogus, it would cause a different error code instead of "operation timed out"; it'll return a DNS failure code instead. It's not DNS sending an error code, but the DNS lookup causing the failure, and thus a different error code. You could also do this using a local hosts file, but that would be painful.
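To see from a given client which of the two cases applies to a name, something like the sketch below could be used. It only tells a failed lookup apart from a successful one; it does not reproduce the Windows Update Agent error codes. The host name is a placeholder, and port 8530 is assumed only because it is the usual WSUS HTTP port.

```python
import socket


def classify_lookup(hostname, port=8530, resolver=socket.getaddrinfo):
    """Return ("dns_failure", message) or ("resolved", [addresses]).

    resolver is injectable so the function can be exercised without a
    network; by default it uses the system resolver.
    """
    try:
        infos = resolver(hostname, port)
    except socket.gaierror as exc:
        # The name did not resolve: the failure happens before any
        # connection is attempted, so it is not a connection timeout.
        return ("dns_failure", str(exc))
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    addresses = sorted({info[4][0] for info in infos})
    return ("resolved", addresses)


if __name__ == "__main__":
    # "sup.example.invalid" is a placeholder name, not a real server.
    print(classify_lookup("sup.example.invalid"))
```

If the name resolves but the address is unreachable from that zone, the client ends up with a timeout instead, which is the 0x80072ee2 case discussed above.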

May 28th, 2015 9:39am

Jason,

Got it, but since we forward DNS requests to the responsible domain, the DNS records will be bogus for the whole infrastructure, including monitoring, backup and my primary site, this isn't a solution.

Hosts file, that's an idea, but I don't want to maintain hosts files on over 5k machines. Even though it could be done through baselines, that solution isn't really sexy to me...

I'll come back tomorrow with a status for the freshly added computers to the collection.

Thanks for your time !

May 28th, 2015 10:39am

My "Fast suggestion" to your problem as I understand it.

If I read things correctly, your various machines are in untrusted domains and only have routable access to specific SUP boxes, but by default they are randomly picking from the "pool" in the site.

My answer, reading this, would be to set up public FQDN addresses for these boxes: set them all up with the exact same external FQDN (like you would if you wanted to put them behind a load balancer). In each network, simply set in DNS the single IP for the FQDN that the client should be able to resolve and connect to. Finally, if not already done, set your clients to Internet-only mode on the guests (this means introducing PKI, but with the kind of security you're playing with you probably have that kind of infrastructure in place).

That should give a simple, singular public FQDN connection point that resolves to the appropriate unique IP in each network security zone. 

Sadly, if you were running Windows vNext (not out yet, dangit!) you could have used conditions on your DNS and done it a bit more cleanly that way. That doesn't mean you will have a problem if you use a single DNS topology for all your network regions: just make said FQDN its own subzone, unique to each network (don't replicate it), so the rest can stay shared.

Basically, every domain will have a subzone called "sup.contoso.com" with a root A record of whatever. You ONLY include that FQDN as its own subzone; let the parent contoso.com continue to be forwarded appropriately. That way your DNS stays mostly clean, and you ONLY worry about the one FQDN of contention.

Do that, and now you have a simple, common name for all your SUPs that your clients can fall back onto regardless of security zone.
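As a rough illustration of the idea (not a DNS server configuration), the lookup each network would perform can be modelled like this. The zone names, the primary site name and every IP address are made-up placeholders; only sup.contoso.com comes from the suggestion above.

```python
# Toy model of "one stand-alone sup.contoso.com zone per network".
# All zone names, host names (except sup.contoso.com) and IPs are hypothetical.

# Stand-alone zone held locally by the DNS servers of each network zone.
LOCAL_ZONES = {
    "zone2": {"sup.contoso.com": "192.0.2.20"},
    "zone3": {"sup.contoso.com": "198.51.100.30"},
    "zone4": {"sup.contoso.com": "203.0.113.40"},
}

# Records answered by the domain that contoso.com is forwarded to;
# identical from every network zone.
FORWARDED_RECORDS = {
    "primary.contoso.com": "192.0.2.10",
}


def resolve(network_zone, fqdn):
    """Answer from the local stand-alone zone first, else use the forwarder."""
    local_zone = LOCAL_ZONES.get(network_zone, {})
    if fqdn in local_zone:
        return local_zone[fqdn]
    return FORWARDED_RECORDS.get(fqdn)


if __name__ == "__main__":
    for zone in ("zone2", "zone3", "zone4"):
        print(zone, resolve(zone, "sup.contoso.com"))
    print("zone2", resolve("zone2", "primary.contoso.com"))
```

The same name gives a different SUP address in each network, while every other contoso.com name still resolves through the shared forwarder.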


May 28th, 2015 2:17pm

Jason,

Got it, but since we forward DNS requests to the responsible domain, the DNS records will be bogus for the whole infrastructure, including monitoring, backup and my primary site, this isn't a solution.


You can still do both. Forward contoso.com to the responsible domain, but make a stand-alone zone called sup.contoso.com (or whatever your FQDN is) and just drop the appropriate root A record in. That's simply a unique zone per DNS topology; not quite as nice as what we'll be able to do in vNext, but way better than 5k hosts files.

May 28th, 2015 2:24pm

Dears,

It took over 12h for some clients to switch SUP, but my 6 test servers did switch in the end, and they are now talking to the one they should be... So the problem is solved. I don't know why it took so long for my first test server...

Thanks a lot for your time

May 29th, 2015 9:01am

So they are switching based on the addition of the error code?

If so, that's good to have confirmation that it works.

May 29th, 2015 9:40am
