Unstable PXE and WDS in a large environment
Hi everyone, I am at a customer site. The customer has one Central Site and 197 Secondary Sites. The Central Site runs Windows Server 2003 R2 x64 with all the latest patches, and all sites run SCCM 2007 SP2 R2. We support inland SCCM sites and SCCM sites worldwide. The inland sites run Server 2003 R2 x64 and the worldwide sites run Server 2003 R2 x86 (this may be important).
Every morning we have to check the site status of quite a lot of sites to see whether the WDS Server has stopped responding, by checking pxecontrol.log for entries containing "Error receiving replies". If we find these entries, we know for sure that the WDS Server has switched to an undefined state in which it looks as if it is running but in reality is not. The people on site then complain that pressing F12 to install a machine with an OS is not working. If you click on pending devices, after a while you get an error message saying the RPC service could not be found. If you then refresh the console, you will see that the WDS Server has stopped. Restarting the WDS Server makes the problem go away for a couple of days. This is an unbearable situation, and it is costing us many hours to keep the WDS/PXE environment up and running.
The interesting detail is that it is mostly the sites in the Arabic countries, running 32-bit Windows 2003 R2, that show this "undefined state of WDS Server". The inland sites here in Europe hardly ever have the problem, but that could also be because they run Server 2003 R2 x64.
What can we do to stabilize our WDS/PXE? Do other SCCM admins with a similar setup and number of sites have this problem as well? WDS and PXE are located on the same machine. DHCP runs on another server. All Secondary Sites are set up the same way, with these roles: DP, PMP and PSP. We opened a call with Microsoft a while back, but they only told us to wait for SCCM SP2 and R2. The situation has improved a little since then, but Microsoft has not provided a permanent solution.
Thanks for your help.
Regards, Nils
February 12th, 2010 1:21pm

This is not a fix for the problem, just for the symptom, but why not use OpsMgr to monitor the log for the error you've noted above and restart the service automatically? Or at least write a script that does the same, which you can schedule via ConfigMgr.
Jason | http://myitforum.com/cs2/blogs/jsandys | http://blogs.catapultsystems.com/jsandys/default.aspx | Twitter @JasonSandys
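As a rough sketch of the kind of scheduled script suggested here, the Python below scans a log for the phrase "Error receiving replies" and restarts the service when new matches appear. The service name `WDSServer`, the byte-offset bookkeeping, and the function names are assumptions for illustration; the path to pxecontrol.log is left as a parameter because it depends on the installation.

```python
import subprocess
from pathlib import Path

ERROR_MARKER = "Error receiving replies"  # phrase quoted in this thread
SERVICE_NAME = "WDSServer"  # assumed service name; verify with `sc query`


def count_new_errors(log_path, offset=0):
    """Return (hits, new_offset) for marker lines written after byte `offset`."""
    path = Path(log_path)
    if path.stat().st_size < offset:
        offset = 0  # log was rotated or truncated, so start from the top
    with path.open("rb") as handle:
        handle.seek(offset)
        data = handle.read()
    text = data.decode("utf-8", errors="replace")
    hits = sum(1 for line in text.splitlines() if ERROR_MARKER in line)
    return hits, offset + len(data)


def restart_wds():
    """Restart the service with the stock `net` commands (Windows only)."""
    # Stopping may fail if the service is already down, so do not raise here.
    subprocess.run(["net", "stop", SERVICE_NAME], check=False)
    subprocess.run(["net", "start", SERVICE_NAME], check=True)


def check_and_restart(log_path, offset=0, restart=restart_wds):
    """Restart WDS if new marker lines appeared; return the new offset."""
    hits, new_offset = count_new_errors(log_path, offset)
    if hits:
        restart()
    return new_offset
```

A scheduled task would call `check_and_restart` with the local path to pxecontrol.log and store the returned offset between runs, so that old entries do not trigger repeated restarts.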
February 12th, 2010 5:18pm

Morning Jason, that is exactly the next step I wanted to try. I have asked the server team whether they can set up OpsMgr to monitor pxecontrol.log for the phrase "Error receiving replies", so that we (a) get an email and (b) have the service restarted automatically, which would be a grand idea too. Hopefully the team can help us out here. I will keep everyone up to date in this thread.
Nils
February 15th, 2010 11:20am

Hi Nils, we are having the exact same issue here as well. We currently have a Central, 2 Primaries and roughly 40 or so DPs, and we are experiencing this quite often. Our Central and Primaries are running 2003 SP2 R2 and SCCM 2007 SP2. Our DPs are a mix of Server 2003 and Server 2008 (non-R2). We have been working with MS for a while, but we have yet to find a permanent solution to the problem. I like the idea of using OpsMgr to monitor the log file. I'm not familiar with that product, so I'll have to do some asking around.
Dustin
February 18th, 2010 9:11pm

Hi Dustin, I might be able to help with a link that shows a similar "trick", where OpsMgr detects a word or phrase in a log file: http://social.technet.microsoft.com/Forums/en/operationsmanagergeneral/thread/e91a3893-22ad-467f-9e7b-186ac52f4e7b
It is really sad to see that others are struggling with this instability too. As soon as you hear of or find the solution, I would love to hear it, Dustin. We have a similar case open with MS at the moment as well. Let's make it a race to see who finds the answer first.
Nils
February 22nd, 2010 1:33pm

Hi, I'm experiencing a similar issue, if not the same one, in a medium-scale environment. WDS behaves as Nils describes in the first post: no errors or warnings in wdsserver.log, the SCCM logs or Event Viewer. It just drops, and sometimes it restarts itself and sometimes not. Have you found a solution other than monitoring the service with OpsMgr and restarting it? I'm out of ideas... Thanks!
August 4th, 2010 4:52pm

Hi all, I haven't tried this myself, but this hotfix sounds like it might address your problems. If it's any consolation, I'm having a similar problem but only have 1 site with a few hundred computers.
"The Windows Deployment Service stops responding when you use a PXE service point on a computer that is running a System Center Configuration Manager 2007 SP1 or SP2 site server": this issue occurs because of a timing issue in the PXE service point. The timing issue causes a deadlock, so the Windows Deployment Service stops responding.
http://support.microsoft.com/kb/976073/
August 5th, 2010 5:01pm

Thank you for your answer. Unfortunately, I've already tried KB976073 without success... Initially I thought "this must be it!", but there was no happy response whatsoever from the server...
August 9th, 2010 5:17pm

Unfortunately, after having the patch installed for a few days we still have the same problem too.
August 11th, 2010 6:18pm

Still no luck? We have an active MS case on this issue.
September 17th, 2010 12:10pm

The issue has been resolved for us by setting Software\WOW6432Node\Microsoft\SMS\PXE\NumberOfThreads to 1.
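One way to apply such a setting consistently across many site servers is to generate a .reg file and review it before importing. The sketch below builds that text in Python. The `HKEY_LOCAL_MACHINE` hive, the DWORD value type, the key path without `WOW6432Node` for 32-bit servers, and the function name `build_reg_file` are assumptions, since the post gives only the WOW6432Node path and the value 1.

```python
def build_reg_file(threads=1, wow64=True):
    """Return .reg file text that sets NumberOfThreads under the PXE key."""
    # Key path as posted in the thread for 64-bit servers. The hive, the
    # DWORD type, and the 32-bit variant of the path are assumptions.
    node = "WOW6432Node\\" if wow64 else ""
    key = "HKEY_LOCAL_MACHINE\\Software\\" + node + "Microsoft\\SMS\\PXE"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"NumberOfThreads"=dword:{threads:08x}',
        "",
    ]
    return "\r\n".join(lines)
```

Calling `print(build_reg_file())` shows the text, which can be saved as a .reg file and checked against the registry on one test server before it is rolled out further.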
October 20th, 2010 12:01pm

Hello guys, I guess, as LudwigLoh states, that the case was resolved by setting Software\WOW6432Node\Microsoft\SMS\PXE\NumberOfThreads to 1. The problem seems to be well documented in the following Microsoft KB: http://support.microsoft.com/kb/2510665
I will evaluate and test this in my environment, including checking the IP Helpers part (Step 1 of the KB). Good luck.
Best regards, Anders Horgen
August 25th, 2011 9:44am
