SCCM Slow Multicast with 2008r2 and 36gb WIM Hash Errors

Hi Guys,

I have a huge problem. Here's the DP setup:

SCCM 2012 running on Windows Server 2008 R2 on ESX with a VMXNET3 10 Gb uplink

Autocast

1 Gb to the client

I'm getting 20-30% client NIC usage with a single client on the autocast. When I put 10 clients on the autocast, client NIC usage drops to 0.5-2%.

On the server we are seeing next to nothing; I think it is 1,500,000 bps in Resource Monitor.

We have all Cisco gear. We have tried all sorts of settings; they are currently set at:

ApBlockSize=1385

TpCacheSize=7550 (we have also tried 3145 and it makes no difference to the speed).

It takes ages to deploy a 36 GB WIM file, and about 40% of the time we get "Sparse file encountered" when decompressing the WIM after the file transfer, followed by a "hash check failed" in the SMSTS log file.

The Cisco guys have looked; no issues from their end...

Is it the VMXNET3 / 2008 R2 / WDS settings? Has anyone else had these horrible teething issues? It's so close to the end of the year that if I have to resort to using Ghost to deploy 4500 machines to prevent corruption, I'll go crazy.

Some people are saying this occurs with a 2008 R2 distribution point. Is this true?

I have read all of these; none of them make any difference:

http://support.microsoft.com/kb/2582106?wa=wsignin1.0

http://social.technet.microsoft.com/Forums/en-US/configmgrosd/thread/21df19b0-49bf-458e-953a-90a8712e5b50/

http://social.technet.microsoft.com/Forums/en-US/winserversetup/thread/a9e5291d-4665-4b33-9376-4fcd697f4975/

http://social.technet.microsoft.com/Forums/en-US/winserversetup/thread/8e4c2df0-23ed-4ab9-811e-f2011b5b822d/

  • Moved by arnavsharmaMVP Saturday, August 24, 2013 1:42 PM posted in SCCM 2007 forums
November 26th, 2012 4:29am

Somewhat of a side question, but why are you multicasting? Are you deploying all of the systems at the same time and are they all co-located?

Also, why do you have a 36GB WIM? That's pretty excessive in general.

November 26th, 2012 1:59pm

We have over 4500 machines to image in 3 weeks, and we would like to be able to re-image 200+ machines at a time. Our computer labs have 80 to 100 machines in each room. It would be silly to unicast that number of machines, given that we get about 6 onto a DP and it slows right down.

The WIM is so big because it is our computer programming image. It contains a lot of Java (Eclipse, NetBeans, etc. with custom packages downloaded) and a lot of custom work done with paths etc., along with the usual Visual Studio, OPNET Modeler, etc.

I have a lab with 100 machines. I have never been able to successfully deploy to 100% of the machines in the room with this image; I always get around 30-50% failure. I tried with a smaller image in 3 tests:

1st go: 9 GB WIM, slow but 100% success

2nd go: 9 GB WIM, slow but 75% success

3rd go: 9 GB WIM, slow but 96% success

I'm wondering whether solving the slowness may also resolve the corruption?

November 26th, 2012 7:03pm

Update: I have swapped to scheduled multicast and it still seems to be slow.

However, the more sessions I add, the more network utilisation goes up on the server, e.g.

1 multicast session = 1.6%

2 multicast sessions = 3.2%

3 multicast sessions = 4.8%

4 multicast sessions =  6.4%

So it seems it's not an issue with the ability to push data. Is it more a maximum speed that a single multicast session can run at?
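A quick check of the figures above is consistent with that reading. This is only a sketch using the percentages quoted in this post: dividing each total by the session count gives the same per-session share every time, which is what a per-session cap (rather than a server or network limit) would look like.

```python
# Server NIC utilisation (%) quoted above for 1-4 concurrent multicast sessions.
utilisation = {1: 1.6, 2: 3.2, 3: 4.8, 4: 6.4}


def per_session_share(samples):
    """Return the utilisation per session for each session count."""
    return {n: round(total / n, 2) for n, total in samples.items()}


shares = per_session_share(utilisation)
# If every session count gives the same share, the total scales linearly.
is_linear = len(set(shares.values())) == 1
print(shares, is_linear)
```

It does not say where the cap comes from, only that the total grows with the number of sessions rather than flattening out.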
November 27th, 2012 3:30am

OK, done some more testing.

We are on ESX 4.1, HW version 7.

Started a Ghost multicast from a physical box: got about 1.6 GB/min

Started a Ghost multicast from the server with VMXNET3: got about 0.6 GB/min

Started a Ghost multicast from the server with E1000: got about 0.8 GB/min (also tried scheduled multicast in WDS with this and got about 1.6-2.4% network use on the clients) :) We are getting somewhere.

I think it's because we are virtualising the DP that we are experiencing slowness in WDS multicast and perhaps some of the hash issues. I'll try to get a physical WDS box built and see if it makes a difference.

I'm also getting a virtual 2008 DP built just to rule out the networking changes in 2008 R2, e.g. auto-negotiate.

  • Marked as answer by Yog Li Tuesday, December 04, 2012 8:26 AM
  • Unmarked as answer by Tim Jones AUT Wednesday, December 05, 2012 4:49 AM
November 28th, 2012 7:16pm

Not surprising. VMware drivers are likely culprits in lots of networking issues.
November 29th, 2012 12:20am

OK, more testing done.

Physical vs virtual = no difference

However, multicast session size seems to affect it.

2 machines multicast: 1.6 GB/min

5 machines multicast: 900 MB/min

20 machines multicast: 400 MB/min

So I thought, what if I run 2x multicast sessions of 20 clients (I have over 100 machines in this one lab)?

I get 2x sessions at 400 MB/min each = 800 MB/min total, so the switches can take the load, but something is stopping me from pushing out more than 400 MB/min on a session with 20 clients in it.

So what about 2x multicast of 20 clients and 1x multicast of 5 clients?

I get 2x sessions at 400 MB/min each and 1x session at 900 MB/min...

All Intel i5 whiteboxes with a Cisco network. The network guys are saying that we are hardly using any of the available network bandwidth and that the server is just lazing along, even on the traffic coming out of the server, as you can see in the picture below.
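To put those session rates in the same units the network guys use, here is a rough conversion sketch. It assumes decimal units (1 GB = 1000 MB, 1 Gb link = 1000 Mbit/s) and ignores protocol overhead, so treat the percentages as ballpark only.

```python
def mb_per_min_to_mbit_per_s(mb_per_min):
    """Convert megabytes per minute to megabits per second."""
    return mb_per_min * 8 / 60


LINK_MBIT = 1000  # assumed 1 Gb client link, in Mbit/s

# Clients in the session -> observed rate in MB/min (1.6 GB/min taken as 1600 MB/min).
rates = {2: 1600, 5: 900, 20: 400}

for clients, rate in rates.items():
    mbit = mb_per_min_to_mbit_per_s(rate)
    print(f"{clients:>2} clients: {mbit:6.1f} Mbit/s = {mbit / LINK_MBIT:.1%} of a 1 Gb link")
```

On those assumptions even the 2-client session uses only about a fifth of a 1 Gb link, which fits the comment that the network is hardly being used.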

December 5th, 2012 5:00am

"Im also getting a virtual 2008 DP built just to rule out the networking changes"

Did you get a chance to try this on an earlier OS?

 

December 5th, 2012 5:31am

Yeah, I'm running a 2008 x64 DP now. I copied the registry settings from the 1 Gb profile and applied them to the 2008 R2 server, and they now run at the same speed. 2008 R2 without the registry settings is definitely slower.

We have two of the people who wrote Ghost working for us now, and they think it's a software issue with the way Microsoft has written its multicast. However, I'm not going to say that is the case until it is proven.

Tomorrow I will try:

Ghost: 20 machines, same subnet, same local switches, using Ghost-cast

SCCM: 20 machines, same subnet, same local switches (my DP is now on my laptop so I can move it)

Got the Cisco guys coming down to the lab tomorrow to take a look at the switches

December 5th, 2012 6:15am

From memory, the settings in the console for multicast don't actually work and you typically have to go directly to the registry and edit them.

Also remember that WDS automatically throttles content delivery to the slowest client participating in a session.

I suggest you open a case with CSS as that is the only way to get on their radar and either find your issue or request changes that will benefit everyone.

December 5th, 2012 3:48pm

Yes, I did have to manually change the settings in the registry. Also, the multicast addresses loaded into SCCM do not apply to WDS; we had to add those manually as well.

We don't have a support agreement so that would be $300 per hour..

Beginning tests now..

December 5th, 2012 7:05pm

To my knowledge, CSS incidents are not charged by the hour, they are per incident. Also, if you truly find a bug, you will be refunded the charged amount.
December 5th, 2012 7:12pm

Test results

Clients: 10 x Intel i5 whitebox, 1 Gb network

Server: 1 x Intel Core 2 whitebox, 1 Gb network, Server 2008, same subnet

SCCM 2012 / WDS can push 469 MB/min

Ghost can push 1043 MB/min
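For a sense of what that gap means in practice, here is a back-of-the-envelope sketch. It assumes the 36 GB WIM from the first post, decimal units (1 GB = 1000 MB), and counts transfer time only (no decompression or hash check).

```python
def transfer_minutes(size_gb, rate_mb_per_min):
    """Minutes needed to move size_gb at a steady rate_mb_per_min."""
    return size_gb * 1000 / rate_mb_per_min


WIM_GB = 36  # assumed image size, taken from the first post

wds_minutes = transfer_minutes(WIM_GB, 469)
ghost_minutes = transfer_minutes(WIM_GB, 1043)
speedup = 1043 / 469

print(f"SCCM/WDS: {wds_minutes:.1f} min, Ghost: {ghost_minutes:.1f} min, ratio {speedup:.2f}x")
```

So on these numbers Ghost is a bit over twice as fast for the same session.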

December 9th, 2012 10:47pm

Also weird: the SCCM clients are not talking back to the server, whereas all of the Ghost ones are... Thought they would have?

Oh, and all of the Ghost images were successful; 2 of the SCCM ones failed with the hash errors.

I don't care if it's slow, I just want to deploy an image successfully...

December 9th, 2012 10:51pm

OK, how about this: set up 2x 2008 R2 servers

1 as an SCCM DP

1 as a standalone WDS server

WDS standalone = 33% NIC usage on clients

SCCM DP = 4% NIC usage on the same clients

Just thought I'd post my findings for comments, and I'll keep testing.

I have also noticed that adding a client with a DH55PJ motherboard slows imaging down by 10% compared to sessions with only TG41TY and DH61BE clients.


December 17th, 2012 7:03pm

No Problems,

I have not yet had time to open a call with Microsoft; it has been really busy with BYOD, tablet strategies, Windows 8, etc.

However, I do offer an update.

We scrapped multicast for unicast this year and did over 3000 machines via unicast, and only took down the phone system once. However, I have had a chance to revisit the situation.

Updated SCCM to 2012 SP1 (obviously updating boot media to Windows 8 PE and drivers)

Updated DP to Server 2012

Just tested today.

Default settings perform the same: 1-3% network use on clients (30 in the multicast group). However, I have now applied the following in my environment.

Change number 1

I have modified this registry value on Server 2012. I'm not sure that it still applies to this version (Server 2008 forum); however, our network guys say we should set this anyway... hmmmmm

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]

Add a DWORD named IGMPVersion and set it to one of the following values:

2 = Support IGMP version 1

3 = Support IGMP version 2

4 = Support IGMP version 3 (default)

We have IGMP version 2 on our network (according to our network guys), so we have forced this to 3 (IGMP version 2).

We also set the DWORD IGMPLevel to 2 (just to force the default; we found that a setting of 1 did not allow us to multicast at all, it spent all day trying to open a connection). Found this here
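Because the IGMPVersion value is offset by one from the protocol version, it is easy to set the wrong number. The sketch below only builds the text of a .reg-style fragment for the key and values described above; it does not touch the registry, and the helper name `reg_fragment` is made up for illustration.

```python
# Registry key as quoted in this post.
TCPIP_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# IGMP protocol version -> IGMPVersion registry value, as listed above.
IGMP_VERSION_VALUES = {1: 2, 2: 3, 3: 4}


def reg_fragment(igmp_protocol_version, igmp_level=2):
    """Return .reg-style text for the given IGMP protocol version (1, 2 or 3)."""
    value = IGMP_VERSION_VALUES[igmp_protocol_version]
    lines = [
        f"[{TCPIP_KEY}]",
        f'"IGMPVersion"=dword:{value:08x}',
        f'"IGMPLevel"=dword:{igmp_level:08x}',
    ]
    return "\n".join(lines)


# Our network runs IGMP version 2, so the registry value comes out as 3.
fragment = reg_fragment(2)
print(fragment)
```

Whether these Tcpip parameters are still honoured on Server 2012 is exactly the open question mentioned above, so check the result on a test box first.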

Change number 2

Registry keys under the WDS service. I have changed these from the defaults in 2012. I don't use IPv6, so I only changed the IPv4 values:

tpExpWindowSize = 2, tpMaxWindowSize = 8

Reduced MulticastTTL to 5; this was at 32, which the network guys suggested was far too high.

Once I completed this, multicasting to the same 30 machines increased to 15% network utilization on the clients. This was a nice increase and was completely usable for us in our environment. I didn't receive any hash errors on my 47 GB WIM file when it decompressed either.

Change number 3

Registry keys under the WDS service

I increased ApBlockSize to 4500.

Speeds on a multicast session of 30 machines increased to 20% network utilization on the clients.

I then increased ApBlockSize to 7550.

Speed increased to 25% on the clients, an acceptable level for us to begin further testing of SCCM multicast.

So....

All other settings are left at default, including TpCacheSize, which is set in the 1100s on Server 2012.

Other things I have noticed: I have not received any hash errors using Server 2012 WDS multicast YET (touch wood), even with default settings. 100% success all the time. However, I have only imaged ~100 machines.

My server is on ESX 5.1 now, has a 10 Gb VMXNET3 NIC and is on a different subnet to the clients.

My clients are HP enterprise boxes with 1 Gb Intel NICs.

There is 2 Gb of bandwidth between the server room and the building I am imaging in, and 1 Gb to the floor I am imaging on.

The switch I am on has a 1 Gb uplink and 1 Gb to each workstation.

The network graph from the Cisco guys shows an average deployment speed of 300 Mb/s.

When I run 2 sessions of 30 I get 600 Mb/s in total out of the server and 300 Mb/s to each session, so I may be able to push the system even further, but for now I am happy as long as the hash errors don't return.

I'm using scheduled multicast with 30 clients or a timeout of 5 minutes. I'm keen to use autocast across our VLANs, so I will do some further testing later today.

I'm not a network engineer and I don't 100% understand the settings I have changed. I also know our MTU is 1500, so this is possibly generating lots of fragmentation, but IT WORKS for me and my 47 GB WIM file.
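On the fragmentation point, here is a rough estimate of how many IPv4 fragments each block size would turn into at an MTU of 1500. It is a sketch under stated assumptions: that ApBlockSize is the UDP payload size in bytes (not confirmed in this thread), a 20-byte IPv4 header with no options, and an 8-byte UDP header.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ipv4_fragments(block_size, mtu=1500):
    """Estimate IPv4 fragments for one UDP datagram carrying block_size bytes."""
    # Data carried per fragment must be a multiple of 8 bytes (except the last).
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((block_size + UDP_HEADER) / per_fragment)


# Block sizes tried in this thread.
for block in (1385, 4500, 7550):
    print(block, ipv4_fragments(block))
```

If the assumption holds, 1385 fits in a single packet while 7550 is split into several fragments, and losing any one fragment means the whole block has to be resent.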


May 1st, 2013 4:13am

That multicast throughput is ridiculously low. Ask your networking guys whether they run PIM-SM and not something like PIM-DM or DVMRP.
August 24th, 2013 7:25am

Tim, did you ever happen to get this worked out?

We are still working on this issue, but Microsoft did mention there is a "non-official" bug when using a virtual server, and that a WDS registry setting needs to be changed:

"TpMaxBandwidth"=dword:00000001 (changed from 100)

After making this change I saw a jump to around 12% network utilization. I am going to make some tweaks to ApBlockSizeV4 to see if I can crank out even more utilization on a virtual server. Tweaking this setting to 7550 on a physical server jumped my utilization up to 25%-30%.

-Tony

April 29th, 2015 12:54pm

This topic is archived. No further replies will be accepted.
