Hyper-V over SMB 3.0: poor performance on 1Gb NICs without RDMA

This is a bit of a repost, as the last time I tried to troubleshoot this my question got hijacked by people spamming alternative solutions (StarWind).

For my own reasons I am currently evaluating Hyper-V over SMB with a view to designing our new production cluster based on this technology.  Given our budget and resources a SoFS makes perfect sense.

The problem I have is that in all my testing, as soon as I host a VM's files on an SMB 3.0 server (SoFS or standalone), I am not getting the performance I should over the network.

My testing so far:

  • 4 different decent-spec machines with 4-8GB RAM and dual/quad-core CPUs.
  • Test machines are mostly Server 2012 R2, with one Windows 8.1 Hyper-V host thrown in for extra measure.
  • Storage is a variety of HDs and SSDs, easily capable of handling >100MB/s of traffic and 5k+ IOPS.
  • Have tested storage configurations as standalone and as storage spaces (mirrored, spanned and with tiering).
  • All storage is performing as expected in each configuration.
  • Multiple 1Gb NICs from Broadcom, Intel and Atheros.  The Broadcoms are server-grade dual-port adapters.
  • Switching has been a combination of HP E5400zl, HP 2810 and even direct connection with crossover cables.
  • Have tried standalone NICs, teamed NICs and even storage through the Hyper-V extensible switch.
  • File copies between machines will easily max out 1Gb in any direction.
  • VMs hosted locally show internal benchmark performance at roughly 90% of the underlying storage performance.
  • Tested with dynamic and fixed VHDXs.
  • NICs have been tested with RSS and TCP offload enabled/disabled in various combinations.

Whenever I host VM files on a different server from where it is running, I observe the following:

  • Write speeds within the VM to any attached VHDs are severely affected and run at around 30-50% of 1Gb.
  • Read speeds are not as badly affected but only just manage to hit 70% of 1Gb.
  • Random IOPS are not noticeably affected.
  • Running multiple tests at the same time over the same 1Gb links results in the same total throughput.
  • The same results are observed no matter which machine hosts the VM or the VHDX files.
  • Any host involved in a test shows a healthy amount of CPU time allocated to hardware interrupts.  On a 6-core 3.8GHz CPU this is around 5% of total; on the slowest machine (dual-core 2.4GHz) it is roughly 30% of CPU load.

Things I have yet to test:

  • Gen 1 VMs
  • VMs running anything other than Server 2012 R2
  • Running the tests on actual server hardware (hard, as most of ours are in production use)

Is there a default QoS or IOPS limit when SMB detects Hyper-V traffic?  I just can't wrap my head around how all the tests hit an identical bottleneck as soon as the storage traffic goes over SMB.
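A quick way to rule out the two explicit limits I know of (a sketch, assuming Server 2012 R2; Get-SmbBandwidthLimit only exists once the optional FS-SMBBW feature is installed, and the IOPS properties are the 2012 R2 per-VHD Storage QoS settings):

  # List any configured SMB bandwidth caps (cmdlet is only present with the FS-SMBBW feature installed)
  Get-SmbBandwidthLimit

  # Check whether any per-VHD Storage QoS caps are set on the Hyper-V host
  Get-VM | Get-VMHardDiskDrive |
      Format-Table VMName, Path, MinimumIOPS, MaximumIOPS -AutoSize

If both come back empty, no explicit throttle is configured.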

What else should I be looking for? There must be something obvious that I am overlooking!

 


December 19th, 2013 2:03pm

Hi, sorry, I don't understand what you mean by this, can you expand:

Whenever I host VM files on a different server from where it is running, I observe the following:

I presume you have configured the SOFS cluster with CSV volumes, and have CSV caching enabled.

The Hyper-V role must not be installed on a SOFS cluster. You need RSS enabled; check the BIOS settings of the NIC to make sure it's in advanced mode. Normally RSS is limited to 4 queues, but advanced mode enables 16.

By nature of a SOFS, reads are really good, but there is no write cache. SOFS only seems to perform well with disk mirroring; this improves the write performance and redundancy but halves your disk capacity.

PowerShell is your friend: do some very large file copies, and use Get-SmbMultichannelConnection and the other Get-Smb* cmdlets to make sure your traffic is actually using SMB 3.0 and that you have an SMB connection to your cluster.
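For example (a minimal sketch; these are the stock Server 2012 R2 SmbShare cmdlets, run from the machine acting as the SMB client, i.e. the Hyper-V host, while the copy or VM I/O is in flight):

  # Negotiated dialect per connection - anything below 3.00 means no multichannel
  Get-SmbConnection | Format-Table ServerName, ShareName, Dialect, NumOpens -AutoSize

  # One row per established channel; the default output includes the RSS/RDMA capability columns
  Get-SmbMultichannelConnection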

Cheers

Mark


December 19th, 2013 6:13pm

Hi Mark,

Host A has local storage made up of 2 SSDs and 2 HDDs.  I create a share on Host A.  Host B is running Hyper-V and is then configured with a running VM whose VHDX files are stored on Host A.  Internal disk benchmarks within that VM show poor sequential reads and writes that generate at most 800Mbps of network traffic.

I have yet to test hosting on a SoFS, as I assumed there would be no major performance difference between hosting Hyper-V files on a SoFS and on a standalone file server (other than the lack of HA)?

Is it the case that the way a SoFS handles Hyper-V storage traffic is significantly different from a standalone file server, and that's why I am losing the performance?  That would be really counter-intuitive!  Could it be that, through lack of a write cache, the read and write IOs on a standalone server are experiencing extra delays on file operations, which could explain the loss of performance?

If the Hyper-V server tries to write to a VHDX, the write operation is sent to the file server; the file server then has to commit the write to disk before it can report back to the Hyper-V server that the write has succeeded?  Hence all IO is effectively uncached and therefore really poor?

 

December 19th, 2013 6:31pm

Hi, double-check you actually have an SMB 3.0 connection between the servers.

Run Get-SmbClientNetworkInterface and make sure your NICs are RSS capable.

Running Get-SmbConnection will show you the actual SMB connections, if any.

SOFS provides a massive difference, not just HA: you can define shares designed to host Hyper-V traffic, and enable CSV caching, which uses the system RAM of the hosts to boost read performance.

From the performance figures you have quoted, it almost sounds like you are only getting normal SMB 2.0 traffic.

Get-SmbOpenFile will show you any open virtual server files using SMB 3.0.
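For example (a sketch, run on the file server; any VHDX handle showing up here confirms the VM's disks really are being accessed over SMB):

  # List open VHD/VHDX handles and which Hyper-V host / account holds them
  Get-SmbOpenFile |
      Where-Object { $_.Path -match '\.a?vhdx?$' } |
      Format-Table ClientComputerName, ClientUserName, Path -AutoSize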

thanks

Mark

December 19th, 2013 6:46pm

To be fair, if you have a single VM with a VHDX sitting on an SSD on another server, the performance should be pretty good with or without cache.

Hence I would check which version of SMB the traffic is using.

This command shows my server is using SMB 3.02 on this connection.

December 19th, 2013 6:56pm

Hmmm,

So I tested a Gen 1 machine; interestingly it was marginally faster.  Getting nearly 80% of 1Gb now!

Regardless, I checked with the PowerShell commands and confirmed that all the connections are running at SMB 3.02.

Only one of the test servers has RSS-capable NICs, but I cannot see any noticeable difference between using RSS and not on those adapters.
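For anyone following along, RSS state can be checked and toggled per adapter like this (a sketch using the NetAdapter module cmdlets; the adapter name is a placeholder):

  # RSS capability and state per physical NIC
  Get-NetAdapterRss | Format-Table Name, Enabled, NumberOfReceiveQueues -AutoSize

  # Enable RSS on a specific adapter
  Enable-NetAdapterRss -Name "Ethernet 2"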

I'm still curious why all servers involved in the test show such high system and hardware interrupt load when doing storage operations.

Just for a visual aid, here is a benchmark run locally on the file server (note this is a 1TB mirrored storage space consisting of a 100GB SSD tier and a 900GB HD tier).  The performance is lovely! :)

And here is the same benchmark performed inside the VM running on Host B, with its files stored on that same storage space on the file server:

The IOPS I can live with, as that is not too bad a loss if we have 3x host servers pointing to the same SoFS.  But the sequential figures, that's a serious hit!

Plan for tomorrow is to add a second NIC to Host B and test with 2x 1Gb NICs and see what happens (the file server has three 1Gb NICs to play with atm).

December 20th, 2013 1:39am

@VR38DETT - Thanks for the lecture in storage and RAID basics!  I think it's safe to say that both Mark and I understand the basics of RAID and mirroring.  I would appreciate it if you could be more constructive and polite in your responses.  Mark has been very helpful so far in offering constructive and helpful ideas for how to move forward.  All you have done is find petty faults with some of our comments to flame and argue against.

To answer your question:

If I do a standard file copy over 1Gb I will usually see ~100MB/s when the adapter is at 100% load.  iSCSI is no doubt more efficient than SMB at the same line speed and will result in higher data throughput.

The problem is I am not seeing full line speed, so this comparison does not apply.  If a simple test like this cannot utilize a single 1Gb NIC effectively, the physical network transport is clearly not the bottleneck.  Adding more NICs or 10Gb NICs is not going to speed this up.

Your comment on multichannel and SoFS is barely readable as English.  Perhaps you should take a little time to read over your responses before posting.  

I can work out that you are suggesting two servers are better than one in a SoFS scenario; however, having extra file server nodes will not improve access speeds for a single VHDX file served to a single Hyper-V host, as only one SoFS node can serve the VHDX file at a time.  If I was running lots of Hyper-V hosts and VMs then yes, having an extra file server would help massively.

My next step is to add two more 1Gb adapters to the Hyper-V host so there is 3x 1Gb of bandwidth available between the servers, then see how this affects the storage throughput.

If all else fails then the next step is to build an actual cluster to test the SoFS specifically and see if it has any extra magic that will use the bandwidth better.  This will have to wait till after Christmas as I will need to set up all sorts of extra stuff to get this running, not least a StarWind SAN for temporary shared storage :)



  • Edited by Dan Kingdon Thursday, December 19, 2013 11:41 PM
December 20th, 2013 2:39am

One of the tests I did was to build two Hyper-V guests and transfer large files between them. On our 10Gb network I was seeing about 800MB/sec, which is about what I expected.

Over a 1Gb connection you will only see 80-90MB/sec.

RSS should provide much better performance, as it allows SMB Multichannel to open multiple connections.

There is a great blog, if you haven't already seen it; this guy is the oracle for SOFS and SMB 3.0.

I have found some great stuff on there which has helped me greatly, with lots of performance tuning information:

http://blogs.technet.com/b/josebda/

Cheers

Mark

December 20th, 2013 12:27pm

[ ... ]

By nature of a SOFS, reads are really good, but there is no write cache. SOFS only seems to perform well with disk mirroring; this improves the write performance and redundancy but halves your disk capacity.

[ ... ]


Mirror (RAID1 or RAID10) actually REDUCES the number of write IOPS. On reads every spindle takes part in I/O request processing (assuming the I/O is big enough to cover the stripe), so you multiply IOPS and MB/s by the number of spindles you have in the RAID group; but every write has to go to the duplicated locations. That's why READS are fast and WRITES are slow (roughly 1/2 of the read performance). This is an absolutely basic thing, and an SoFS layered on top can do nothing to change it.
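To put rough numbers on it (purely illustrative figures, not measurements):

  # Back-of-the-envelope for an 8-spindle RAID10 built from ~150 IOPS drives
  $spindles = 8; $iopsPerDisk = 150
  "Read IOPS  ~ {0}" -f ($spindles * $iopsPerDisk)        # reads can hit either copy -> ~1200
  "Write IOPS ~ {0}" -f (($spindles / 2) * $iopsPerDisk)  # each write hits two disks -> ~600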

December 20th, 2013 12:43pm

So probably the last update before we finish here for this year.  

I've just run another test with both the file server and the Hyper-V host having dual NICs each.  Verified that SMB 3.02 was working and that the expected two multichannel connections were found and working.

This is a snapshot of the load during a disk mark sequential read/write

On the Hyper-V host:

And on the file server:

December 20th, 2013 2:47pm

And here is the resulting benchmark:

  1. Note that the combined network throughput is yet again only totalling around 80% of a single 1Gb (2x 40% in this case with multichannel), so clearly the network is not the bottleneck.
  2. The VM benchmark is almost identical, yet again backing up our assumption that the network is not the bottleneck.
  3. The CPU load on the file server is almost twice what it was with a single NIC (I would put this down to the extra overhead of multichannel?).  I know at 58% this is pretty damn high; I will have to find a more powerful machine to swap this out with and try again to make sure the CPU is not the bottleneck.
December 20th, 2013 2:50pm

I've just swapped the file server onto a newer machine with a faster CPU (6 cores @ 3.8GHz); same results with the benchmarks.

And for final completeness, here is a file copy from the Hyper-V "host" to the file server:

Have a good Christmas all, I'm off for drinks with colleagues.

December 20th, 2013 3:37pm

[ ... ]

You: By nature of a SOFS, reads are really good, but there is no write cache. SOFS only seems to perform well with disk mirroring; this improves the write performance and redundancy but halves your disk capacity.

[ ... ]

Me: Mirror (RAID1 or RAID10) actually REDUCES the number of write IOPS. On reads every spindle takes part in I/O request processing (assuming the I/O is big enough to cover the stripe), so you multiply IOPS and MB/s by the number of spindles you have in the RAID group; but every write has to go to the duplicated locations. That's why READS are fast and WRITES are slow (roughly 1/2 of the read performance). This is an absolutely basic thing, and an SoFS layered on top can do nothing to change it.

[ ... ]

You: Not wanting to put the cat amongst the pigeons, but this isn't strictly true. RAID 1 and 10 give you the best IOPS performance of any RAID group, which is why all the best-performing SQL clusters use RAID 10 for most of their storage requirements.

[ ... ]

In your very first post you said mirroring improves write performance. I have only one question: HOW?

December 21st, 2013 1:42am

Hi,

Just want to confirm the current situation.

Please feel free to let us know if you need further assistance.

Re

December 26th, 2013 10:47am

So currently the same issue: with a single file server, no matter how many SMB multichannel paths are available, I cannot get over 85-90% of a single 1Gb.

I will be testing the following once I am back at work in a week or so:

  1. Use my home lab to set up a similar set of tests and see if the results are repeatable on completely different hardware/domain/network etc...
  2. Begin planning a proper SoFS test, which will take some time due to the extra server involved (shared storage etc...)

I will post up my results when I have them.


  • Edited by Dan Kingdon Friday, December 27, 2013 11:00 AM
December 27th, 2013 1:58pm

Hi!

On the File Server side, what is the output of these cmdlets:

  • Get-NetAdapter
  • Get-SmbServerNetworkInterface

On the Hyper-V side, what is the output, with a VM actively running, of these cmdlets:

  • Get-NetAdapter
  • Get-SmbClientNetworkInterface
  • Get-SmbConnection
  • Get-SmbMultichannelConnection

This can help figure out why you're not using multiple paths.

My main suspicion is that you have a mix of RSS and non-RSS capable NICs, which could lead to SMB choosing not to use them at the same time. You can read more at

http://blogs.technet.com/b/josebda/archive/2012/11/10/windows-server-2012-file-server-tip-make-sure-your-network-interfaces-are-rss-capable.aspx

There are more network troubleshooting tips for file servers at

http://blogs.technet.com/b/josebda/archive/2014/01/25/troubleshooting-file-server-networking-issues-in-windows-server-2012-r2.aspx
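A quick way to spot a mix of RSS and non-RSS NICs (a sketch; the RSS Capable column in the SMB cmdlets' default output is the one to look at):

  # Physical adapters and link speeds, on both machines
  Get-NetAdapter | Format-Table Name, InterfaceDescription, LinkSpeed, Status -AutoSize

  # On the Hyper-V host (SMB client side) - per-interface RSS/RDMA capability as SMB sees it
  Get-SmbClientNetworkInterface

  # On the file server (SMB server side)
  Get-SmbServerNetworkInterface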

Jose Barreto

February 6th, 2014 9:44pm

Hi Dan,

Maybe a long shot, but I've seen performance issues on network adapters caused by the wrong BIOS settings.
I know that with Dell and HP servers there are BIOS settings like "System Profile" which you can set to "Maximum Performance" (the default is "Performance per Watt", I believe).
Also check the power options in Control Panel and choose the High performance plan.

See this blog by Didier Van Hoye:
http://workinghardinit.wordpress.com/2011/07/01/follow-up-on-power-options-for-performance-when-virtualizing/
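Checking and switching the plan is quick (a sketch; SCHEME_MIN is, as far as I know, the built-in powercfg alias for the High performance plan):

  # Show the available plans and which one is active
  powercfg /list

  # Switch to the High performance plan
  powercfg /setactive SCHEME_MIN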

February 7th, 2014 11:23am

Hi Darryl,

Interesting suggestion.  I hadn't thought of power settings yet!  Sadly the old test setup has been cannibalised for another project, but I will get a new one up and running soon to test the theory.

We also have managed to purchase a set of Chelsio RDMA 10Gb adapters that we will be using very soon to test the 10Gb performance with the same hosts.

I'll post up any findings once I test again.

February 7th, 2014 12:15pm

Hi Dan,

Did you check whether all your NICs are RSS capable? If so, you have to enable the vRSS feature inside the VM to obtain the maximum performance.

Note that vRSS is disabled by default.
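Roughly, the enablement looks like this (a sketch, assuming a Server 2012 R2 guest, a VMQ-capable physical NIC on the host, and the default vNIC name "Ethernet"):

  # Inside the guest: check RSS state on the virtual NIC
  Get-NetAdapterRss -Name "Ethernet"

  # Enabling RSS on the vNIC is what turns vRSS on for that adapter
  Enable-NetAdapterRss -Name "Ethernet"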

Cheers

David

PS

10Gbps NICs will help you a lot with performance. Use the most up-to-date driver and firmware released (and tested) by the HW vendor (Dell, HP, IBM and so on) and test, test, test.

 

February 10th, 2014 7:20pm

Hi Dan,

Did you get any further with troubleshooting?
June 23rd, 2014 1:03pm

Hi, 

Sadly I was never able to find the solution to this problem.  We have since moved over to 10Gb RDMA NICs and are having similar issues.

http://social.technet.microsoft.com/Forums/en-US/dcdeda94-0505-4ab5-beda-f687f73c1aba/hyperv-with-sofs-appalling-write-performance?forum=winserverClustering

  • Marked as answer by Dan Kingdon Monday, June 23, 2014 10:26 AM
June 23rd, 2014 1:26pm

Dan,

I ran into an issue as I was standing up some servers in a Hyper-V cluster recently that was ultimately caused by Virtual Machine Queue (VMQ) configuration.  The NICs required a firmware upgrade to get everything working at the expected throughput of the 1Gb NICs.

It's been a while since I looked into it, but you can quickly determine if this is the source of your issues by pinging from the VM residing on Host B to Host A.  Note the ping time.  Then, disable the Virtual Machine Queues on each NIC.  Repeat the ping test.  In my case, the ping with VMQ'ing enabled was around 30ms and dropped to < 1ms with it off.  
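In cmdlet form the test looks roughly like this (a sketch; "HostA" and the adapter name are placeholders):

  # From inside the VM on Host B: baseline latency to Host A
  ping HostA

  # On the Hyper-V host: which physical NICs currently have VMQ enabled
  Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled -AutoSize

  # Temporarily disable VMQ on a NIC, then repeat the ping from the VM
  Disable-NetAdapterVmq -Name "Ethernet 2"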

The final solution was a firmware upgrade on the NICs.  Once done, performance was as expected.

Please Note:  I didn't take the time to read all responses so I apologize if this has already been addressed in the thread.  Good Luck!

January 30th, 2015 6:14pm

Just as an update on our situation.  

We did ultimately manage to work around this and have been extremely happy with Storage Spaces and SMB 3 since then.  The summary for us was as follows:

  • Extremely poor sequential write performance with standard drives.  Even with 8x SAS 7.2k in a mirror it struggles.
  • Old SATA SSDs that simply were not up to scratch.
  • A small number of SSDs causing a low column count for the matching slow HD tier.  In future I will be more wary of this, especially when speccing for general-purpose file storage (for VM storage this is fine).
  • The new 10Gb NICs we used had an issue with certain offload features.  After extensive testing we found the correct combination of TCP/UDP offload settings (see the sketch below).
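For anyone hitting the same offload issue, the settings can be inspected and toggled per adapter like this (a sketch; the adapter name is a placeholder, and the right combination will depend on the specific NIC and driver):

  # Dump the offload-related advanced properties for comparison across drivers/firmware
  Get-NetAdapterAdvancedProperty -Name "Ethernet 2" |
      Where-Object { $_.DisplayName -match 'Offload' } |
      Format-Table DisplayName, DisplayValue -AutoSize

  # Individual offloads can be toggled with the dedicated cmdlets while testing, e.g.:
  Get-NetAdapterChecksumOffload -Name "Ethernet 2"   # TCP/UDP/IP checksum offload state
  Disable-NetAdapterLso -Name "Ethernet 2"           # Large Send Offload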

For us VMQ was not an issue, as this was traffic between the Hyper-V hosts and the SoFS, not VM traffic.

Hope this helps

  • Marked as answer by Dan Kingdon 1 hour 58 minutes ago
February 2nd, 2015 4:28am

This topic is archived. No further replies will be accepted.
