Adding a WS2012 Hyper-V Cluster to SCVMM2012 SP1 causes volumes on Dell EqualLogic to go offline

Hi,

I've recently built three WS2012 Hyper-V clusters with Dell EqualLogic storage.  The cluster builds were straightforward and are performing well.

The same issue has happened on all three when push-installing the VMM agent to the hosts and adding the cluster to VMM.  On each occasion, the cluster volumes on the Dell EqualLogic were taken offline for a period, causing the virtual machines on those volumes to pause in a critical state.

After SCVMM was finished adding the hosts, the volumes could be brought back online and the virtual machines started.  Bit heart stopping though the first time it happened!

Whilst it happened, lots of errors with event IDs 5120, 5142, 1557, 1558, and 1069 appeared in the logs of each host - basically relating to the volumes going offline, but not helping to point out how or why.
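
To see which hosts logged which of these events, and how often, the logs can be tallied offline. The following is only a minimal sketch: it assumes the events have already been exported to CSV with columns named `MachineName` and `Id` (a hypothetical export layout), and the host names in the sample are made up.

```python
import csv
import io
from collections import Counter

# Event IDs mentioned in the post above.
CLUSTER_EVENT_IDS = {5120, 5142, 1557, 1558, 1069}


def count_cluster_events(csv_text):
    """Count matching events per (host, event ID) in exported CSV text.

    Assumes columns named "MachineName" and "Id" (hypothetical layout).
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            event_id = int(row["Id"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable event ID
        if event_id in CLUSTER_EVENT_IDS:
            host = row.get("MachineName") or "unknown"
            counts[(host, event_id)] += 1
    return counts


# Made-up sample data, for illustration only.
sample = """MachineName,Id
hv01,5120
hv01,5120
hv02,1069
hv02,7036
"""

for (host, event_id), n in sorted(count_cluster_events(sample).items()):
    print(host, event_id, n)
```

Comparing the per-host counts against the time the VMM agent was installed may show whether every node lost the volumes at the same moment.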

In all of the cluster builds affected "Do not allow cluster communication for this network" has been selected for the iSCSI network.

The VMM logs had the following "Completed w/ Info" warning for each host after adding the cluster: "Warning (26211) A restart is required to complete claiming of multi-path I/O devices on host <host FQDN>."

I'm wondering if there is something strange happening with the new SMP storage management capabilities of VMM 2012 SP1?  I didn't ask VMM to try to manage the storage whilst adding the Hyper-V hosts, so why it should interfere with the storage I don't know :(

Has anyone run into something similar?  I would like to get to the bottom of it, as Hyper-V with Dell EqualLogic storage is a very common build for us.

Cheers

James


January 25th, 2013 2:57pm

Same problem here, but with DataCore SANsymphony-V iSCSI storage. The problem appeared when I tried to install the agent on a W2012 Hyper-V host. I do not dare to install the agent on our W2008 R2 Hyper-V cluster.

Does anyone have an explanation or a solution for this?

The hotfix 2813630 only applies to W2012 clusters.


  • Edited by AndersP Thursday, June 13, 2013 8:55 AM
June 13th, 2013 8:49am

Hi, 

We have three different W2012 R2 clusters, with 2, 4 and 9 nodes (all on EqualLogic PS6100, 2 pools, HIT Kit 4.7).

We are receiving the above error suddenly, without any pattern we can find. (We also contacted Dell/EqualLogic, but they can't help.)

The problems got far less frequent after the December rollup; it is just a shame that this is still a problem.

From our point of view, it seems to matter how many nodes are in the cluster and how many CSVs there are:

2-node cluster + 1 CSV: never seen the issue

4-node cluster + 2 CSVs: seen the issue 5 times

9-node cluster + 8 CSVs: seen the issue a lot (we don't have a number, but above 10)

When the problem occurs, some VMs seem to survive, some report "no booting device", and Linux servers go "zombie" and have to be rebooted.

Does anyone with this problem also see the paged pool size on the node that holds the cluster role rising constantly? (You can check with the performance counter Memory\Pool Paged Bytes; we lost GBs of RAM before we found a way to see where the RAM was going.)
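
For anyone who wants to compare numbers, here is a rough sketch of how logged samples of that counter could be checked for steady growth. It assumes the samples were saved as a two-column CSV (timestamp, value) with one header row, similar to a PDH-style CSV counter log; the host name `HV01` and the values are made up for illustration.

```python
import csv
import io


def paged_pool_growth(csv_text):
    """Return (first, last, rising) for a two-column counter log.

    Assumed layout: one header row, then rows of (timestamp, value).
    `rising` is True when no sample is lower than the one before it
    and the last sample is higher than the first.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    values = []
    for row in reader:
        if len(row) < 2:
            continue
        try:
            values.append(float(row[1]))
        except ValueError:
            continue  # blank or non-numeric sample
    if len(values) < 2:
        raise ValueError("need at least two samples")
    never_drops = all(b >= a for a, b in zip(values, values[1:]))
    return values[0], values[-1], never_drops and values[-1] > values[0]


# Made-up sample data, for illustration only.
sample = r'''"(PDH-CSV 4.0)","\\HV01\Memory\Pool Paged Bytes"
"04/15/2015 11:00:00.000","1000000000.000000"
"04/15/2015 11:01:00.000","1200000000.000000"
"04/15/2015 11:02:00.000","1500000000.000000"
'''

first, last, rising = paged_pool_growth(sample)
print("growth in MiB:", round((last - first) / 2**20, 1), "rising:", rising)
```

A strictly non-decreasing series over many hours would point to a leak rather than normal fluctuation, but a short sample like this one proves nothing on its own.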

Regards

Markus


  • Edited by vielm Wednesday, April 15, 2015 12:00 PM
April 15th, 2015 11:57am

This topic is archived. No further replies will be accepted.
