VM's status is missing on one node but is up and running on another node (SCVMM 2008 R2)
Hi!

One of our servers (a cluster node) stopped responding, so we first tried to migrate the VMs from that node to another. When the migration timed out, we had to power-reset the server, and another node started the VMs.

Now to the real problem: in VMM 2008 R2 the migrated VMs show as running on one node but are also reported as missing on the original node (the one that owned the VMs before the reset). If I look in Hyper-V Manager there are no VMs on the original node, and in Cluster Manager everything looks fine.

So how can I get rid of the orphaned VMs? They are listed, but I cannot do anything with them. There is a repair option, but all of its choices are greyed out. There is a "Delete" option, but that one deletes the config of the VM (it warns that the VM file is already in use by another VM).

Any suggestions?

Regards,
Alexander
  • Edited by Alexander Mansson Monday, October 19, 2009 12:43 PM added info to subject
October 19th, 2009 12:19pm

Hiya,

If you delete a server that is in "missing" state from SCVMM, it should not delete the files. That's the reason it's reported missing: it can't see any files.

Are your hosts at SP2 or later?


October 19th, 2009 1:32pm

Hi!

The host is Windows Server 2008 R2, and the VMs are on a cluster shared volume.

If I try to delete I get the message

"Error (802)

The VM file SRV002.domain.local is already in use by another VM.

Recommended Action

Wait for the object to become available, and then try the operation again."

Regards,
Alexander

October 19th, 2009 2:01pm

Alexander,

The easiest fix for this particular issue is to remove your cluster from Virtual Machine Manager and re-add it. When the cluster is re-added, it will refresh all the guests, and you'll be back to normal. We have to walk through this process every now and then when something goes horribly wrong on one of the cluster nodes. Unfortunately, there's not a better way to deal with this particular issue at this time.

One thing to be aware of: removing and re-adding the cluster to VMM can sometimes inadvertently trigger a reboot of the Hyper-V servers. VMM looks for a pending-reboot flag in the registry of the Hyper-V host, and if it's set, it will send a reboot command to the host (killing all of the VM workloads and wreaking havoc on your infrastructure). So before you perform the operation, verify that this flag isn't set in the registry of any of your Hyper-V hosts (some Windows Updates or app installs can set the key but never actually reboot the machine).

A script you can run to check for pending reboots is:

# Reports whether a remote host has a Component Based Servicing reboot pending.
Function Check-PendingReboot ($computer = $(throw "Please pass a computername"))
{
    # Open HKLM on the remote host (requires remote registry access).
    $baseKey = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("LocalMachine", $computer)
    $key = $baseKey.OpenSubKey("Software\Microsoft\Windows\CurrentVersion\Component Based Servicing\")
    # A "RebootPending" subkey means the host is waiting for a restart.
    if ($key.GetSubKeyNames() | ? {$_ -eq "RebootPending"}) {"There's a reboot pending.  Don't upgrade VMM!"}
    else {"No reboot pending.  You're OK to upgrade VMM.  :)"}
    $key.Close()
    $baseKey.Close()
}
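
As a usage sketch, the function above could be run against every node before removing the cluster. The node names here are placeholders, not hosts from this thread, and the loop assumes Check-PendingReboot has already been loaded into the session:

```powershell
# Hypothetical node names; substitute the Hyper-V hosts in your own cluster.
$nodes = "HV-NODE1", "HV-NODE2", "HV-NODE3"

foreach ($node in $nodes) {
    # Check-PendingReboot is the function shown above; it returns a status string.
    "{0}: {1}" -f $node, (Check-PendingReboot $node)
}
```

If any node reports a pending reboot, it is probably safer to restart that node in a maintenance window before touching the cluster in VMM.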
Thanks,

Janssen

October 21st, 2009 3:19am

Thank you Janssen!

I will try this and post back with the result.

Because it is only an annoyance, I will try this as part of our maintenance plan, although I will run the script first and see if it works without a reboot (good to know for next time.. :).

Regards,
Alexander
October 21st, 2009 6:09am

Hi,

I was having the exact same issue. I was running SAN failover tests to see how Hyper-V + clustering would react and got these duplicate virtual machines. Now the real problem is that I can't remove the cluster from VMM.


command: Remove-VMHostCluster
status: failed
Error (801)
VMM cannot find VirtualHardDisk object .
Recommended Action
Ensure the library object is valid, and then try the operation again.


I've also tried from PowerShell using the -force switch, to no avail. Now I'm just totally stuck with this cluster.

The cluster is a 3-node cluster using CSV. One node was removed successfully, but one is now permanently in a pending state.

I would rather not reinstall SCVMM R2, because it's managing other clusters and hosts.
January 4th, 2010 11:26am

Will this lose any information from the virtual machines such as description, quota, owner, costcenter, selfserviceuserrole, creationsource etc? We have around 100 virtual machines used mostly from the self service portal so this information is pretty important.

January 28th, 2010 4:30am

Hans,
I believe that *will* delete all that data.  Unfortunately, at the moment, I don't believe there's a supported way to clean up broken/missing VMs other than the workaround listed above.   Perhaps if you open a ticket with PSS/CSS, they could assist you in manually cleaning up the VMM DB.  I'm filing an issue with the VMM team to point out this issue, and hopefully we'll see a better fix/workaround in the future.
Janssen
January 28th, 2010 3:53pm

Just had this issue occur, is a better solution pending?
March 24th, 2010 11:42pm

Aaron,

I'm not sure where Microsoft is in the way of a patch on this one.  You *can* directly clean it up inside the VMM DB, but it's not supported or recommended (but we've done it on a few occasions).  A sysadmin on my team actually wrote a PowerShell script to clean up the DB, but I wouldn't feel comfortable distributing it, given the unsupported nature, and potential for risk.  However, if you want to poke around, you can look in the DB at the following tables:  tbl_WLC_VObject, and tbl_WLC_VMInstance.  Do so at your own risk.  Hopefully a fix will show up soon.

April 2nd, 2010 3:16am

Janssen,

Thanks for that info, we did the "remove host cluster, re-add host cluster" method but it's a little time consuming and loses history and config information. If I need to do it again I'll look at those tables.

April 7th, 2010 7:57pm

Michael Michael just informed me that we've finally got some working solutions available via some scripts he's just posted to his blog.

One script allows an export/import of VMM metadata.  

The second script is a SQL script that deletes missing VMs directly out of the DB.  Thanks, Michael!

April 16th, 2010 8:23pm

Hi!

We got this issue once again... We will try that SQL script next Tuesday (planned downtime) and report back with the results.

Regards,
Alexander

May 5th, 2010 6:42am

Hi!

The results were successful: all missing VMs were removed when following the suggestion from Janssen (with the SQL script).

Thank you for your help!

Regards,
Alexander

May 18th, 2010 12:04pm

It's not optimal, but shutting down the VM and moving it from one node to another via Failover Cluster Manager seems to cause SCVMM to find the "Missing" VM again. I imagine performing a Live Migration would also work, though I have not had the opportunity to try it yet.

I guess the integration between SCVMM and Hyper-V clusters still needs some work...

June 25th, 2010 8:30pm

I can confirm that doing a live migration using failover cluster manager did indeed resolve this issue for me.  No need for database spelunking and I was able to retain my custom attribute data.
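
For anyone who prefers the command line to the Failover Cluster Manager GUI, the same kind of move can be sketched with the FailoverClusters PowerShell module. This is only a sketch: the role and node names are placeholders, and it assumes the module is available where you run it. Note that the clustered role name is not always identical to the VM name, so it is worth listing the groups first.

```powershell
Import-Module FailoverClusters

# List the clustered roles so you can find the one that holds the "Missing" VM.
Get-ClusterGroup | Select-Object Name, OwnerNode, State

# Placeholder names: the clustered VM role and the node to move it to.
$vmRole     = "MyVM"
$targetNode = "HV-NODE2"

# Move the VM role to the target node.
Move-ClusterVirtualMachineRole -Name $vmRole -Node $targetNode
```

After the move, VMM still has to refresh the host before the status changes, so allow some time or refresh the VM manually in the console.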

In my case I had a handful of VMs "missing" after having to shutdown all hosts for a maintenance event.  

Hopefully this will work for others as well.

October 7th, 2014 7:42pm

I just had this happen on an all-2012 R2 cluster and SCVMM. It occurred after I used Cluster-Aware Updating. Half of my VMs showed as Missing. The quick fix was to simply use Failover Cluster Manager to Live Migrate the VMs to the hosts where SCVMM last saw them. The hosts auto-refresh and the VMs go back to a good Running status. Easy peasy.

For me, the real question is why the heck this happens in the first place when using built-in tools. Shouldn't the SCVMM agents keep up with the movement generated by the CAU feature of the cluster?
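
To see which VMs are affected and where VMM thinks they live, something along these lines may help on VMM 2012 and later. It is a sketch only: the VMM server name is a placeholder, and it assumes the VMM console (and with it the virtualmachinemanager module) is installed on the machine you run it from.

```powershell
Import-Module virtualmachinemanager

# Placeholder VMM server name; connect before querying.
Get-SCVMMServer -ComputerName "vmm01.domain.local" | Out-Null

# List VMs that VMM reports as Missing, along with the host VMM associates them with.
Get-SCVirtualMachine |
    Where-Object { $_.Status -eq "Missing" } |
    Select-Object Name, VMHost, Status
```

The host column gives you the target node for the Live Migration in Failover Cluster Manager.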

Is there a better way to do it until the features are in sync? After Rollup 7, you'd think they would be.

September 11th, 2015 2:39pm

This topic is archived. No further replies will be accepted.
