Recovering Hyper-V virtual machines that have disappeared
This may sound strange, but as anyone who has used Microsoft Hyper-V for server virtualization for any length of time knows, it is something you have either confirmed to be true or that has you believing you are losing your mind.
In test environments, disappearing virtual machines may have been attributed to not stopping at Starbucks that morning, to hallucinations from lack of sleep after the previous night's all-night server maintenance, or simply to doubt, in the busy world of IT, about whether the missing machine had actually been created in the first place or had been accidentally deleted somewhere in the fray.
In some cases, when the affected virtual machine is heavily relied upon for day-to-day business and it suddenly just isn’t there, this problem is confirmed and we start hunting for solutions.
For those who are not sure they saw correctly, I am here to confirm for you that Hyper-V virtual machines can and do disappear. They sometimes disappear right before your very weary, sleep-deprived eyes as you stare in disbelief at your Hyper-V Management Console, straining to find that virtual machine you spent 2 weeks setting up and not finding it in the list of available machines.
What happened to it?
Well, simply put, the XML configuration file for that machine has most likely become corrupt, which makes the Hyper-V management service unable to process it, so the machine does not appear in the Hyper-V console.
Now don’t just go and start deleting files and copying things around. Hyper-V uses an amusingly complicated series of GUIDs, cross-linked in multiple XML files and similarly named folders in a particular hierarchy, to keep all the virtual machines, their hard disks and their snapshots in sync. One wrong move here and you will lose a snapshot or possibly the entire virtual machine. That can also be recovered if you have the virtual disk files, but that is research for another article.
Back to our corrupt XML file. The solution to this problem is really very simple. Find the XML configuration file for the virtual machine that just disappeared, make a backup copy of it, and then open it in Internet Explorer. If you scroll down through the file, you should notice that IE cannot render the entire XML file. This is due to some malformed XML that Hyper-V has randomly inserted into your file as a bonus, just to give you more gray hair and raise your blood pressure a bit if it wasn't high enough already.
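If you would rather not eyeball the file in a browser, a small script can make the backup copy and point at the first place where an XML parser gives up. The following is only a minimal sketch in Python using the standard library, not part of the fix I actually performed; the function names and the ".bak" suffix are assumptions chosen for illustration.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Optional, Tuple


def find_xml_error(data: bytes) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first parse error, or None if well-formed."""
    try:
        # Passing bytes lets the parser honor the encoding declared in the file.
        ET.fromstring(data)
    except ET.ParseError as exc:
        line, column = exc.position
        return line, column, str(exc)
    return None


def backup_and_check(config_path: str) -> Optional[Tuple[int, int, str]]:
    """Copy the file next to itself with a .bak suffix (hypothetical naming), then check it."""
    path = Path(config_path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # never touch the original before a copy exists
    return find_xml_error(path.read_bytes())
```

Calling backup_and_check with the path to the missing machine's configuration file returns None when the file parses cleanly, or the line and column where parsing stopped, which is a reasonable place to start looking.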
If you open your XML file in Visual Studio or some other IDE that marks up XML, you can see the malformed XML and, with a little examination and care, determine what needs to be fixed. I have provided an example of one of my corrupt files below. If you look carefully, you will see that there are two </configuration> tags at the end of the file. Just after the first one, some of the information from just above it appears to have been mangled a bit and then repeated, resulting in some bad XML and a second closing </configuration> tag. I removed everything after the first </configuration> tag, saved the file, restarted my Hyper-V Management Service, and my virtual machine reappeared in my Hyper-V management console.
I have seen other examples of this where the XML file contained control characters or symbols, which cause the same problem. Removing those characters and restoring the proper XML format usually fixes the problem.
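The trim I did by hand can be sketched in a few lines of Python. This is an illustration under a strong assumption: that the corruption is exactly the kind described above, where the only damage is extra text after the first closing </configuration> tag. The function name is made up for this example, and the result should still be parsed and reviewed before it replaces the real file.

```python
import xml.etree.ElementTree as ET


def trim_after_root_close(text: str, root_tag: str = "configuration") -> str:
    """Drop everything that follows the first closing root tag."""
    closing = f"</{root_tag}>"
    end = text.find(closing)
    if end == -1:
        # Nothing to anchor on: this sketch cannot help with a truncated file.
        raise ValueError(f"no {closing} tag found")
    return text[: end + len(closing)] + "\n"


# A tiny stand-in for a damaged file: repeated fragment plus a second closing tag.
damaged = (
    "<configuration><name>vm</name></configuration>\n"
    "<name>v</name></configuration>\n"
)
repaired = trim_after_root_close(damaged)
ET.fromstring(repaired)  # raises ParseError if the trimmed text is still malformed
```

If the closing tag is missing altogether, the function raises rather than guessing, since there is no safe cut point in that case.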
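For the stray control character case, the cleanup can also be scripted. The sketch below removes any character that falls outside the range the XML 1.0 specification allows in a document; the function name is hypothetical, and it assumes the file has already been decoded to text. It only deletes illegal characters, so any tags they damaged still have to be repaired by hand.

```python
import re

# Characters permitted by the XML 1.0 "Char" production; everything else is removed.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters that are illegal in XML 1.0 removed."""
    return _INVALID_XML_CHARS.sub("", text)


cleaned = strip_invalid_xml_chars("<name>vm\x00\x07</name>")
```

Tabs, line feeds and carriage returns are kept because they are legal XML whitespace.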
So, when it appears that all is lost, it is actually a very simple fix. Special thanks to Microsoft for giving me a heart attack and hopefully they can tighten this up in a future build.
Good luck.
Example of the malformed XML at the end of the configuration file. The text in red is the extra repeated XML.
The end of the XML file should look like this:
To stop and restart the Hyper-V Virtual Machine Management Service from a command line, run the following two commands:
net stop "Hyper-V Virtual Machine Management"
net start "Hyper-V Virtual Machine Management"
March 15, 2011 at 11:09 pm
This happened to me, but in my case I was unable to correct the file since it was very corrupted (the end of the XML was missing). Still, this post helped me to understand what happened when the VM disappeared!
Be advised not to make any modification to the XML, or you will end up punching yourself like I did!
Thanks for sharing your experience!
July 21, 2011 at 3:24 pm
Thanks for sharing, any idea what to do when a whole VHD disappears?
In our case, it was a domain controller that we just ended up re-creating, but I can’t think of what could happen if it was the database …
July 21, 2011 at 4:52 pm
I never ran into that. I only had this problem where the virtual machine disappeared. Hyper-V does seem to have its quirks though. I have been using VMware now for several months and haven't seen the same wackiness.
July 26, 2011 at 3:05 pm
Thank you for this post. I had a VM disappear during a host update this morning. I edited the XML file as you suggested, restarted Hyper-V and there it was. You saved me 6 hours of re-creation and 2 days of synchronization.
July 27, 2011 at 12:46 am
No problem. Glad it was helpful. I know I was relieved when I figured that out because I had the same problem a few times before this. I am still not sure what causes the corruption in the XML file but it happens every once in a while.
November 3, 2011 at 8:33 pm
Great! You saved my night!
November 3, 2011 at 11:12 pm
Glad to help.
November 9, 2011 at 8:30 pm
Excellent post… this error was the same one I had with a customer… but the corruption in the XML file was at the end, after the configuration close tag.
November 15, 2011 at 9:18 pm
I wish I was that lucky. In our case, the drive containing the VM was starved for space (a snapshot became too big), and by the time we realized it, the XML file was gone. The VM was no more.
Ended up restoring an image we had, but all customization was gone.
November 16, 2011 at 4:57 pm
Ah, sorry to hear that. Virtual machines are great things but backups are still really important. We are moving most of our virtual machines into the Amazon cloud now because they are then spread over many servers around the world so if something happens to one set of servers or one location, the servers are still ok.