Compact differencing VHD or VHDX

Hi,

Is it possible to compact a differencing disk similar to how a dynamically expanding VHD can be compacted?

There's this TechNet article for Virtual Server, which says it doesn't work for differencing disks: "If you want to compact a differencing virtual hard disk or an undo disk, you must merge the changes to the parent disk and then compact the parent disk, if it is a dynamically expanding virtual hard disk"

However, this article about a C++ API clearly says it's possible: "Compaction can be run only on a virtual disk that is dynamically expandable or differencing."

I've tried to do it with the PowerShell cmdlet Optimize-VHD. Its documentation (kind of) implies that differencing VHDs work too: "Optimizes the allocation of space used by virtual hard disk files, except for fixed virtual hard disks."

The results from my tests are strange. It works somehow, but definitely not how I expected it to work. I did the following:

First, create a new dynamic VHD:

New-VHD -Dynamic -Path C:\VHD\BaseDrive.vhd -SizeBytes 3GB

Mount it:

Mount-VHD -path .\BaseDrive.vhd

Open Disk Management. It prompts for initializing the mounted drive.
Confirm (MBR). Format as NTFS and Assign Drive Letter E.

Write a file to the disk (test.zip, 296 MB).

Call SDelete to overwrite empty blocks with zeros:
sdelete.exe -z E:

Dismount-VHD -path .\BaseDrive.vhd

Create a differencing VHD and mount it:

New-VHD -Differencing -Path .\DiffDisk.VHD -ParentPath .\BaseDrive.vhd
Mount-VHD -path .\DiffDisk.VHD

Copy some files to E:. The file size of DiffDisk.VHD increases to 567 MB. Now delete all the files you just added and run sdelete again with the same parameters. The file size does not change (still 567 MB).

Now dismount the disk and call Optimize-VHD:

Dismount-VHD -path .\DiffDisk.vhd
Optimize-VHD -path .\DiffDisk.VHD -Mode Full

The expected result would be a file size close to 0 for DiffDisk.VHD. However, it's now 22 MB. Mounting it still works correctly, but I don't understand why 22 MB remain. In this test the improvement is still quite good, but it's just a test case; in the real-world case there is unfortunately no improvement (more like the opposite). There, sdelete even increases the size of the diff for an 80 GB VHD from about 1.5 GB to 6 GB, and Optimize-VHD then reduces it again only slightly, to roughly 5.8 GB.

My understanding is that sdelete should overwrite all empty blocks on the disk with zero, including those where the deleted files were stored. Optimize-VHD should be able to completely remove all these blocks. I have no idea where these 22 MB come from.

Inputs, explanations or solutions are very welcome!



March 20th, 2015 2:45pm

The differencing disk holds the changes made, there will always be a size on it, you can't compact it to 0.

You wrote changes to the diff disk, size increase, you deleted files, there's still going to be some kind of disk footprint for that activity. Swap file, timestamp modifications, etc.

The only way you're going to get 0, would be to delete the diff disk, since that would mean there have been zero changes to the parent disk.

March 20th, 2015 4:19pm

"compacting" does not necessarily reduce the space consumed on storage.

It can, but only if the underlying storage system supports TRIM.

Also, you have the impacts of what is inside of the VHD vs. its storage allocation on disk.

And, dynamic virtual disks (and differencing disks) do not auto shrink when stuff is deleted.

Also, you cannot reduce the size or change the size of a differencing disk due to the dependency on its parent (it has to resemble its parent), and that it is a wrapper around a block layout not a bunch of files.  It is this block layout (just like any physical disk) that complicates things greatly.

Over time, differencing disks have the potential to consume up to their maximum expansion size - thus when you have the parent plus the child you have the potential to consume more storage than the parent alone.

If you want to guarantee to recover the space - merge the differencing disk to the parent and then shrink that disk.

Or Export the entire machine to a new VHD.

And, I have been told that if you perform a storage migration of the virtual disks (with Hyper-V 2012 R2) the underlying copy mechanism actually optimizes the disk layout - and you may regain some space by optimizing after that.

It really depends on the intent of what you are attempting to do.

March 20th, 2015 4:22pm

"The differencing disk holds the changes made, there will always be a size on it, you can't compact it to 0.

You wrote changes to the diff disk, size increase, you deleted files, there's still going to be some kind of disk footprint for that activity. Swap file, timestamp modifications, etc."

I wrote "close to 0". In my example, there's just one file (test.zip) on the base disk, and then I add a bunch of files (or just one file) to the diff disk. There's no page file on the mounted VHD, and even if there's a "last accessed" timestamp or something similar on the drive or on the single file in it, that should not amount to 22 MB. I have also tried a PowerShell script that does the same as sdelete, and it ends up with 16 MB. So it might be related to the zeroing somehow.

Anyway, the bigger concern than this example is the other test I did with the real disk (a Windows 7 system disk). There, SDelete makes the diff larger by multiple GB and then Optimize-VHD isn't able to shrink that zeroed space (or only a very tiny bit of it).

Maybe I should give some background information, why this is important for me. We're using XenClient at work, which uses differencing disks for distributing updates to clients. These Differencing disks need to be downloaded by all clients across potentially slow network connections. Now just a naked and optimized Windows 7 install (swap disabled, defrag disabled, ready-boot disabled, super-fetch disabled, system-restore disabled etc...) produces a diff-disk of around 700 MB when you just start it and leave it on the logon prompt. Also there are a lot of patching mechanisms that download sources and extract them before installing. The zipped download and the extracted sources can be deleted, but they still add to the update size on the block-level. It would be nice to strip this kind of temporary data from the differencing VHD before compressing it and distributing it to clients.

March 20th, 2015 4:47pm

It really depends on the intent of what you are attempting to do
March 20th, 2015 4:56pm

I have not worked with XenClient in any detail.

It uses compression and has various deployment options, outlined in the admin guide, to deal with slow links.  But I assume that you are following that already.

And I am guessing that you have contacted support for assistance or the forums.

VHD has some limitations.  And this is one of them.

But you are specifically dealing with differencing disks as well.  Which is different than dealing with a blank VHD.  The real key with differencing disks and keeping them small is to add data, not replace data. 

And the file you write ends up consuming an entire block, no matter how small it is.  So if you replace a 1GB file with a 50MB file, you don't actually recover the difference, because a VHD is a block layout.  You still consume the entire block.

It is far easier to optimize VHDs as you can just rearrange and compress the files.  You can't just do that with a differencing disk.

March 20th, 2015 5:16pm

I've read all the documentation I could find, even the VHD file format specification ;). I think I understand the problem now because of your example.

Block size is 2 MB by default, and also in my test case. Inside the blocks, there are sectors of 512 bytes.
Replacing a 1 GB file with a 50 MB file would result in 500-25=475 blocks becoming free and 25 blocks being added to the VHD diff (1 GB = 1000 MB here to simplify it a bit). At first, the 475 blocks are only marked as free in the file system; no data is written to them. When sdelete or something similar is run, those blocks are zeroed, which means 475 blocks containing only zeros are added to the diff disk and the diff disk increases in size. Optimize-VHD cannot remove these 475 zeroed blocks: without them, the parent's old data would show through again and the final disk (the combination of base and diff) would change, so it has to keep them.
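
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the simplified model in this post (1 GB = 1000 MB, 2 MB blocks, the new file reusing 25 of the old file's blocks, sector bitmaps and file system metadata ignored); the function name and the dictionary keys are made up for this example.

```python
import math

BLOCK_MB = 2  # default VHD block size, as stated in the post


def diff_blocks_after_replace(old_file_mb, new_file_mb, block_mb=BLOCK_MB):
    """Estimate diff-disk contents when a file on the base disk is replaced.

    Assumption (from the post's simplified model): the new file lands in
    blocks previously used by the old file, so the remaining old blocks
    are merely 'free' until something zeroes them.
    """
    old_blocks = math.ceil(old_file_mb / block_mb)
    new_blocks = math.ceil(new_file_mb / block_mb)
    freed_blocks = old_blocks - new_blocks
    return {
        "new_data_blocks": new_blocks,
        "freed_blocks": freed_blocks,
        # Before zeroing, only the blocks holding new data are in the diff.
        "diff_mb_before_zeroing": new_blocks * block_mb,
        # Zeroing writes to every freed block, so each one joins the diff.
        "diff_mb_after_zeroing": (new_blocks + freed_blocks) * block_mb,
    }


result = diff_blocks_after_replace(1000, 50)
print(result)
```

Under these assumptions the diff holds about 50 MB before zeroing and about 1000 MB afterwards, which matches the direction of what I saw with the real disk.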

This is an explanation why the diff disk size increased after zeroing and why Optimize-VHD only helped a little bit. Now to get the complete picture, I should probably take compression into account. Zeroed blocks should compress quite well.

Still, I think it would be technically possible and not too difficult to loop over all blocks in the diff-disk and remove those blocks that were zeroed in the base disk. Unfortunately I don't know how to do that with existing tools and I will not spend days of coding to develop a potentially fragile optimization.
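
As a starting point for such a loop, here is a rough, read-only Python sketch that lists which blocks a dynamic or differencing VHD has allocated, based on my reading of the VHD format specification (footer copy at offset 0, dynamic disk header with the "cxsparse" cookie, block allocation table of big-endian 4-byte entries where 0xFFFFFFFF means "not allocated"). It does not apply to VHDX, it does not remove anything, and the function name is made up for this example.

```python
import io
import struct

UNUSED = 0xFFFFFFFF  # BAT entry value for a block that is not allocated


def allocated_blocks(f):
    """Return (block_size_bytes, indices of allocated blocks) for a VHD.

    f is a binary file-like object positioned anywhere. Only dynamic and
    differencing VHDs have the structures read here.
    """
    f.seek(0)
    footer_copy = f.read(512)
    if footer_copy[:8] != b"conectix":
        raise ValueError("no VHD footer copy at offset 0")
    # Bytes 16..23 of the footer: offset of the dynamic disk header.
    (header_offset,) = struct.unpack(">Q", footer_copy[16:24])

    f.seek(header_offset)
    header = f.read(1024)
    if header[:8] != b"cxsparse":
        raise ValueError("no dynamic disk header found")
    # Bytes 16..23: BAT offset; 28..31: max table entries; 32..35: block size.
    (table_offset,) = struct.unpack(">Q", header[16:24])
    max_entries, block_size = struct.unpack(">II", header[28:36])

    f.seek(table_offset)
    bat = struct.unpack(f">{max_entries}I", f.read(4 * max_entries))
    return block_size, [i for i, entry in enumerate(bat) if entry != UNUSED]
```

Usage would be something like `with open("DiffDisk.VHD", "rb") as f: size, blocks = allocated_blocks(f)`. Deciding which of those blocks are safe to drop would additionally require comparing each block's sector bitmap and data against the parent, which is the fragile part.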

I requested opening a Citrix support case for this problem last Tuesday, btw. In larger companies it's sometimes surprisingly difficult to explain why spending a bit of money to save a lot of money is a good idea. ;)

Another thing I'm starting to understand is why the Windows Performance Analyzer has shown around 40 MB of disk writes when the diff disk increased by around 600 MB. This might be due to the block-level differences. Writes to 350 different blocks would be 700 MB if the whole block has to be added to the diff disk every time. I should read the VHD spec again... but this would mean that writes to small files have a very bad effect on the size of the diff disk. Maybe reducing the block size could help? I'll try that too.
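
A quick back-of-the-envelope check of that idea in Python. It assumes that every touched block is added to the diff in full, and the 512 KB block size is only a hypothetical value for comparison; with smaller blocks the same writes may also touch more blocks, so the second figure is optimistic.

```python
def worst_case_growth_mb(touched_blocks, block_kb):
    """Diff growth if every touched block is allocated in full."""
    return touched_blocks * block_kb / 1024


written_mb = 40        # writes reported by Windows Performance Analyzer
touched_blocks = 350   # number of distinct blocks assumed in the post

growth_2mb = worst_case_growth_mb(touched_blocks, 2048)
growth_512kb = worst_case_growth_mb(touched_blocks, 512)

print(growth_2mb, growth_2mb / written_mb)      # growth in MB, amplification factor
print(growth_512kb, growth_512kb / written_mb)
```

With 2 MB blocks this gives 700 MB for 40 MB of writes; if the number of touched blocks stayed the same, 512 KB blocks would bring that down to 175 MB.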

March 20th, 2015 6:19pm

This topic is archived. No further replies will be accepted.
