Hello,
I have a 2-node Win 2012 R2 Hyper-V RDS Cluster hosting Pooled Virtual Desktops. The Cluster Shared Volume is presented directly from the EMC Fibre SAN to the Hyper-V hosts. I want to enable Deduplication on the CSV. All of the documentation suggests not doing this directly on the Hyper-V hosts, but creating a SOFS Cluster that presents the deduped disk to the Hyper-V hosts. This will add a level of complexity to the environment that I would like to avoid. Is potential performance impact the only reason for this? Can I enable dedupe directly on the Hyper-V hosts and the attached CSV volume using the PowerShell command Enable-DedupVolume -Volume C:\ClusterStorage\Volume1 -UsageType HyperV? What are the ramifications?
Thank you.
1) Microsoft dedupe plays nicely with cold data (mostly read-intensive workloads) and does not work well with write-intensive content. See:
Plan to deploy data deduplication
https://technet.microsoft.com/en-us/library/hh831700.aspx
"Does the data access pattern allow for sufficient time for deduplication?
Files that change often and are constantly accessed by users or applications are not good candidates for deduplication.
The constant access and change to the data are likely to cancel any optimization gains made by deduplication, and deduplication may not be able to process the files.
- A good candidate for deduplication is a file share that hosts user documents, virtual files, or software deployment files that contain data that is modified infrequently and read frequently.
- Poor candidates for deduplication are a constantly-mounted SQL Server database that is running virtual machines, and live Exchange Server databases.
Good candidates allow time to deduplicate the files. File age policies can be applied to control when files are deduplicated to prevent early or frequent deduplication of files that are still likely to be modified significantly."
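The file age policy mentioned in the quote can be set per volume from PowerShell. This is only a rough sketch: it assumes dedupe is already enabled on the volume, it reuses the CSV path from the question, and the 3-day age is an illustrative value, not a recommendation for your workload.

```powershell
# Assumption: dedupe is already enabled on this volume; path taken from the question.
$volume = 'C:\ClusterStorage\Volume1'

# Only optimize files that have not changed for at least 3 days (illustrative value).
Set-DedupVolume -Volume $volume -MinimumFileAgeDays 3

# Confirm the policy that is now in effect.
Get-DedupVolume -Volume $volume | Select-Object Volume, MinimumFileAgeDays

# See how many files are in policy and how much space has been saved so far.
Get-DedupStatus -Volume $volume | Select-Object Volume, InPolicyFilesCount, OptimizedFilesCount, SavedSpace
```

A higher file age gives busy files more time to settle before they are optimized, at the cost of slower savings.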
2) Run PerfMon and see how many writes you do on your current CSV compared to reads. If reads are around 80% of your I/O you can enable dedupe on the CSV just fine! If the write share is higher (maybe 40% writes is the critical point) you'll be in trouble, as you'll waste your IOPS for nothing :( See how to work with PerfMon:
Windows PerfMon Counters Explained
http://blogs.technet.com/b/askcore/archive/2012/03/16/windows-performance-monitor-disk-counters-explained.aspx
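If you'd rather script the read/write check than watch the PerfMon graph, something like the following could work. Treat it as a sketch under assumptions: the function names Get-ReadSharePercent and Measure-DiskReadShare are made up for this example, the counter names are the English-locale ones, and '_Total' covers every disk on the host, so you would want to narrow the instance to the disk behind your CSV.

```powershell
# Hypothetical helper: percentage of I/O operations that are reads.
function Get-ReadSharePercent {
    param(
        [Parameter(Mandatory)] [double] $ReadsPerSec,
        [Parameter(Mandatory)] [double] $WritesPerSec
    )
    $total = $ReadsPerSec + $WritesPerSec
    if ($total -le 0) { return $null }   # no I/O observed, so the ratio is undefined
    return [math]::Round(100 * $ReadsPerSec / $total, 1)
}

# Hypothetical wrapper: samples the PhysicalDisk counters and averages them.
function Measure-DiskReadShare {
    param(
        [string] $Instance = '_Total',   # assumption: replace with the CSV disk instance
        [int] $SampleIntervalSec = 5,
        [int] $MaxSamples = 12
    )
    $paths = "\PhysicalDisk($Instance)\Disk Reads/sec",
             "\PhysicalDisk($Instance)\Disk Writes/sec"
    $samples = (Get-Counter -Counter $paths -SampleInterval $SampleIntervalSec -MaxSamples $MaxSamples).CounterSamples

    # Average each counter over the sampling window (-like is case-insensitive).
    $reads  = ($samples | Where-Object { $_.Path -like '*disk reads/sec' }  | Measure-Object CookedValue -Average).Average
    $writes = ($samples | Where-Object { $_.Path -like '*disk writes/sec' } | Measure-Object CookedValue -Average).Average

    Get-ReadSharePercent -ReadsPerSec $reads -WritesPerSec $writes
}
```

Run Measure-DiskReadShare on each Hyper-V host during normal working hours; a one-minute sample at a quiet time will not tell you much about a pooled VDI workload.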
3) Make sure you run the Microsoft dedupe evaluation tool (DDPEVAL) and find out how much space you'll actually save. Do this before doing anything else :) See:
Evaluate dedupe savings with DDPEVAL
http://blogs.technet.com/b/klince/archive/2012/08/09/evaluate-savings-with-the-deduplication-evaluation-tool-ddpeval-exe.aspx
You may easily find the juice is not worth the squeeze :) Even with uber-expensive EMC SAN disk space.
Good luck!