Windows 8 DeDuplication = Awesome

I do really love my Asus Zenbook; it’s the fastest, lightest laptop I’ve ever had. It powers on nearly instantly, especially with Windows 8, and the battery life is phenomenal. However, even the maxed-out version only comes with 4GB of RAM and a 256GB SSD, so sometimes I just need more space. I decided to see what data deduplication would get me. A large number of the files I keep on my second partition are ISO files and lab files for various things, plus I usually have a few Hyper-V VMs as well.

I have read various forum posts, but didn’t see any comprehensive guides on how to put everything together, until I was nearly finished with my write-up and discovered fellow MVP The Wei King’s post. Doh! I wish I had found that sooner. Regardless, his post has an excellent walkthrough, and I’ve provided my own walkthrough with some additional details below.

My disk prior to dedup:

image

My Hyper-V folder alone (on D:) was 28GB prior to the dedup.

My disk after dedup:

image

Running Get-DedupVolume shows the results of dedup on my D: drive, pretty awesome!

image

Configuring DeDup

You will need the following files before you can proceed. I’ll leave it up to you to track down the files or grab them from a Server 2012 installation.

image

From within the directory where the files are located, run the following two commands from an elevated PowerShell prompt.

dism /online /add-package /packagepath:Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab /packagepath:Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab /packagepath:Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab /packagepath:Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab /packagepath:Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab /packagepath:Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab

dism /online /enable-feature /featurename:Dedup-Core /all

image

Once you run those commands, Programs and Features will list the File Server role and Data Deduplication as enabled.

image
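If you would rather confirm the result from the command line than from the dialog, something like the following should work. It is only a sketch: it assumes the same Dedup-Core feature name used in the dism command above, and that the dedup cmdlets ship in a module named Deduplication.

```powershell
# Query the state of the Dedup-Core feature (run from an elevated prompt).
dism /online /get-featureinfo /featurename:Dedup-Core

# List the dedup cmdlets; the module name "Deduplication" is an assumption.
Get-Command -Module Deduplication
```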

Analyzing Drive Space

Once the feature has been added, you can use ddpeval.exe to analyze a folder or an entire drive and see how much space the feature would save you.

For example, I looked at my Hyper-V folder to see what it would save me. The folder is about 28GB right now, with not much in there, just a couple of VMs. With dedup enabled it would only take up 14.5GB, saving nearly 50%!
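As a rough sketch of what that evaluation looks like on the command line: the folder path D:\Hyper-V is an assumption based on the folder described above, so substitute your own.

```powershell
# Estimate dedup savings for a single folder.
# D:\Hyper-V is an assumed path; use the folder you want to evaluate.
ddpeval.exe D:\Hyper-V

# Estimate dedup savings for the whole volume.
ddpeval.exe D:\
```

The tool only reads and reports; it does not change anything on disk, so it is a safe first step before enabling dedup on a volume.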

image

Analyzing my entire D: drive gives the following results.

image

Commands to configure DeDuplication

Additional details can be found here on TechNet

The default policy settings for Server 2012 are as follows:

  • Process files that have a minimum age of five days according to the Last Modified Time. If Last Access Time is enabled on the server (this is not the default setting), deduplication will use the Last Access Time.
  • Process files in background mode every hour. In background mode, the system uses up to 25% of the system memory during optimization jobs, whereas manual Throughput jobs use up to 50% of the system memory.
  • Do not exclude any directories or file types. The default setting is to process the entire volume.
  • Run a garbage collection job every Saturday at 1:45 AM. Garbage collection reclaims space on a volume by deleting chunks from the chunk store that are no longer referenced. Garbage collection compacts a container only if approximately 50 MB of chunks exist that have no references. Every fourth run of garbage collection incorporates the -full parameter, which instructs the job to reclaim all available space and maximize all container compaction.
  • Run a data scrubbing job every Saturday at 2:45 AM. Scrubbing jobs verify data integrity and automatically attempt to repair corruptions that are found.

The default settings configure three schedules: Optimization runs every hour, while Garbage Collection and Scrubbing each run once a week. These can be viewed from PowerShell using Get-DedupSchedule.
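A minimal way to inspect those schedules in full, rather than just the default table view, is to expand every property. This is a sketch; the exact properties returned may differ between builds.

```powershell
# Show every property of each dedup schedule (optimization, garbage collection, scrubbing).
Get-DedupSchedule | Format-List *
```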

The following commands are all done from within a Windows PowerShell prompt.

Enable on a volume:

Enable-DedupVolume D:

Configure number of days before dedup is done:

Set-DedupVolume D: -MinimumFileAgeDays 30

Return a list of volumes that have been enabled for dedup:

Get-DedupVolume

To start an optimization job ahead of the default schedule:

Start-DedupJob -Volume D: -Type Optimization

To view the progress of an optimization job:

Get-DedupJob

To get started, you mainly need to enable dedup on a volume, then either start an optimization job or wait for the scheduled one. Progress can then be viewed using the Get-DedupJob command.
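Strung together, that workflow might look like the sketch below. The drive letter, the 30-second polling interval, and the assumption that Get-DedupJob returns nothing once the job has finished are mine, so treat this as a starting point rather than a tested script.

```powershell
$volume = 'D:'   # assumed drive letter; change to the volume you want to dedup

# Enable dedup on the volume, then kick off an optimization job right away.
Enable-DedupVolume -Volume $volume
Start-DedupJob -Volume $volume -Type Optimization

# Poll while a job is still listed for the volume (assumes no output once it completes).
while (Get-DedupJob -Volume $volume -ErrorAction SilentlyContinue) {
    Get-DedupJob -Volume $volume | Format-Table Type, State, Progress -AutoSize
    Start-Sleep -Seconds 30   # arbitrary polling interval
}

# Show the resulting savings for the volume.
Get-DedupVolume -Volume $volume | Format-List Volume, SavedSpace, SavingsRate
```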

Additional Considerations

  • The dedup process can handle roughly 100 GB per hour, or about 2TB per 24-hour period. However, CPU, disk, and I/O load can obviously affect that.
  • Running out of free space can be a bad thing when you are using dedup.  Keep an eye on your free space at all times.
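For the free-space point above, a quick check along these lines can be run by hand or from a scheduled task. It is a sketch that assumes the dedup volume is D: and uses an arbitrary 10% warning threshold.

```powershell
# Get-Volume comes from the Storage module; D is the assumed dedup drive letter.
$vol = Get-Volume -DriveLetter D

# Work out free space in GB and as a percentage of the volume size.
$freeGB  = [math]::Round($vol.SizeRemaining / 1GB, 1)
$freePct = [math]::Round(100 * $vol.SizeRemaining / $vol.Size, 1)
"{0}: {1} GB free ({2}%)" -f $vol.DriveLetter, $freeGB, $freePct

# 10% is an arbitrary threshold chosen for illustration.
if ($freePct -lt 10) {
    Write-Warning "Free space on the dedup volume is getting low."
}
```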

