Windows Server Deduplication Job Execution

While working with deduplication on volumes of around 30TB, we noticed the various job types were not executing as we expected. As a Microsoft MVP, I’m very fortunate to have direct access to the people with deep knowledge of the technology. A lot of the credit for this post goes to Will Gries and Ran Kalach from Microsoft, who were kind enough to answer my questions about what was going on under the hood. Here’s a summary of the things I learned in the process.

Before we dive any further, it’s important to understand the various deduplication job types, as they have different resource requirements. (A quick Start-DedupJob sketch for triggering each job type on demand follows the list.)

Job Types (source)

  • Optimization
    • This job performs both deduplication and compression of files according to the data deduplication policy for the volume. After initial optimization of a file, if that file is then modified and again meets the data deduplication policy threshold for optimization, the file will be optimized again.
  • Scrubbing
    • This job processes data corruptions found during data integrity validation, performs possible corruption repair, and generates a scrubbing report.
  • GarbageCollection
    • This job processes previously deleted or logically overwritten optimized content to create usable volume free space. When an optimized file is deleted or overwritten by new data, the old data in the chunk store is not deleted right away. By default, garbage collection is scheduled to run weekly. We recommend running garbage collection only after large deletions have occurred.
  • Unoptimization
    • This job undoes deduplication on all of the optimized files on the volume. At the end of a successful unoptimization job, all of the data deduplication metadata is deleted from the volume.
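
To make this more concrete, all four job types can be kicked off on demand with the Start-DedupJob cmdlet. Here is a minimal sketch; the volume path is just a placeholder for one of your own deduplicated volumes:

# Placeholder volume; replace with one of your own deduplicated volumes.
$vol = "C:\ClusterStorage\POOL-003-DAT-001"

# Optimization: deduplicate and compress in-policy files.
Start-DedupJob -Volume $vol -Type Optimization

# GarbageCollection: reclaim space left by deleted or overwritten chunks.
Start-DedupJob -Volume $vol -Type GarbageCollection

# Scrubbing: validate the chunk store and attempt repairs.
Start-DedupJob -Volume $vol -Type Scrubbing

# Unoptimization: rehydrate everything (make sure the volume has enough free space first).
Start-DedupJob -Volume $vol -Type Unoptimization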

Another operation that happens in the background and that you need to be aware of is the reconciliation process. It kicks in when the hash index doesn’t fit entirely in memory. I don’t have the details yet as to exactly what it does, but I suspect it restores index coherency across the multiple index partitions that were processed in memory one after another during the optimization/deduplication process.

Server Memory Sizing vs Job Memory Requirements

To understand the memory requirements of the deduplication jobs running on your system, I recommend you have a look at event ID 10240 in the Data Deduplication/Diagnostic Windows event log. Here is what it looks like for an Optimization job:

Optimization job memory requirements.

Volume C:\ClusterStorage\POOL-003-DAT-001 (\\?\Volume{<volume GUID>})
Minimum memory: 6738MB
Maximum memory: 112064MB
Minimum disk: 1024MB
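
If you prefer PowerShell over clicking through Event Viewer, you can pull the same 10240 events with Get-WinEvent. A small sketch; I’m assuming the Data Deduplication/Diagnostic log maps to the Microsoft-Windows-Deduplication/Diagnostic channel, so adjust the name if it differs on your system:

# Grab the most recent "job memory requirements" events (ID 10240).
Get-WinEvent -FilterHashtable @{
    LogName = 'Microsoft-Windows-Deduplication/Diagnostic'
    Id      = 10240
} -MaxEvents 10 |
    Select-Object TimeCreated, Message |
    Format-List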

Here are a few key things to consider about the job memory requirement and the host RAM sizing:

  • Memory requirements scale almost linearly with the total size of the data to dedup
    • The more data to dedup, the more entries in the hash index to keep track of
    • You need to meet at least the minimum memory requirement for the job to run for a volume
  • The more memory on the host running deduplication, the better the performance, because:
    • You can run more jobs in parallel
    • The job will run faster because
      • You can find more of the hash index in memory
        • The more of the index you fit in memory, the less reconciliation work will have to be performed
        • If the job fits completely in memory, the reconciliation process is not required
  • If you use throughput scheduling (which is usually recommended)
    • The deduplication engine will allocate by default 50% of the host’s memory but this is configurable
    • If you have multiple volumes to optimize, it will try to run them all in parallel
      • It will try to allocate as much memory as possible for each job to accelerate them
      • If not enough memory is available, other optimization jobs will be queued
  • If you start optimization jobs manually
    • The job is not aware of other jobs that might get started, so it will try to allocate as much memory as possible for itself, potentially leaving future jobs on hold because not enough memory is available to run them (see the sketch after this list for capping a manual job’s memory)
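
If you do start jobs manually and want to keep one job from grabbing everything, Start-DedupJob exposes a -Memory parameter that caps the job at a percentage of physical RAM. A minimal sketch, with a placeholder volume path:

# Cap a manually started optimization job at roughly 25% of physical RAM
# so that jobs for other volumes can still get the memory they need
# (the throughput scheduler would otherwise default to 50% overall).
Start-DedupJob -Volume "C:\ClusterStorage\POOL-003-DAT-002" -Type Optimization -Memory 25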

Job Parallelism

I’ve already touched on memory sizing in the previous section, but here’s a recap with some additional information (a short sketch for observing running vs. queued jobs follows the list):

  • You can run multiple jobs in parallel
  • The dedup throughput scheduling engine can manage the complexity around the memory allocation for each of the volumes for you
  • You need to have enough memory to at least meet the minimum memory requirement of each volume that you want to run in parallel
    • If all the memory has been allocated and you try to start a new job, it will be queued until resources become available
    • The deduplication engine tries to stick to the memory quota determined when the job was started
  • Each job is currently single-threaded in Windows Server 2012 R2
    • Windows Server 2016 (currently in TP4) supports multiple threads per job, meaning multiple threads/cores can process a single volume
      • This greatly improves the throughput of optimization jobs
  • If you have multiple volumes residing on the same physical disks, it would be best to run only one job at a time for those specific disks to minimize disk thrashing
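
To see the queuing behavior for yourself, you can start optimization on a few volumes back to back and then check which jobs actually received resources. A rough sketch, with placeholder volume paths:

# Placeholder volumes; replace with your own deduplicated volumes.
$volumes = "C:\ClusterStorage\POOL-003-DAT-001",
           "C:\ClusterStorage\POOL-003-DAT-002",
           "C:\ClusterStorage\POOL-003-DAT-004"

# Start optimization on all of them; jobs that cannot get at least their
# minimum memory right away will sit in the queue.
Start-DedupJob -Volume $volumes -Type Optimization

# Running vs. queued jobs and their progress.
Get-DedupJob | Format-Table Volume, Type, State, Progress -AutoSize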

To put things into perspective, let’s look at some real-world data:

Volume             Min RAM (MB)   Max RAM (MB)   Min Disk (MB)   Volume Size (TB)   Unoptimized Data Size (TB)
POOL-003-DAT-001   6,738          112,064        1,024           30                 32.81
POOL-003-DAT-002   7,137          118,900        1,024           30                 35.63
POOL-003-DAT-004   7,273          121,210        1,024           30                 35.28
POOL-003-DAT-006   4,089          67,628         1,024           2                  18.53
  • To run optimization in parallel on all volumes I need at least 25.2GB of RAM
  • To avoid reconciliation while running those jobs in parallel, I would need a whopping 419.8GB of RAM (the arithmetic for both figures is reproduced in the sketch after this list)
    • This might not be too bad if you have multiple nodes in your cluster with each of them running a job
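
The two figures above are simply the sums of the Min RAM and Max RAM columns, rounded using 1 GB = 1000 MB. A quick way to reproduce them:

# Per-volume minimum and maximum memory requirements (MB) from event 10240,
# as listed in the table above.
$minMB = 6738, 7137, 7273, 4089
$maxMB = 112064, 118900, 121210, 67628

"Minimum to run all jobs in parallel: {0:N1} GB" -f (($minMB | Measure-Object -Sum).Sum / 1000)
"Needed to avoid reconciliation:      {0:N1} GB" -f (($maxMB | Measure-Object -Sum).Sum / 1000)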

Monitoring Jobs

To keep an eye on the deduplication jobs, here are the methods I have found so far (a small PowerShell sketch follows the list):

  • Get-DedupJob and Get-DedupStatus will give you the state of the jobs as they are running
  • Perfmon will give you information about the throughput of the jobs currently running
    • Look at the typical physical disk counters
      • Disk Read Bytes/sec
      • Disk Write Bytes/sec
      • Avg. Disk sec/Transfer
    • You can get an idea of the savings ratio by looking at how much data is being read and how much is being written per interval
  • Event Log Data Deduplication/Operational
    • Event ID 6153 will give you the following pieces of information once a job has completed:
      • Job Elapsed Time
      • Job Throughput (MB/second)
  • Windows Resource Monitor (if not on Server Core or Nano)
    • Filter the list of processes on fsdmhost.exe and look at the I/O it’s doing on the files under the Disk tab
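
For the PowerShell side of that list, a small monitoring sketch might look like this. I’m assuming the Data Deduplication/Operational log maps to the Microsoft-Windows-Deduplication/Operational channel; adjust the name if needed:

# Current job state and per-volume savings.
Get-DedupJob | Format-Table Volume, Type, State, Progress -AutoSize
Get-DedupStatus | Format-Table Volume, SavedSpace, OptimizedFilesCount, LastOptimizationTime -AutoSize

# Completed-job summaries (event ID 6153) with elapsed time and throughput.
Get-WinEvent -FilterHashtable @{
    LogName = 'Microsoft-Windows-Deduplication/Operational'
    Id      = 6153
} -MaxEvents 5 |
    Select-Object TimeCreated, Message |
    Format-List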

Important Updates

As of 2016-04-18, I recommend that you have the following updates installed if you are running deduplication on your system (a one-liner to check the resulting dedup.sys version follows the list):

  • November 2014 Update Rollup
  • KB 3094197 (updates dedup.sys to version 6.3.9600.18049)
  • KB 3138865 (updates dedup.sys to version 6.3.9600.18221)
  • If you are running dedup in a cluster, you should install the patches listed in KB 2920151
    • That will simplify your life with Microsoft Support 😉
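
Once the updates are installed, a quick one-liner will confirm which dedup.sys version you ended up with (assuming the default drivers folder):

# Check the installed dedup filter driver version against the KB articles above.
(Get-Item "$env:SystemRoot\System32\drivers\dedup.sys").VersionInfo |
    Select-Object FileVersion, ProductVersion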

Final Thoughts

Deduplication is definitely a feature that can save you quite a bit of money. While it might not fit every workload, it has its uses and benefits. In one particular cluster we use to store DPM backup data, we were able to save more than 27TB (and still counting, as the jobs are still running). Windows Server 2016 will bring much improved performance, and who knows what the future will bring; dedup support for ReFS, perhaps?

I will try to keep this post updated as I find out more information about deduplication operationally.

3 thoughts on “Windows Server Deduplication Job Execution”

  1. Hi, are there any considerations when taking backups of dedup data? For example, dedup running on file shares where the file shares require backup?

    • You have to make sure your backup software operates at the block level (typically with VSS or another filter driver that performs block change tracking) in order to efficiently back up the volume that provides the file share. For example, Microsoft System Center Data Protection Manager supports the backup of deduplicated volumes. If you try to back up the content at the file level, you will pay the penalty of rehydrating the deduplicated files. This will yield lower backup throughput, as you will need to randomly retrieve the required blocks from the chunk store on disk instead of potentially reading the data sequentially.

      I hope this helps!

      Mathieu

  2. Hi Mathieu,

    What would you advise someone receiving the following in the event log when trying to launch a Scrubbing job, which obviously cannot start:

    =======
    Scrubbing job memory requirements.

    Volume: D: (\\?\Volume{SNIP}\)
    Minimum memory: 284632 MB
    Maximum memory: 284632 MB
    Minimum disk: 1024 MB
    =======

    It is “just a bit hard” to have that amount of RAM onboard… 😉

    It was much lower a few days before; we added more RAM (from 8 GB to 32 GB), and then it jumped up to this whopping amount.
    What can be done to reduce this exaggeratedly huge memory requirement?

    Also, here is some other useful info from Get-DedupMetadata:

    Volume : D:
    VolumeId : \\?\Volume{SNIP}\
    StoreId : {SNIP}
    DataChunkCount : 34979517
    DataContainerCount : 2493
    DataChunkAverageSize : 69.34 KB
    DataChunkMedianSize : 0 B
    DataStoreUncompactedFreespace : 0 B
    StreamMapChunkCount : 1520524
    StreamMapContainerCount : 78
    StreamMapAverageDataChunkCount :
    StreamMapMedianDataChunkCount :
    StreamMapMaxDataChunkCount :
    HotspotChunkCount : 5343
    HotspotContainerCount : 1
    HotspotMedianReferenceCount :
    CorruptionLogEntryCount : 34719043
    TotalChunkStoreSize : 2.27 TB

    CorruptionLogEntryCount is growing and there are some corrupted files on D: volume.

    No way to add physical space (more disk drives) to the volume (no more free slots).

    Any hint is more than welcome, thanks a lot in advance.
