File duplicates finder

February 4, 2008

A month ago I built a small application, along with a set of Microsoft SQL Server Reporting Services reports, that inventories the files under a specific path on a file system or in SharePoint and stores each file’s hash and location in a table. With this you can identify duplicates between a file server and SharePoint sites, as well as get storage statistics.
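To give an idea of the approach, here is a minimal sketch in Python (not the project's actual .NET code). It assumes SHA-256 as the hash, which may not be what the tool uses, and it keeps the hash-to-location mapping in an in-memory dictionary rather than a database table. The names `hash_file` and `find_duplicates` are made up for this illustration.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each content hash to the sorted paths sharing it (groups of 2+ only)."""
    locations = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            locations[hash_file(path)].append(path)
    return {h: sorted(paths) for h, paths in locations.items() if len(paths) > 1}
```

Calling `find_duplicates` on a folder returns only the hashes that appear more than once, each with the list of files that share that content.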

I’ll be publishing the solution on CodePlex at the following URL:

 http://www.codeplex.com/fdf

 The project was developed in .NET 3.5 using technologies such as Windows Communication Foundation and Parallel FX for multi-threaded file hashing.
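The multi-threaded hashing idea can be sketched roughly as follows. This is a Python illustration using a thread pool, not Parallel FX, and the function names, the SHA-256 choice and the worker count are all assumptions made for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_files_parallel(paths, max_workers=4):
    """Hash several files concurrently and return a {path: digest} dictionary."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input paths.
        return dict(zip(paths, pool.map(hash_file, paths)))
```

Whether threads actually speed things up depends on the storage: on a single slow disk the scan may stay I/O bound no matter how many workers are used.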

Please note that this is an early version and has not been fully tested. I ran both SharePoint and file system scans on about 130,000 files without apparent issues.

 Enjoy!