SHA-1 Hash of an Entire Folder Structure

Can anyone recommend a program / utility that I can point to a folder and it generates a SHA-1 hash of all contents?

I'm familiar with HashCalc, which works very well with single files. A fallback option is to zip the aforementioned folder and then hash the archive, but I'm hoping not to have to do that.

Kind regards, Dave

June 7th, 2010 5:13pm

How large is the folder's content?
June 8th, 2010 6:56am

Hi,

It's only about 4MB... but it consists of about 100 files in about twenty folders.

Regards, Dave

June 8th, 2010 7:04am

Would a PowerShell script be fine for you?
June 8th, 2010 7:12am

Hi,

I have written a small console program that computes the hash of the content of a directory. It explores the directory recursively and sorts its content by name in lexicographical order before performing the hash computation.
I have implemented SHA-1, SHA-256, SHA-384 and SHA-512.

It can be called as follows: DirHash.exe Path [Algorithm]. The second parameter is optional: by default, SHA-1 is used, but you can specify another hash algorithm by setting the second parameter to SHA256, SHA384 or SHA512.

You can get the binary using the following link : http://www.idrix.fr/Root/Samples/DirHash.zip

I hope this will help.
Cheers
--
Mounir IDRASSI
IDRIX
http://www.idrix.fr

  • Marked as answer by Chip Eater Wednesday, June 09, 2010 7:49 PM
June 8th, 2010 2:56pm

Vadims,

Yes, PowerShell would be an excellent solution.

Regards, Dave

June 8th, 2010 5:48pm

Mounir,

Many thanks - I've had a very quick look at your utility and it seems very simple (that's a good thing) in execution.  I'll do a little further testing but I think it does exactly what I want.

Kind regards, Dave

June 8th, 2010 5:50pm

How about fciv.exe (http://support.microsoft.com/kb/841290)?
June 9th, 2010 3:08pm

Hello Chip!

Sorry for my late response. Here is a PowerShell example:

function Get-FolderHash ($folder) {
 dir $folder -Recurse | ?{!$_.psiscontainer} | %{[Byte[]]$contents += [System.IO.File]::ReadAllBytes($_.fullname)}
 $hasher = [System.Security.Cryptography.SHA1]::Create()
 [string]::Join("",$($hasher.ComputeHash($contents) | %{"{0:x2}" -f $_}))
}

Copy and paste this code into the PowerShell console and type:

Get-FolderHash "C:\CustomFolder"

where C:\CustomFolder is the folder whose content (including subfolders) is hashed.

Martin: as far as I remember, FCIV cannot create a single hash for multiple files. FCIV just creates a separate hash for each file and writes it to XML.

June 9th, 2010 3:24pm

Many thanks Vadims.

I just tested the script (on Windows 7 x86) and whilst it seemed to run OK there was no output generated... I'm not sure what I should be expecting (as in does it output the hash to the console)?

Forgive me my ignorance regarding PowerShell!  I tried running this as you said by copying and pasting into the PowerShell prompt, I also tried by saving the text to a file called Get-FolderHash.ps1 and executing this followed by the folder name.  I really do apologise for being a PowerShell novice... what is the easiest way to run this as a "batch file"?

Kind regards, Dave

June 9th, 2010 8:08pm

Open a PowerShell console and type:

New-Item $Profile -ItemType File -force
notepad $profile

In the Notepad window that opens, paste the function and save the document. Then re-open the PowerShell console. Each time you want to calculate the hash for a folder, you only need to type the following command:

Get-FolderHash "C:\CustomFolder"

where C:\CustomFolder is the folder whose content (including subfolders) is hashed.

When you saved the code into a PS1 file and executed it, you only loaded the function definition. To get any output, you also need to call the function.

June 10th, 2010 11:23am

Vadims, I've done as instructed, but get the following results...
########################################################################
PS C:\> New-Item $Profile -ItemType File -force

    Directory: C:\Users\me\Documents\WindowsPowerShell

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        10/06/2010     17:59          0 Microsoft.PowerShell_profile.ps1

PS C:\> notepad $profile
>>>>>>>> I then pasted in the code and saved it to the default location <<<<<<<<<<<<<

PS C:\> get-folderhash "C:\Data"
The term 'get-folderhash' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:15
+ get-folderhash <<<<  "C:\Data"
    + CategoryInfo          : ObjectNotFound: (get-folderhash:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
##############################################################################
I do apologise for being very weak on PowerShell, but this all seems very messy; is there really no easy way to just run a "script" and pass the folder as an "argument"?
Kind regards, Dave
June 10th, 2010 5:11pm

If you have saved the code above into the Microsoft.PowerShell_profile.ps1 file, you need to close and reopen the PowerShell console, because the profile is only loaded when a console starts.
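
For the "run a script and pass the folder as an argument" case asked about above, a minimal sketch follows. It assumes the file is saved as Get-FolderHash.ps1 (the name used earlier in the thread) and that calling the function at the end of the file is acceptable. The sort by full path and the use of $args are illustrative choices, not part of the function posted earlier, so the hash can differ from that function's output if the enumeration order differs.

```powershell
# Get-FolderHash.ps1
function Get-FolderHash ($folder) {
    # @() keeps the result an array even when the folder holds zero or one file
    $files = @(Get-ChildItem $folder -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        Sort-Object FullName)

    # Concatenate the content of every file, in sorted order
    $bytes = New-Object System.Collections.Generic.List[byte]
    foreach ($file in $files) {
        $bytes.AddRange([System.IO.File]::ReadAllBytes($file.FullName))
    }

    # Hash the concatenation once and print it as lowercase hex
    $hasher = [System.Security.Cryptography.SHA1]::Create()
    ($hasher.ComputeHash($bytes.ToArray()) | ForEach-Object { "{0:x2}" -f $_ }) -join ""
}

# A file that only defines the function prints nothing when executed.
# This call produces the output when a folder is given on the command line.
if ($args.Count -gt 0) {
    Get-FolderHash $args[0]
}
```

From a batch file or a cmd prompt this could then be run as: powershell -ExecutionPolicy Bypass -File .\Get-FolderHash.ps1 "C:\Data" (the -ExecutionPolicy switch is only needed if the machine's policy blocks script files).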

June 10th, 2010 8:17pm

Got it... finally!

Thanks for your patience, Dave

June 11th, 2010 7:16am

Hi Mounir IDRASSI,

Thanks for this tool. This is exactly what I was looking for.

I visited your web site and found that you provide source code for some tools as well.

Could you please provide source for this tool as well? I would like to integrate this approach in one of my projects.

Thanks,

Sourabh Bhandari

December 22nd, 2010 10:58am

Hi,

I have put the source file at the following link : http://www.idrix.fr/Root/Samples/DirHash.cpp .
It uses OpenSSL for the hashing algorithms (especially SHA-256, SHA-384 and SHA-512) in order to support all versions of Windows (from Windows 2000 onwards) without relying on the presence of a specific CSP.

I have also put the full VC 2008 project that includes OpenSSL headers and its static library. You can get it here :
http://www.idrix.fr/Root/Samples/DirHash_src.zip

I hope this will help.
Cheers,
--
Mounir IDRASSI
IDRIX
http://www.idrix.fr

December 22nd, 2010 12:20pm

Hi  Mounir,

Many thanks for the source code.

This is exactly what I was looking for.

Best regards,

-Sourabh Bhandari

December 23rd, 2010 6:12am

And now for a version which is 600 times faster (on my machine).  

On an 8MB folder:

Old version  ~20 seconds

New Version: ~40 ms

Function Get-FolderHash
{
    param ($folder)
    
    Write-Verbose "Calculating hash of $folder"    # shown only when $VerbosePreference is 'Continue'
    $files = dir $folder -Recurse |? { -not $_.psiscontainer }
    
    $allBytes = new-object System.Collections.Generic.List[byte]    # grows in place, unlike an array extended with +=
    foreach ($file in $files)
    {
        $allBytes.AddRange([System.IO.File]::ReadAllBytes($file.FullName))    # file content
        $allBytes.AddRange([System.Text.Encoding]::UTF8.GetBytes($file.Name))    # file name, so a rename changes the hash
    }
    $hasher = [System.Security.Cryptography.MD5]::Create()    # MD5 here, not SHA-1 as in the earlier function
    $ret = [string]::Join("",$($hasher.ComputeHash($allBytes.ToArray()) | %{"{0:x2}" -f $_}))
    Write-Verbose "hash of $folder is $ret."
    return $ret
}
January 2nd, 2015 8:05pm

Is there a version available for folders larger than 2 GB?

Thx

July 14th, 2015 6:42pm

Is there any particular problem when the folder size exceeds 2GB?
Is your question about performance?

Anyway, with my tool DirHash there is no issue with big folders: on a Core i7 2600K, it takes 12 seconds to hash 2.2 GB using SHA-1.

July 14th, 2015 7:11pm

Sorry, I only have the error message in German. Translated:

Exception calling "AddRange" with "1" argument(s): "Destination array was not long enough. Check destIndex and length, and the array's lower bounds."
At line:10 char:9
+         $allBytes.AddRange([System.IO.File]::ReadAllBytes($file.FullName))
+         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : ArgumentException

I printed $allBytes.Count, and the error appears once it exceeds about 2107754211 bytes.

I would like to use DirHash, but it has two small disadvantages for me:
- 0-byte files are ignored in the hash
- renaming a file (with unchanged content and unchanged sort order) does not change the hash

July 14th, 2015 7:37pm

The shell script posted above is not practical because it reads all the content of the files into memory before doing the hash. This can't work in real life.
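
If the hashing has to stay in PowerShell, one way around the memory problem is to feed the hasher in chunks instead of building one large array. The following is a minimal sketch under stated assumptions, not how DirHash itself is implemented: the function name Get-FolderHashStreaming and the 64 KB buffer size are made up for illustration, files are ordered with Sort-Object on the full path (a culture-sensitive sort, so results may not match across machines with different settings), and each file's name is mixed in as in the 2015 script so that renames and empty files affect the result.

```powershell
function Get-FolderHashStreaming ($folder) {
    # @() keeps the result an array even when the folder holds zero or one file
    $files = @(Get-ChildItem $folder -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        Sort-Object FullName)

    $hasher = [System.Security.Cryptography.SHA1]::Create()
    $buffer = New-Object byte[] 65536   # assumed chunk size

    foreach ($file in $files) {
        # Feed the file name first, so renames and empty files change the hash
        $nameBytes = [System.Text.Encoding]::UTF8.GetBytes($file.Name)
        [void]$hasher.TransformBlock($nameBytes, 0, $nameBytes.Length, $nameBytes, 0)

        # Feed the content chunk by chunk; only one buffer is held in memory
        $stream = [System.IO.File]::OpenRead($file.FullName)
        try {
            while (($read = $stream.Read($buffer, 0, $buffer.Length)) -gt 0) {
                [void]$hasher.TransformBlock($buffer, 0, $read, $buffer, 0)
            }
        }
        finally {
            $stream.Close()
        }
    }

    # An empty final block completes the computation
    [void]$hasher.TransformFinalBlock($buffer, 0, 0)
    ($hasher.Hash | ForEach-Object { "{0:x2}" -f $_ }) -join ""
}
```

Because the hasher only ever sees one chunk at a time, the total folder size is no longer bounded by the maximum length of a single .NET array.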

I have modified my tool DirHash by adding a new switch (-hashnames) that will activate the use of names of files and directories in the hash computation. This way, even empty files will be included.

I have also included MD5 support for faster processing. It is not as secure as its SHA counterparts, but it does the job in most cases.

You can download the binary from the same URL. I have also set up a GitHub page where the binary and sources can be downloaded; it also contains a short explanation of the tool's usage: https://idrassi.github.io/DirHash/

Voila voila... I hope this will help.

July 15th, 2015 1:41am

Thx, works perfectly!

------------------------------

Thanks again Mounir for the extension!

Would it be possible to add one more feature?  ;-)
I would like to exclude some files (e.g. *.log) from hashing, for example with a
switch like "-exclude=*.log".

Thx

July 15th, 2015 5:08pm

Starting with PowerShell 4, this tool is no longer necessary for per-file hashes; the built-in Get-FileHash cmdlet is recommended instead. For example, to hash the files in the current folder only:

dir | Get-FileHash

current folder and subfolders:

dir -recurse | Get-FileHash

exclude *.log files:

dir -recurse -exclude *.log | Get-FileHash

Note that the default hashing algorithm is SHA256. You can use any of MD5, SHA1, SHA256 (default), SHA384, SHA512, MACTripleDES or RIPEMD160:

dir -recurse -exclude *.log | Get-FileHash -Algorithm SHA512

more details:

Get-Help Get-FileHash

July 16th, 2015 1:30am

You are right, but it's an older Windows 2003 Server.

Can Get-FileHash also compute one "summary" hash for the complete folder?

July 16th, 2015 8:15am

> Can Get-FileHash also compute one "summary" hash for the complete folder?

No, and it doesn't make sense, because such a hash:

1) depends on the sort order of the files

2) does not tell you which file was changed.

In other words, it is not trustworthy information.

July 16th, 2015 8:32am

>>
>> In other words, it is not a trustworthy information.
>>

I think it depends. In my case I automatically download a database with many
pictures (an export from our service provider) every night, and I only want to
import it into my database if any file has changed. Only a full import is possible.

DirHash computes the "summary" hash with the same sort order every time. It's
easier for me to compare a single hash value.
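
On PowerShell 4 or later, a single summary value can also be derived from the per-file output of Get-FileHash by hashing a sorted manifest of "hash, relative path" lines. This is only a sketch under stated assumptions: the function name Get-FolderSummaryHash and the manifest format are made up for illustration, the order comes from Sort-Object on the full path (culture-sensitive, so compare values produced on the same machine), and the result is not compatible with DirHash's output.

```powershell
function Get-FolderSummaryHash ($folder) {
    $root = (Resolve-Path -LiteralPath $folder).ProviderPath

    # One manifest line per file: "<SHA1 of content>  <path relative to the root>"
    $lines = @(Get-ChildItem -LiteralPath $root -Recurse -File |
        Sort-Object FullName |
        ForEach-Object {
            $relative = $_.FullName.Substring($root.Length).TrimStart('\', '/')
            $hash = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA1).Hash
            "$hash  $relative"
        })

    # Hash the manifest text itself to obtain a single value for the folder
    $manifest = [System.Text.Encoding]::UTF8.GetBytes(($lines -join "`n"))
    $hasher = [System.Security.Cryptography.SHA1]::Create()
    ($hasher.ComputeHash($manifest) | ForEach-Object { "{0:x2}" -f $_ }) -join ""
}
```

Since the manifest lists every file with its own hash, saving the lines from one night to the next would also show which file changed, at the cost of keeping a file around in addition to the single value.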

July 16th, 2015 8:45am

Apparently this is not a very common scenario, so you may need to use DirHash or another third-party tool.
July 16th, 2015 9:02am

I have implemented the "-exclude" switch proposed by msfan63. The syntax is "-exclude pattern" where pattern is in the form *.extensionName or *suffix. For example, use "-exclude *.log" to exclude .log files.

This switch also works for directories. For example, to exclude the directory "temp" and all its content from the computation, you can specify "-exclude *temp".

There can be several "-exclude" switches in the command line in order to exclude several name patterns.

You can get the new binary at the same URL.

I have also made a 64-bit build of DirHash available at https://idrassi.github.io/DirHash/. It boosts performance significantly on 64-bit Windows machines.

Last remark: directory fingerprinting is very useful in many practical cases, and I don't see why one would calculate the hash of a directory if there were no guarantee that the value is uniquely tied to the content.
That's why DirHash implements a recursive lexicographical ordering of files and directories, in order to ensure a unique representation of the directory content.

July 18th, 2015 8:46pm

This topic is archived. No further replies will be accepted.
