Compare directories

I would like to compare two directories that are not identical, but should contain the same files in a different folder structure.

I use RoboCopy with a CSV file to copy files into another folder structure.

Most of the files end up in a different folder, so simply comparing the two folder structures is not possible.


Is there a way to check whether all files in the source location can be found somewhere in the folder structure of the destination location?

 

February 26th, 2015 6:29am

I think the command you are looking for is:

Get-ChildItem -Path C:\folder\ -recurse

This will give you a flat listing of files throughout the entire folder structure. You could then compare that with the source directory or the CSV file you fed into RoboCopy. PowerShell has great native support for reading/writing CSV format.
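For example, here is a rough sketch of dumping that flat listing to a CSV and reading it back in for comparison (the paths below are just placeholders, and -File needs PowerShell 3.0 or later):

# Flat file listing of the destination structure, exported to CSV
Get-ChildItem -Path C:\folder\ -Recurse -File |
    Select-Object Name, FullName |
    Export-Csv -Path C:\temp\destination-files.csv -NoTypeInformation

# Read it (or the CSV you fed into RoboCopy) back in as objects
$destFiles = Import-Csv -Path C:\temp\destination-files.csv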

Check out my blog here for an intro to working with CSV files -> http://blog.sharepoint-voodoo.net/?p=170

-Corey

February 26th, 2015 6:50am

Hi,

This will do what you want, maybe not the most efficient way for very large file and folder structures though...

$path1 = "C:\Dir\1"

$path2 = "C:\Dir\2"

$Files1 = Get-ChildItem -path $path1 -Recurse -File

$Files2 = Get-ChildItem -path $path2 -Recurse -File

foreach ($file1 in $files1) {

foreach ($file2 in $files2) {  

If ($file1.Name -eq $file2.Name) {

   break

} else {

Write-Host $file1.DirectoryName "\" $file1.Name " not found in" $path2

}

}

}



February 26th, 2015 6:57am

I would probably do something slightly different.

$path1 = "C:\Dir\1"

$path2 = "C:\Dir\2"

$Files1 = Get-ChildItem -path $path1 -Recurse -File

$Files2 = Get-ChildItem -path $path2 -Recurse -File

Compare-Object $files1 $Files2

You will get an output like:


InputObject     SideIndicator
-----------     -------------
Filename        => or <=

The => or <= indicates which side the file is unique to: the reference set (<=) or the difference set (=>). If you would like to see what they have in common too, you can add the -IncludeEqual flag, which produces (==) for shared files.

You can make this into one command too:

Compare-Object (Get-ChildItem -Path "C:\Dir\1" -Recurse -File) (Get-ChildItem -Path "C:\Dir\2" -Recurse -File)
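For example, to key the comparison on just the file name and also list the matches, something along these lines should work (reusing the variables from above):

Compare-Object $Files1 $Files2 -Property Name -IncludeEqual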



February 26th, 2015 7:11am

Just check the RoboCopy log. If it shows no files skipped, then you have already compared the folders. That is why RoboCopy has such a detailed log summary.

Checking with a script is like buying a car and then walking to work just to be sure you get there.

February 26th, 2015 11:13am


The problem is that I use a CSV file which defines the source and destination locations for all folders and files.

If one or more folders are forgotten in the CSV file, it will not copy all the necessary content.

The RoboCopy log file will show that everything has been copied, but I am still missing content.

So I need to verify that all files from the source location exist somewhere in the folder structure of the destination location.

I tried to do it with the suggested command: Get-ChildItem -Path C:\folder\ -Recurse

But it doesn't seem to be able to deal with the 260-character path limitation. Is there a workaround for this?


March 4th, 2015 8:50am

What you are claiming is technically impossible, or you have set the RoboCopy options incorrectly. What you are trying to do cannot be done with the PowerShell cmdlets because of their technical restrictions. Please take the time to set the RoboCopy options and test the outcome.

Mistakes with extremely long pathnames are very common because we have so few tools to work with files that have long names.

The workarounds for this limitation are: use RoboCopy, write a custom program in a compiled language, or map a drive to an intermediate point in the path.

March 4th, 2015 11:20am

I understand, but I am trying to build some checks to easily verify whether a copy job has done its work successfully.

For now I am only testing the script with one folder, including subfolders and files, but there is much more to come.

In total, many TBs and millions of files.

Currently I export the source folder structure to a CSV file and then manually check, for each folder, which destination folder it should be copied to.

The outcome is a CSV file with many source and destination folders.

So many files end up in a different destination folder.

A mistake is very easily made, so I need to build in some checks besides the RoboCopy log file, so I can verify that all files were included during the RoboCopy execution.


One of the most important checks should be that all files from the source location exist somewhere in the destination location (just in a different folder).

When I check the properties of a folder in Windows Explorer, it counts the total size and the number of files.
In this case there is a mismatch in the number of files, and I would like to easily find which files were not copied.

March 4th, 2015 1:03pm

You cannot manage long paths with the PowerShell cmdlets. They cannot do this. You can map a drive to part of the path to shorten it.

New-PSDrive -Name short1 -PSProvider FileSystem -Root 'C:\folder\my long path name'

Get-Help New-PSDrive -Full
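Once the drive exists, you can enumerate through the shorter drive name instead of the full path, for example:

Get-ChildItem -Path short1:\ -Recurse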

March 4th, 2015 1:24pm

Thank you for your comments; this might be useful for issues with long file paths.

But how about comparing the source and destination locations to verify that all files are copied?

March 4th, 2015 1:40pm


The question I have is this: are all of the filenames unique? If not, and you have some identically named files in different folders in the source location (and then probably also in the destination location), the problem is compounded. Your script would then need to compare individual files in at least all those cases where the filenames are not unique. No matter how efficiently this is done, the impact when looking at millions of files occupying TBs of storage is going to be significant.
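To get a feel for how big that duplicate-name problem actually is, a quick sketch along these lines might help (G:\source here is just a placeholder path):

# List source file names that occur more than once; only these need a deeper check (size or hash)
Get-ChildItem -Path G:\source -Recurse -File |
    Group-Object -Property Name |
    Where-Object { $_.Count -gt 1 } |
    Select-Object Name, Count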
March 4th, 2015 1:40pm

That question was answered above a long time ago.  If it doesn't work then you are out of luck.

March 4th, 2015 1:42pm

For a few files (<1000) I would try an MD5 check for both directories and group the results... if the count is less than 2, then the file isn't copied. But for TBs and millions of files... well...
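A rough sketch of that idea with Get-FileHash (PowerShell 4.0 or later; the two paths are placeholders):

# Hash every file in both trees, group by hash, and report files whose hash appears only once
$all = Get-ChildItem -Path 'C:\Dir\1', 'C:\Dir\2' -Recurse -File | Get-FileHash -Algorithm MD5
$all | Group-Object -Property Hash |
    Where-Object { $_.Count -lt 2 } |
    ForEach-Object { $_.Group.Path }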
March 4th, 2015 2:58pm

You are absolutely right when comparing all files across all those TBs at once; most likely there will be duplicate filenames among all these files.

But I want to split it up. Looking at the root of the folder, there are about 2000 directories which I want to compare separately; this reduces the chance of duplicate filenames. It is not impossible, but remember that I would just like an indication that all files are copied.

Now I use the following lines to get a list of all files on the source and destination locations:

Get-ChildItem G:\source -Recurse -Attributes !Directory+!System | Select-Object Name | Export-Csv c:\source-files.csv
Get-ChildItem H:\destination -Recurse -Attributes !Directory+!System | Select-Object Name | Export-Csv c:\destination-files.csv

This export gives only file names as a result. Now I would like to compare the exported source and destination CSVs to find the differences. How could I do that?

March 5th, 2015 3:01am

Just compare them:

$f1 = Get-ChildItem G:\source -Recurse -File
$f2 = Get-ChildItem H:\destination -Recurse -File

Compare-Object $f1 $f2 -Property Name -PassThru
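If you would rather compare the two CSV files you exported earlier, the same approach should work on the imported rows (assuming they were created with Select-Object Name as shown above):

$src = Import-Csv C:\source-files.csv
$dst = Import-Csv C:\destination-files.csv
Compare-Object $src $dst -Property Name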

March 5th, 2015 4:57am

This topic is archived. No further replies will be accepted.
