How do I find many keys with the same value in a Hash Table?

Hi,

I have 2 Hash Tables and 99% of the time the condition is the same key will appear twice in each table.  The first thing I do is remove the duplicate key in the 2nd table.  Then I need to find out whether the value associated with the key I deleted exists in the 2nd table and, if so, how many times it occurs and what keys are associated with it.  In practice these will be large Hash Tables, so I need to quickly find out how many occurrences of the key exist in the 2nd table and what keys are associated with it.

I try here to remove a key while the value still exists in the 2nd table, but this isn't working.  Also, I'm not correctly displaying the data from the 2nd table when I find it.  I'd appreciate some help.

$EmpNames = $null
$EmpNames = @{}
$EmpNumbers = $null
$EmpNumbers = @{}

$EmpNames = @{John = "John"; Dave = 223344; Justine = "Justine"; Rose = 223344; "Jim" = 223344}
$EmpNumbers = @{"John Doe" = 112233; "Dave Davis" = 223344; "Justine Smith" = 334455; "Rose Jones" = 223344; "Jim Bob" = 223344; "Jim" = 223344; "Dave Davis1" = 223344}

Foreach ($item in $EmpNames.GetEnumerator()) {
    If ($EmpNumbers.ContainsKey($item.Name) -eq $True) {
        $EmpNumbers.Remove($item.Name)
        Do {
            $EmpNumbers.Remove($item.Name)
            Write-Host "$EmpNumbers.($Item.Name) $EmpNumbers.$Item.Value Found"
        } while ($EmpNumbers.ContainsValue($item.Value) -eq $True)
    }
    Else {
        Write-Host "Dup $($item.Value) Not Found"
    }
}

August 28th, 2015 10:02pm

A hash table cannot possibly have duplicate keys.

PROOF:

PS C:\scripts> $x=@{}
PS C:\scripts> $x.Add('K1','44')
PS C:\scripts> $x.Add('K1','44')
Exception calling "Add" with "2" argument(s): "Item has already been added. Key in dictionary: 'K1'  Key being added: 'K1'"
At line:1 char:1
+ $x.Add('K1','44')
+ ~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : ArgumentException

You have two hash tables with no duplicate keys. You are just fooling yourself, and there is no way to figure out what you are doing with this.

August 29th, 2015 12:13am

You are right jrv.

I meant to say 'I have 2 Hash Tables and 99% of the time the condition is the same key will occur once in each table'

August 29th, 2015 12:38am

Yes, a key can appear once in each table.  Why is that a problem?

To compare two object collections, just use "Compare-Object".
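
A rough sketch of what that could look like for the two sample tables in this thread, comparing only their keys. The variable name $common is made up for illustration, and this assumes the keys are plain strings:

```powershell
# Sample data taken from the thread.
$EmpNames = @{ John = 'John'; Dave = 223344; Justine = 'Justine'; Rose = 223344; Jim = 223344 }
$EmpNumbers = @{
    'John Doe' = 112233; 'Dave Davis' = 223344; 'Justine Smith' = 334455
    'Rose Jones' = 223344; 'Jim Bob' = 223344; 'Jim' = 223344; 'Dave Davis1' = 223344
}

# -IncludeEqual together with -ExcludeDifferent keeps only the items found on both sides.
# $common is a hypothetical name, not something from the original posts.
$common = Compare-Object -ReferenceObject @($EmpNames.Keys) -DifferenceObject @($EmpNumbers.Keys) -IncludeEqual -ExcludeDifferent |
    ForEach-Object { $_.InputObject }

$common
```

Dropping both switches should show the keys that exist in only one of the tables instead.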

August 29th, 2015 12:46am

Find matching key:

$EmpNames=@{
	John='John'
	Dave=223344
	Justine='Justine'
	Rose=223344
	Jim=223344
}
$EmpNumbers=@{
	'John Doe'=112233
	'Dave Davis'=223344
	'Justine Smith'=334455
	'Rose Jones'=223344
	'Jim Bob'=223344
	'Jim'=223344
	'Dave Davis1'=223344
}
# Find keys that exist in both hashes
$empnumbers.Keys|?{$empnames.ContainsKey($_)}

This is how to build a couple of hashes and how to find a matching key or keys.

Try not to superstitiously write lines of code that actually do nothing.  Most of your code does nothing.  Look closely at the following:

$EmpNames = $null
$EmpNames = @{}
$EmpNames = @{John = "John"; Dave = 223344; Justine = "Justine"; Rose = 223344; "Jim" = 223344}

You assign the same variable three different times but only the last one is needed.  Why do the other two?

You also tend to double quote anything that doesn't move.  Why?

August 29th, 2015 1:03am

The same key in two different tables is not a problem, but a condition which will always be present.  My real problem is in each hashtable, the same key/value pair will occur, but then in the second hash table the value will occur on 1 to n number of keys, and I need to understand what keys these values occur on.

The practical problem here is that the key is a SamAccountName and the value is its mail attribute.  In the 2nd hash table, the same mail attribute will occur on 1 to n number of keys.  I need to find all keys where this duplicate mail value occurs and save to a file all the key/value pairs having the same value.

The ContainsValue method is fast, and in the posted code I use the value from the first table to search the 2nd, but when it is found I don't know how to display the keys associated with the value.

These hash tables will contain anywhere from 50K to 150K key/value pairs.

August 29th, 2015 1:11am

Now that is the third totally different explanation and it is not a question.

The code I posted will find what you asked for which was "duplicated keys".  Now you are asking for "duplicated values". 

August 29th, 2015 1:15am

Find all matching hash members.

$EmpNumbers=@{
	'John Doe'=112233
	'Dave Davis'=223344
	'Justine Smith'=334455
	'Rose Jones'=223344
	'Jim Bob'=223344
	'Jim'=223344
	'Dave Davis1'=223344
}
PS C:\scripts> $empnumbers.GetEnumerator()|?{$_.Value -eq 223344}

Name                           Value
----                           -----
Jim                            223344
Rose Jones                     223344
Jim Bob                        223344
Dave Davis1                    223344
Dave Davis                     223344
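
If the same kind of lookup has to be repeated for many different values, one possible alternative is to build a reverse index once, so that each value maps to the list of keys carrying it. This is only a sketch under that assumption; $keysByValue is a made-up name and the sample data is the table shown above:

```powershell
# Sample data taken from the thread.
$EmpNumbers = @{
    'John Doe' = 112233; 'Dave Davis' = 223344; 'Justine Smith' = 334455
    'Rose Jones' = 223344; 'Jim Bob' = 223344; 'Jim' = 223344; 'Dave Davis1' = 223344
}

# Hypothetical reverse index: value -> list of keys that have that value.
$keysByValue = @{}
foreach ($entry in $EmpNumbers.GetEnumerator()) {
    if (-not $keysByValue.ContainsKey($entry.Value)) {
        $keysByValue[$entry.Value] = New-Object System.Collections.ArrayList
    }
    [void]$keysByValue[$entry.Value].Add($entry.Key)
}

# After the single pass above, each lookup is a plain hash lookup.
$keysByValue[223344].Count
$keysByValue[223344] | Sort-Object
```

The filter above scans the whole table on every call, which is fine for a one-off query; the index trades some memory for not having to rescan.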


August 29th, 2015 1:20am

Try not to superstitiously write lines of code that actually do nothing.  Most of your code does nothing.  Look closely at the following:

$EmpNames = $null
$EmpNames = @{}
$EmpNames = @{John = "John"; Dave = 223344; Justine = "Justine"; Rose = 223344; "Jim" = 223344}

You assign the same variable three different times but only the last one is needed.  Why do the other two?

You also tend to double quote anything that doesn't move.  Why?

How am I superstitiously writing lines of code that actually do nothing?  As I posted above, the real issue is finding duplicate Values in my second table, which is exactly what the 2nd hash table $EmpNames represents.  These are sample hash tables which represent the real world hash tables of 50k-150k key/value pairs, or SamAccountName/Mail pairs with numerous duplicate mail attributes.

August 29th, 2015 1:41am

The real world doesn't matter. In code you need to understand what each line is doing and why.  You are writing many lines to do something that can be done in one line.  It makes what you are doing nearly impossible for anyone to understand and is clearly the result of not understanding how the code or language works.

My recommendation is intended to help you see that these things are easy once you understand the basics of computer coding and the specific syntaxes and elements of the language you are using.

I posted the simple method for finding all matching values in a hash.  This is fast and can work with large hashes.  For hashes of thousands of values you would be better off using a database or even a spreadsheet, as these systems are designed to do things like find matches across multiple "tuple-like" objects.

With a database we can just join the two hashes and all matching elements would be listed.  We can also, with one small option, extract only duplicated values.  While this is possible with small hashes, it becomes problematic with very large data sets.

You can use the line I posted in a simple loop.  This will output all equal values in both hashes.  It should take no more than two or three lines to do this.
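
A minimal sketch of such a loop, using the sample tables from this thread. The names $found and SourceKey are made up for illustration, and [pscustomobject] assumes PowerShell 3.0 or later:

```powershell
# Sample data taken from the thread.
$EmpNames = @{ John = 'John'; Dave = 223344; Justine = 'Justine'; Rose = 223344; Jim = 223344 }
$EmpNumbers = @{
    'John Doe' = 112233; 'Dave Davis' = 223344; 'Justine Smith' = 334455
    'Rose Jones' = 223344; 'Jim Bob' = 223344; 'Jim' = 223344; 'Dave Davis1' = 223344
}

# For every entry in the first table, list the entries in the second table
# that carry the same value under a different key.
$found = foreach ($item in $EmpNames.GetEnumerator()) {
    $EmpNumbers.GetEnumerator() |
        Where-Object { $item.Value -eq $_.Value -and $_.Key -ne $item.Key } |
        ForEach-Object { [pscustomobject]@{ SourceKey = $item.Key; Key = $_.Key; Value = $_.Value } }
}

$found | Sort-Object SourceKey, Key | Format-Table -AutoSize
```

Note that this scans the second table once for every entry in the first, so the cost grows with the product of the two table sizes.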

What I am trying to show is that these things do not take huge long scripts. PowerShell has been uniquely designed to make this kind of operation very easy, but you must learn the basics of how PowerShell is implemented to help with this. Also note that nearly everything in PowerShell is an extension of the .NET classes that it is built on.

Writing many lines of "superstitious" code makes it hard for any of us to understand what you are doing, and it makes it much harder for you to understand what you have written and to debug it.  Only write code that you understand.  Writing lines of code because you saw them somewhere is "superstitious".  It will get you into trouble.

August 29th, 2015 2:01am

What prompted you to write this script in the first place? Are you just trying to find duplicate mail attributes in Active Directory?
August 29th, 2015 3:58am

Yes, however not only duplicates: we've seen that a particular mail attribute can occur any number of times on other SamAccountNames, so we need to identify all SamAccountNames where a particular mail attribute is set.  Our Directory has 200K+ accounts and we are looking to run this process daily, so we are splitting the accounts into 4 groups for performance reasons, which is why we have 2 Hash Tables (i.e. Group 1 Hash and All Users Hash).
August 29th, 2015 8:51am

Get-ADUser -Filter 'mail -like "*"' -Properties mail | group mail | where Count -gt 1
This will list duplicate mail and the Group property will contain an array of user accounts for a particular mail address.
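
The same grouping idea can be tried without Active Directory on the sample hash table from earlier in this thread, and the result saved to a file. This is only a sketch: $duplicates, $outFile and the file name duplicate-mail.csv are made-up names, and the temp folder is just a placeholder location:

```powershell
# Sample data taken from the thread.
$EmpNumbers = @{
    'John Doe' = 112233; 'Dave Davis' = 223344; 'Justine Smith' = 334455
    'Rose Jones' = 223344; 'Jim Bob' = 223344; 'Jim' = 223344; 'Dave Davis1' = 223344
}

# Group the entries by value and keep only values that occur on more than one key.
$duplicates = $EmpNumbers.GetEnumerator() |
    Group-Object -Property Value |
    Where-Object { $_.Count -gt 1 }

# Hypothetical output path; change it to wherever the report should go.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'duplicate-mail.csv'

# Write every key/value pair that shares its value with at least one other key.
$duplicates |
    ForEach-Object { $_.Group } |
    Select-Object Key, Value |
    Export-Csv -Path $outFile -NoTypeInformation
```

Each group's Name holds the shared value and its Group property holds the entries, so the file ends up with one row per key/value pair.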
August 29th, 2015 10:01am

This topic is archived. No further replies will be accepted.
