Parse a file to extract Distinguished name (Network Steve Forum)

I have an XML file (let's call it test.xml) where part of it looks like this:

<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">

There are a bunch of lines similar to this, and the rest of the XML is stuff I don't care about.

How can I get the output to be just this:

"CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"

There are actually several Distinguished Names I want out of that file. Basically, grab the stuff between dn= and the > symbol.

I tried select-string ./test.xml -pattern "dn", but it splits the output onto another line after the space, and I only want the DN. I also tried regular expressions, but I can't write regular expressions to save my butt.

This is what I have so far.

$input_path = './test.xml'
$output_file = './out.txt'
$regex = '[\dn=]{1}([A-Za-z.]*)[\>"]{1}'
select-string -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file

Can someone help me with the proper $regex line?

February 14th, 2015 10:41pm

If you want to use regex for this, the [regex]::Matches static method is actually easier than Select-String, IMHO.

@'
<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">
<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">

'@ | Set-Content test.xml

$input_path = './test.xml'
$output_file = './out.txt'

$text = Get-Content $input_path -Raw

[regex]::Matches($text,'dn="(.+?)"') |
foreach {$_.groups[1].value} |
Set-Content $output_file

Get-Content $output_file

CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local
CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local
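
If you would rather stay with Select-String as in the question, something along these lines should also work. This is only a sketch: it assumes every dn value is wrapped in double quotes and never contains a double quote itself, and the $line and $dns variables are just illustrative names.

```powershell
# Sample line copied from the question. For the real file, drop the string
# and pass -Path ./test.xml to Select-String instead.
$line = '<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local">'

# [^"]+ matches everything up to the next double quote, so the capture
# group ends exactly where the dn attribute value ends.
$dns = $line |
    Select-String -Pattern 'dn="([^"]+)"' -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups[1].Value }

$dns
```

Reading Groups[1] returns only the captured value, so there is no need to strip the leading dn=" or the trailing quote afterwards.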

February 14th, 2015 11:08pm

I got this to work:

$regex = '(dn=\")(.+)(\">)'
select-string -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value.SubString(4).TrimEnd('">') } > $output_file
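
One thing to watch with this pattern: .+ is greedy, so it only behaves while each line holds a single element. The sketch below uses a made-up line with two elements on it (not taken from the real file) to show how the greedy form differs from the lazy .+? form.

```powershell
# Hypothetical line with two elements on one row, purely for illustration.
$line = '<a dn="CN=One,DC=XYZ"><b dn="CN=Two,DC=XYZ">'

# Greedy: .+ runs on to the LAST "> on the line, so both elements
# end up inside a single capture.
$greedy = [regex]::Matches($line, 'dn="(.+)">') |
    ForEach-Object { $_.Groups[1].Value }

# Lazy: .+? stops at the FIRST closing quote, giving one match per attribute.
$lazy = [regex]::Matches($line, 'dn="(.+?)"') |
    ForEach-Object { $_.Groups[1].Value }

"Greedy: $greedy"
"Lazy:   $($lazy -join ' | ')"
```

Here the greedy pattern returns one value that still contains the text between the two attributes, while the lazy pattern returns CN=One,DC=XYZ and CN=Two,DC=XYZ separately.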

February 14th, 2015 11:24pm

The easiest and most reliable way to extract attribute values from XML is to parse it as XML instead of using a regex, like this:

$txt=@'
<someroot>
<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>
<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>
</someroot>
'@
$xml=[xml]$txt
$xml.SelectNodes('//export-error')|select -expand dn
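
To run the same idea against the file on disk rather than a here-string, Select-Xml can parse it directly. This is a sketch under two assumptions: that test.xml is a well-formed XML document, and that it declares no default namespace (if it does, the XPath would need the -Namespace parameter). The sample file written in the first step is only there so the snippet runs on its own; note that it overwrites ./test.xml.

```powershell
$input_path  = './test.xml'
$output_file = './out.txt'

# Write a small sample file so the sketch is self-contained.
# Skip this step when working with the real export.
@'
<someroot>
<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, John,OU=Billing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>
<export-error cs-guid="{11111C00-11A2-E411-80BB-002226874D89}" dn="CN=Doe\, Jane,OU=Marketing,OU=Chicago,OU=IL,OU=US,DC=XYZ,DC=local"/>
</someroot>
'@ | Set-Content $input_path

# Select-Xml parses the file and returns one result per matching node;
# [@dn] keeps only export-error elements that actually carry a dn attribute.
Select-Xml -Path $input_path -XPath '//export-error[@dn]' |
    ForEach-Object { $_.Node.dn } |
    Set-Content $output_file

Get-Content $output_file
```

Because the XML parser handles the quoting, this keeps working even if the attributes are reordered or an element spans several lines.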

February 15th, 2015 12:31am
