Building an array of custom objects

I am trying to use regular expressions (-imatch) to search for hyperlinks in Word documents and output a list of the filepath/hyperlink pairs found. Please don't laugh at my code for finding the hyperlinks. It is probably far from optimal, but it is working. What I can't get to work is probably the easier part of the project: building the table of results.

What I tried to do is create a custom object called $Output with two NoteProperty members called FlPth (for the filepaths) and Hyperlink (for the hyperlinks). Then I have an array called $colOutput which is supposed to collect all the individual $Outputs as I step through the files containing hyperlinks. In the end I just get an array of 20 identical FlPth/Hyperlink pairs. I know that as it steps through the files I am getting the right values for FlPth and Hyperlink, but something is wrong with the way I am adding them to my array. Can you tell me where my error is?

$Path = "M:\My Documents\GoodSync\SOPSnap\QA" # Temporary path to concentrate on a problem file!
#$Path = "M:\My Documents\GoodSync\SOPSnap" #This is the general path.
$Regex = '\x13\s*HYPERLINK.*\x14' # Regex to catch Hyperlinks in word docs. It works!
$Excludes = '(\\_gsdata_\\|\\Archive\\|\\Drafts\\|\\Folder Settings\\|\\SOPSignOff\\|\\SOPSignOffHistory\\|\\Trash\\|lnk$)' # paths to skip
[System.Collections.ArrayList]$colOutput = @() # Make an arraylist for the results
$idx = 0 #array index counter
$output = New-Object -TypeName PSObject
$Output | Add-Member -type NoteProperty -Name FlPth -Value ""
$Output | Add-Member -type NoteProperty -Name Hyperlink -Value ""
$output | Get-Member | Format-Table

# Get all the files in $Path that end in ".doc". Expand this later to include .docx and .docm files as well and add the code to get into them
Get-ChildItem $Path -Filter "*.doc" -Recurse | #only doc files 
   Where-Object { $_.Attributes -ne "Directory" -and $_.FullName -notmatch $Excludes} |
      ForEach-Object {
         $FlPthTemp = $_.FullName
         $MatchedLine = Get-Content $FlPthTemp | Select-String -Pattern $Regex -Allmatches |
            ForEach-Object {
                $_ -imatch '(?<=\x13\s*HYPERLINK).*(?=\x14)' | out-null; $Hlink = $Matches[0]
                $idx 
                $Output.FlPth = $FlPthTemp
                $Output.Hyperlink = $Hlink
                $colOutput.Add($Output)
            }
        }
         
          
#Export the results to a csv
$colOutput | Format-List


February 23rd, 2015 7:54pm

Hi,

I won't dive into your code, but here's an example of how to create custom objects in the pipeline and then do a CSV export:

Get-ChildItem .\ -File -Recurse | ForEach {

    $props = @{
        FullName = $_.FullName
        FileName = $_.Name
    }

    New-Object PsObject -Property $props

} | Export-Csv .\fileInfo.csv -NoTypeInformation

February 23rd, 2015 8:09pm

You want to output an array of PS objects, not a single object that gets reused on every iteration. Try:

$Path = "M:\My Documents\GoodSync\SOPSnap\QA" # Temporary path to concentrate on a problem file!
#$Path = "M:\My Documents\GoodSync\SOPSnap" #This is the general path.
$Regex = '\x13\s*HYPERLINK.*\x14' # Regex to catch Hyperlinks in word docs. It works!
$Excludes = '(\\_gsdata_\\|\\Archive\\|\\Drafts\\|\\Folder Settings\\|\\SOPSignOff\\|\\SOPSignOffHistory\\|\\Trash\\|lnk$)' # paths to skip

$Output = @() #  Just declare $Output as an array..

# Get all the files in $Path that end in ".doc". Expand this later to include .docx and .docm files as well and add the code to get into them
Get-ChildItem $Path -Filter "*.doc" -Recurse | #only doc files 
   Where-Object { $_.Attributes -ne "Directory" -and $_.FullName -notmatch $Excludes} |
      ForEach-Object {
         $FlPthTemp = $_.FullName
         $MatchedLine = Get-Content $FlPthTemp | Select-String -Pattern $Regex -Allmatches |
            ForEach-Object {
                $_ -imatch '(?<=\x13\s*HYPERLINK).*(?=\x14)' | out-null

                $Output += New-Object -TypeName PSObject -Property @{ FlPth = $FlPthTemp; Hyperlink = $Matches[0] }

            }
        }
         
          
# Display the results (to write a CSV, pipe $Output to Export-Csv instead)
$Output | Format-List

So, instead of creating your object at the top, you simply declare/initialize $Output as an array. 

In the inner ForEach-Object block, you add another element to that initially empty $Output array. Each element is a new PS object with the two properties FlPth and Hyperlink.
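
To illustrate why the original script ended up with identical rows, here is a minimal sketch. The file names and URLs are made-up sample values, not taken from the original script. Adding the same object to a list several times stores several references to one object, so every entry shows whatever values were assigned last. Creating a new object on each pass avoids that.

```powershell
# Case 1: one object reused. Every list entry points at the same instance.
$shared = New-Object -TypeName PSObject -Property @{ FlPth = ''; Hyperlink = '' }
$same = New-Object System.Collections.ArrayList
foreach ($n in 1..3) {
    $shared.FlPth     = "file$n.doc"               # hypothetical sample value
    $shared.Hyperlink = "http://example.com/$n"    # hypothetical sample value
    [void]$same.Add($shared)                       # stores a reference, not a copy
}

# Case 2: a new object per pass. Each entry keeps its own values.
# Assigning the foreach output collects the objects without using +=.
$distinct = foreach ($n in 1..3) {
    New-Object -TypeName PSObject -Property @{
        FlPth     = "file$n.doc"
        Hyperlink = "http://example.com/$n"
    }
}

$same     | ForEach-Object { $_.FlPth }   # file3.doc three times
$distinct | ForEach-Object { $_.FlPth }   # file1.doc, file2.doc, file3.doc
```

The [void] cast in the first case only suppresses the index that ArrayList.Add returns. That return value is probably also why the original script printed extra numbers while it ran.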

February 24th, 2015 5:53am

That worked perfectly. Apparently I have a little more tweaking to do to my regular expression because a small percentage of found hyperlinks include things that are not part of the actual hyperlinks, but your suggestion definitely solved my initial problem!

Thanks!

March 7th, 2015 10:25am

