Finding all text between quotes on a line of text using PowerShell regex

Hi, I'm looking to filter some information from the source code of a website. I need to output any text between quote marks. I have tried hard to get this to work but cannot. The information I need is all on one of the many lines on the page, the one that begins with the text: new Module_Members(71933, {"group_by_titles"

Here is an excerpt, I'm mainly interested in names, post_count, datejoined, location

{"team_id":"15212","post_count":"29","chatstatus":"online","lastseen":"<div class=\"online\">Online Now<\/div>","datejoined":"May 8, 14","user_id":"776664","group_name":null,"name":"David","displayname":"David","location":"GB"} {"team_id":"15112","post_count":"33","chatstatus":"online","lastseen":"<div class=\"online\">Online Now<\/div>","datejoined":"May 8, 13","user_id":"776624","group_name":null,"name":"Peter","displayname":"Peter","location":"US"}

The best I could come up with is this:

$file = gc C:\Users\desktop\sr.htm | select-string -pattern "group_by_titles"

$matches_found = @()

$file | % {
    if ($_ -match '(?<=")[^"]*(?=")') { $matches_found += $matches[1] }
}

$matches

output is:

Name                           Value
----                           -----
0                              group_by_titles

  • Edited by Quarkspace 19 hours 42 minutes ago too much at bottom
February 23rd, 2015 10:53am

It looks like JSON formatted data, so you can use ConvertFrom-JSON to do all the hard work. Here's an example:

Get-Content "test.txt" | ForEach-Object {
    ConvertFrom-Json $_ | Select-Object name,post_count,datejoined,location
}


February 23rd, 2015 11:15am

A JSON file will look like this:

[
   {
      "team_id":"15212",
      "post_count":"29",
      "chatstatus":"online",
      "lastseen":"<div class=\"online\">Online Now<\/div>",
      "datejoined":"May 8, 14",
       "user_id":"776664",
      "group_name":null,
      "name":"David",
      "displayname":"David",
       "location":"GB"
   },
   {
      "team_id":"15112",
      "post_count":"33",
      "chatstatus":"online",
      "lastseen":"<div class=\"online\">Online Now<\/div>",
      "datejoined":"May 8, 13",
      "user_id":"776624",
      "group_name":null,
      "name":"Peter",
      "displayname":"Peter",
      "location":"US"
   }
]

Of course it can be unformatted and it can be more complex.

To convert a true JSON file, just pipe its contents to ConvertFrom-Json.

Get-Content <jsonfile> | ConvertFrom-Json

You will get back objects.
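One caveat: Get-Content emits one string per line by default, and on some PowerShell versions that may trip up ConvertFrom-Json when the JSON spans several lines. A minimal sketch that reads the file as a single string with -Raw, assuming PowerShell 3.0 or later. The temp file name and the trimmed-down sample data are made up for illustration:

```powershell
# Hypothetical file path and sample data, used only for this illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'members-sample.json'

@'
[
   { "name": "David", "post_count": "29", "location": "GB" },
   { "name": "Peter", "post_count": "33", "location": "US" }
]
'@ | Set-Content -Path $path

# -Raw returns the whole file as one string instead of one string per line.
$members = Get-Content -Path $path -Raw | ConvertFrom-Json

# Each array element comes back as an object with name, post_count, location properties.
$members | Select-Object name, post_count, location
```
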

February 23rd, 2015 11:26am

Thanks. It does look like JSON, but PowerShell's ConvertFrom-Json doesn't seem to like it. It fails when I point it at the sample text, at an .htm file containing the page source, and finally at the website URL:

PS U:\> $file = gc C:\Users\sbbh4\desktop\sr.txt | convertfrom-json

convertfrom-json : Invalid JSON primitive: "team_id":"15112","post_count":"33","chatstatus":"online","lastseen":"<div
class=\"online\">Online Now<\/div>","datejoined":"May 8,
13","user_id":"776624","group_name":null,"name":"Peter","displayname":"Peter","location":"US"}.
At line:1 char:46
+ $file = gc C:\Users\sbbh\desktop\sr.txt | convertfrom-json
+                                              ~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [ConvertFrom-Json], ArgumentException
    + FullyQualifiedErrorId : System.ArgumentException,Microsoft.PowerShell.Commands.ConvertFromJsonCommand

$file = gc C:\Users\sbb\desktop\sr.htm | select-string -pattern "group_by_titles"

convertfrom-json : Invalid JSON primitive: new.
At line:2 char:9
+ $file | convertfrom-json
+         ~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [ConvertFrom-Json], ArgumentException
    + FullyQualifiedErrorId : System.ArgumentException,Microsoft.PowerShell.Commands.ConvertFromJsonCommand

PS U:\> $j = Invoke-WebRequest -Uri icantpostlinks | ConvertFrom-Json

ConvertFrom-Json : Invalid JSON primitive: .
At line:1 char:72
+ $j = Invoke-WebRequest -Uri icantpostlinks | ConvertFr ...
+                                                                        ~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [ConvertFrom-Json], ArgumentException
    + FullyQualifiedErrorId : System.ArgumentException,Microsoft.PowerShell.Commands.ConvertFromJsonCommand

February 24th, 2015 4:48am

JSON cannot be reliably parsed with regular expressions because it is not a regular language; see http://en.wikipedia.org/wiki/Regular_language
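As a side note on the original attempt: -match stops at the first hit and stores it in $matches[0], and the pattern has no capture group, so $matches[1] is empty. That would explain why only group_by_titles showed up. The sketch below is not a recommendation, just an illustration using a made-up, shortened sample string. It shows that [regex]::Matches does return every hit, but the hits include separators such as : and , as well as fragments of the escaped quotes:

```powershell
# Hypothetical shortened sample modelled on the excerpt in the question.
$sample  = '{"name":"David","lastseen":"<div class=\"online\">Online Now<\/div>"}'
$pattern = '(?<=")[^"]*(?=")'

# -match finds only the first hit; it lands in $matches[0], and $matches[1] does not exist.
if ($sample -match $pattern) { $first = $matches[0] }
$first

# [regex]::Matches returns all hits, wanted or not (keys, values, ':' and ',' separators).
$all = [regex]::Matches($sample, $pattern) | ForEach-Object { $_.Value }
$all
```
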

If ConvertFrom-Json doesn't work it's likely because of one or a few small syntax errors in the JSON itself. In your OP you have the block

{"team_id":"15212","post_count":"29","chatstatus":"online","lastseen":"<div class=\"online\">Online Now<\/div>","datejoined":"May 8, 14","user_id":"776664","group_name":null,"name":"David","displayname":"David","location":"GB"} {"team_id":"15112","post_count":"33","chatstatus":"online","lastseen":"<div class=\"online\">Online Now<\/div>","datejoined":"May 8, 13","user_id":"776624","group_name":null,"name":"Peter","displayname":"Peter","location":"US"}

which throws an error in PowerShell because it is actually two separate objects placed side by side, not one valid JSON document.

If you wrap them in an array like

[{"team_id":"15212","post_count":"29","chatstatus":"online","lastseen":"<div class=\"online\">Online Now<\/div>","datejoined":"May 8, 14","user_id":"776664","group_name":null,"name":"David","displayname":"David","location":"GB"}, {"team_id":"15112","post_count":"33","chatstatus":"online","lastseen":"<div class=\"online\">Online Now<\/div>","datejoined":"May 8, 13","user_id":"776624","group_name":null,"name":"Peter","displayname":"Peter","location":"US"}]

it works correctly.

It'll be much easier to fix your JSON data than to somewhat haphazardly parse all the special cases of quotes-within-quotes in regular expressions.
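The array wrapping described above can be done in code rather than by hand. This is only a sketch under two assumptions: the objects sit on one line separated by whitespace, as in the posted excerpt, and a closing brace followed by whitespace and an opening brace never occurs inside a string value. The sample line is a shortened, made-up version of the excerpt; the real page may use a different separator.

```powershell
# Hypothetical sample: two flat objects on one line separated by a space.
$line = '{"name":"David","post_count":"29","datejoined":"May 8, 14","location":"GB"} {"name":"Peter","post_count":"33","datejoined":"May 8, 13","location":"US"}'

# Put a comma between adjacent objects, then wrap the lot in [ ] to form one JSON array.
$json = '[' + ($line -replace '\}\s+\{', '},{') + ']'

$members = $json | ConvertFrom-Json

# Pick out the fields the question asked about.
$members | Select-Object name, post_count, datejoined, location
```

If the separator or nesting in the real data differs, the -replace pattern would need adjusting accordingly.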

February 24th, 2015 6:32am

