Parse Text With Regular Expression

Hello, Dear Colleagues.

Could you help me please parse text with regular expression:

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/plain; charset="KOI8-R"
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=--

I want to get separately:

1). text between  Content-Transfer-Encoding: base64 and --=_alternative, if there is above line Content-Type: text/html 

2). text between  Content-Transfer-Encoding: base64 and --=_related, if there is two lines above line Content-Type: image/jpeg

In this example it will be

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

If to use "greedy" RegEx

$regex = "(?ms).+?Content-Type: image/jpeg(.+?)--=_related"

I get all text between first Content-Type: image/jpeg and last --=_related

Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

How to make RegEx "lazy" and get all pieces separately?

Thanks.





  • Edited by fapw 16 hours 0 minutes ago
July 20th, 2015 10:05am

If you just want the images then load the message into Outlook and save the images in the image collection.

Free Windows Admin Tool Kit Click here and download it now
July 20th, 2015 10:55am

Yes, you are right jrv, but I need to implement this with powershell.
July 20th, 2015 11:06am

Is this what you mean?

This works for me:

$text = @"
--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/plain; charset="KOI8-R"
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=--
"@

$regex = "(?ms).+?Content-Type: image/jpeg(.+?)--=_related"
$text -replace $regex,'$1'

This results in:

Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA


Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
 XXXXXXXXXXXXXX_=--

Free Windows Admin Tool Kit Click here and download it now
July 20th, 2015 11:58am

Thanks for your reply, D'Thompson.

Maybe I didn't clearly described what I want. Sorry for that. I want something like this. Take a look please.

$text = @"
--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

111111111111111111111111111111111111111111111111111111

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

222222222222222222222222222222222222222222222222222222
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

333333333333333333333333333333333333333333333333333333
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
444444444444444444444444444444444444444444444444444444

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

555555555555555555555555555555555555555555555555555555
--=_related XXXXXXXXXXXXXX_=--
"@

$regex1 = "(?ms).+?Content-Transfer-Encoding: base64(.+?)--=_alternative"
$text1 = ([regex]::Matches($text,$regex1) | foreach {$_.groups[1].value})
Write-Host "text1 : " -fore red
Write-Host  $text1

#I want to get as output elements (of array, maybe, or one after another)
#1). text between  Content-Transfer-Encoding: base64 and --=_alternative, if there is above line Content-Type: text/html
#this
#1111111111111111111111111111111111111111111111111111111
#then this
#2222222222222222222222222222222222222222222222222222222

$regex2 = "(?ms).+?Content-Transfer-Encoding: base64(.+?)--=_related"
$text2 = ([regex]::Matches($text,$regex2) | foreach {$_.groups[1].value})
#I want to get as output elements (of array, maybe, or one after another)
#2). text between  Content-Transfer-Encoding: base64 and --=_related, if there is two lines above line Content-Type: image/jpeg
#this
#3333333333333333333333333333333333333333333333333333333
#then this
#4444444444444444444444444444444444444444444444444444444
#then this
#5555555555555555555555555555555555555555555555555555555
Write-Host "text2 : " -fore red
Write-Host  $text2
  • Edited by fapw 7 minutes ago
July 21st, 2015 2:15am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics