Parse Text With a Regular Expression

Hello, dear colleagues.

Could you please help me parse the following text with a regular expression:

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/plain; charset="KOI8-R"
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=--

I want to extract these pieces separately:

1). The text between Content-Transfer-Encoding: base64 and --=_alternative, if the line directly above is Content-Type: text/html

2). The text between Content-Transfer-Encoding: base64 and --=_related, if the line two lines above is Content-Type: image/jpeg

In this example, each piece would be

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

If I use a "greedy" regex

$regex = "(?ms).+?Content-Type: image/jpeg(.+?)--=_related"

I get all the text between the first Content-Type: image/jpeg and the last --=_related:

Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

How can I make the regex "lazy" and get each piece separately?

Thanks.





  • Edited by fapw 16 hours 10 minutes ago
July 20th, 2015 10:01am
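
For reference, the difference between a greedy and a lazy quantifier can be shown on a cut-down sample. This is only a sketch: the `$sample` text and the variable names `$greedy` and `$lazy` are made up for illustration, and the payloads are shortened to AAAA and BBBB.

```powershell
# Hypothetical, shortened stand-in for the message body.
$sample = @'
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

AAAA
--=_related X_=
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

BBBB
--=_related X_=--
'@

# Greedy: (.+) runs on to the LAST "--=_related", so everything lands in one match.
$greedy = [regex]::Matches($sample, '(?s)Content-Type: image/jpeg(.+)--=_related')

# Lazy: (.+?) stops at the FIRST "--=_related" after each header, giving one match per part.
$lazy = [regex]::Matches($sample, '(?s)Content-Type: image/jpeg(.+?)--=_related')

"greedy matches: $($greedy.Count)"
"lazy matches:   $($lazy.Count)"
```

With `[regex]::Matches`, the lazy form returns each part as its own match, so the pieces can be read from `Groups[1]` one by one.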

Thanks for your reply, D'Thompson.

Maybe I didn't describe clearly what I want. Sorry for that. I want something like this. Please take a look.

$text = @"
--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

111111111111111111111111111111111111111111111111111111

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

222222222222222222222222222222222222222222222222222222
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

333333333333333333333333333333333333333333333333333333
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
444444444444444444444444444444444444444444444444444444

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

555555555555555555555555555555555555555555555555555555
--=_related XXXXXXXXXXXXXX_=--
"@

$regex1 = "(?ms).+?Content-Transfer-Encoding: base64(.+?)--=_alternative"
$text1 = ([regex]::Matches($text,$regex1) | foreach {$_.groups[1].value})
Write-Host "text1 : " -fore red
Write-Host  $text1

# Desired output: separate elements (in an array, maybe, or printed one after another)
# 1). The text between Content-Transfer-Encoding: base64 and --=_alternative, if the line directly above is Content-Type: text/html
#this
#1111111111111111111111111111111111111111111111111111111
#then this
#2222222222222222222222222222222222222222222222222222222

$regex2 = "(?ms).+?Content-Transfer-Encoding: base64(.+?)--=_related"
$text2 = ([regex]::Matches($text,$regex2) | foreach {$_.groups[1].value})
# Desired output: separate elements (in an array, maybe, or printed one after another)
# 2). The text between Content-Transfer-Encoding: base64 and --=_related, if the line two lines above is Content-Type: image/jpeg
#this
#3333333333333333333333333333333333333333333333333333333
#then this
#4444444444444444444444444444444444444444444444444444444
#then this
#5555555555555555555555555555555555555555555555555555555
Write-Host "text2 : " -fore red
Write-Host  $text2
  • Edited by fapw 16 minutes ago
July 21st, 2015 2:11am
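
One possible way to get the pieces described above is to anchor each pattern on the Content-Type line itself rather than starting with `.+?`, so that only parts with the wanted header are captured. The sketch below assumes the headers always appear in exactly the order shown in the post and that each payload contains no boundary line of its own. The names `$nl`, `$htmlRegex`, `$jpegRegex`, `$htmlParts` and `$jpegParts` are made up for illustration, and the payloads are shortened to 1111 ... 5555.

```powershell
# Shortened stand-in for $text from the post (payloads abbreviated).
$text = @'
--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

1111

--=_alternative XXXXXXXXXXXXXX_=
Content-Type: text/html; charset="KOI8-R"
Content-Transfer-Encoding: base64

2222
--=_alternative XXXXXXXXXXXXXX_=--
--=_related XXXXXXXXXXXXXX_=--_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

3333
--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64
4444

--=_related XXXXXXXXXXXXXX_=
Content-Type: image/jpeg
Content-ID: <_2_XXXXXXXXXXXXXX>
Content-Transfer-Encoding: base64

5555
--=_related XXXXXXXXXXXXXX_=--
'@

# Line break that tolerates both CRLF and LF input.
$nl = '\r?\n'

# (?m) lets ^ match at each line start, (?s) lets . cross line breaks.
# \s* on both sides of the lazy group trims the blank lines around the payload.
$htmlRegex = '(?ms)^Content-Type: text/html[^\r\n]*' + $nl +
             'Content-Transfer-Encoding: base64\s*(.+?)\s*^--=_alternative'
$jpegRegex = '(?ms)^Content-Type: image/jpeg[^\r\n]*' + $nl +
             'Content-ID:[^\r\n]*' + $nl +
             'Content-Transfer-Encoding: base64\s*(.+?)\s*^--=_related'

# @(...) keeps the result an array even when there is only one match.
$htmlParts = @([regex]::Matches($text, $htmlRegex) | ForEach-Object { $_.Groups[1].Value })
$jpegParts = @([regex]::Matches($text, $jpegRegex) | ForEach-Object { $_.Groups[1].Value })

Write-Host "HTML parts:" -ForegroundColor Red
$htmlParts
Write-Host "JPEG parts:" -ForegroundColor Red
$jpegParts
```

Each array element is then one payload, so `$jpegParts[0]` would be the first image and so on. For real mail it is probably safer to use a proper MIME parser than regular expressions, since header order and boundaries can vary.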
