A (long-winded) guide on regex in 'Text' module functions

This may be “much ado about nothing”, my apologies in advance.

I ran into a string returned from a node.js console app that proved difficult to massage into shape. The code below is a step-by-step recounting of what I did to get the results I needed.

My difficulties came down to not using ‘Text.EscapeForRegularExpression’ when I needed it.
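For anyone new to regex escaping, here is a minimal sketch of the idea. It is written in Python rather than Robin, purely so it can be run anywhere; that choice is an assumption of this example, not part of the original workflow. Python's `re.escape` plays a similar role to ‘Text.EscapeForRegularExpression’ but does not escape exactly the same set of characters, so treat it as an analogy. The `sample` string and the `compiles` helper are hypothetical, made up for illustration.

```python
import re

# Shortened, hypothetical stand-in for the real string.
sample = r"{'path':'C:\Users'}"


def compiles(pattern):
    """Return True if `pattern` is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Used as a pattern as-is, the text is rejected:
# the "\U" in the Windows path is not a valid regex escape.
raw_ok = compiles(sample)

# Escaping puts a backslash in front of the special characters,
# so the escaped form matches the original text literally.
escaped = re.escape(sample)
literal_match = re.fullmatch(escaped, sample) is not None

print("raw text usable as a pattern:", raw_ok)
print("escaped form:", escaped)
print("escaped form matches original:", literal_match)
```

In this sketch the backslash in the path is what breaks the raw pattern, and the escaped form is safe to use.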

The code here is way too long for a screenshot you can’t copy anyway, so here’s a snippet of the action that helped solve my issues.

[Attachment: snippet]

Experienced ‘regex-ers’ can likely pass on this, as it’ll be too simple. I took everything one step at a time, which is almost assuredly not necessary. I do hope it can be of some help.

Screenshot of the essential part of the output:
[Screenshot: output]

And code:

# There are MORE EFFICIENT AND CONCISE ways to do this.
# This is an introduction ONLY and will be familiar ground
# to lots of readers. But not all, I trust ...
#
# Like regexes themselves, "efficient and concise" doesn't
# always (ever?) mean "easy to understand".
# This guide, therefore, takes it one step at a time.
#
# We'll create a string that WILL cause regex problems, but is a real-world example.
# The info below was acquired from a node.js library, "active-win".
# It required some processing to achieve the string below ('toughString').
# If anyone's interested, I could post that processing.

set toughString to """{'platform':'windows','title':'myApp','id':123,'owner':{'name':'yoursTruly','processId':456,'path':'C:\Users'},'bounds':{'x':10,'y':10,'width':10,'height':10},'memoryUsage':25988}"""
# We'll escape it so we can do Robin regex on it.
Text.EscapeForRegularExpression \
    Text: toughString \
    EscapedText=> tamedString
# Escapes \, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space.
# Our string contains two of these: '{' and '\'.
Console.Write Message: tamedString
# As you can see, every '{' and '\' now has a '\' prepended.
# Here, we want to get rid of the keys, and keep the values.
# That will take more than one step (at least for yours truly) :)
Text.Replace \
    Text:  tamedString \
    TextToFind: "(\\{'platform':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> ReplacedText
# Now our string will create a new challenge for us.
set rebelString to ReplacedText
# 'C:\Users' has become 'C:\\Users'.
Console.Write Message: rebelString
# So how do we massage that back into shape?
Text.Replace \
    Text:  rebelString \
    TextToFind: "(\\\\)" \
    ReplaceWith: '\\' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> rebelText
# Let's see what we have.
Console.Write Message: rebelText
# Now let's get rid of the 'title' key. This is easier.
Text.Replace \
    Text:  rebelText \
    TextToFind: "('title':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noTitleText
# No title now ...
Console.Write Message: noTitleText
# Another easy one - no ID key.
Text.Replace \
    Text:  noTitleText \
    TextToFind: "('id':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noIdText
Console.Write Message: noIdText
# Now let's get rid of some extraneous object-related stuff.
# We don't need to know most of these keys.
# Once again the pattern needs a doubled backslash ('\\{')
# to match the escaped '\{' (double-escaping, if that's a thing) :)
Text.Replace \
    Text:  noIdText \
    TextToFind: "('owner':\\{'name':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noOwnerText
# Let's see if the offending info is gone.
Console.Write Message: noOwnerText
# No 'processId' key ...
Text.Replace \
    Text:  noOwnerText \
    TextToFind: "('processId':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noPIDText
Console.Write Message: noPIDText
# No 'path' key ...
Text.Replace \
    Text:  noPIDText \
    TextToFind: "('path':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noPathText
Console.Write Message: noPathText
# Here we have an example of a brace we could have gotten rid of earlier:
# :(
Text.Replace \
    Text:  noPathText \
    TextToFind: "(})" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noRightBraceText
# No right braces now ...
Console.Write Message: noRightBraceText
# Now for the screen coord keys ...
Text.Replace \
    Text:  noRightBraceText \
    TextToFind: "('bounds':\\{)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noBoundsText
Console.Write Message: noBoundsText

Text.Replace \
    Text:  noBoundsText \
    TextToFind: "('x':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noXText

Console.Write Message: noXText

Text.Replace \
    Text:  noXText \
    TextToFind: "('y':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noYText

Console.Write Message: noYText

Text.Replace \
    Text:  noYText \
    TextToFind: "('width':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noWidthText

Console.Write Message: noWidthText

Text.Replace \
    Text:  noWidthText \
    TextToFind: "('height':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noHeightText

Console.Write Message: noHeightText

# And finally, no 'memoryUsage' key.
Text.Replace \
    Text:  noHeightText \
    TextToFind: "('memoryUsage':)" \
    ReplaceWith: '' \
    IsRegEx:True \
    IgnoreCase:False \
    ActivateEscapeSequences:False \
    Result=> noMemText

Console.Write Message: noMemText

# Smoke test:

Text.SplitWithDelimiter \
    Text:  noMemText \
    CustomDelimiter: ',' \
    IsRegEx:False \
    Result=> TextList

# Now we'll print some info that may be useful.
Console.Write Message: TextList
Console.Write Message: "OS is " + TextList[0]
Console.Write Message: "Hwnd is " + TextList[2]
Console.Write Message: "PID is " + TextList[4]
Console.Write Message: "Path is " + TextList[5]

# Now we have the hwnd (id), PID, and path we 
# can use elsewhere in Robin.
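
The comments at the top of the listing mention that more concise ways exist. As a cross-check of what the step-by-step version should end up with, here is a minimal sketch that removes every `'key':` and every brace in a single substitution, then splits on commas. It is in Python, not Robin, and the pattern and variable names are assumptions of this example; I have not tried the same pattern in a Robin ‘Text.Replace’ action. It also works on the original, unescaped string, so the backslash in the path never gets doubled.

```python
import re

# The same sample string as 'toughString' in the listing above.
tough_string = (
    r"{'platform':'windows','title':'myApp','id':123,"
    r"'owner':{'name':'yoursTruly','processId':456,'path':'C:\Users'},"
    r"'bounds':{'x':10,'y':10,'width':10,'height':10},'memoryUsage':25988}"
)

# A key is a quoted word followed by a colon; braces are dropped as well.
# Values such as 'windows' are followed by a comma, so they are left alone.
KEYS_AND_BRACES = re.compile(r"'\w+':|[{}]")

values_only = KEYS_AND_BRACES.sub("", tough_string)
text_list = values_only.split(",")

print(values_only)
print("OS is " + text_list[0])
print("Hwnd is " + text_list[2])
print("PID is " + text_list[4])
print("Path is " + text_list[5])
```

The indexes line up with the smoke test above: item 0 is the platform, item 2 the id, item 4 the process id, and item 5 the path. The string values keep their single quotes, just as they do in the step-by-step version.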

Best regards,
burque505
