Intro guide: Extract data with JS when no HTML table is available

After watching a WinAutomation video on extracting data from web pages in a variety of ways, I found myself wishing for a way to do the same in Robin. Even though we don’t (yet) have ‘WebAutomation.DataExtraction.ExtractData’, I bet we will. :smiley:

For now, taking a tip from WA here, we can snag lots of things from web pages even without the module I’m wishing for.

We’ll do the following:

  1. Set a URL to search for cheap guitar looper pedals.
  2. Launch Chrome.
  3. Execute a JS function to extract the ‘innerText’ of item titles and push them to an array.
  4. Split the returned text on a delimiter (",") into a list.
  5. Show the list in a message box.
  6. Write one of the titles (item 6, i.e. TextList[5]) to the console.

Code:

# Please see this page: 
# https://support.softomotive.com/support/solutions/articles/35000143406-return-an-array-list-variable-with-javascript
# As we don't (yet) have WebAutomation.DataExtraction.ExtractData (I think we will - the capability is in the DLL),
# we have an alternative thoughtfully provided by WinAutomation at the page above.

# This is an eBay search for guitar looper pedals (cheap ones!).
set myUrl to 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=looper%20pedal&_sacat=0&LH_TitleDesc=0&_udlo=1&_udhi=40&rt=nc'

WebAutomation.LaunchChrome                  Url: myUrl \
                                            WindowState:WebAutomation.BrowserWindowState.Maximized \
                                            ClearCache:False \
                                            ClearCookies:False \
                                            BrowserInstance=> Browser
# Delay for safety's sake.
wait 2

# This JS function is taken directly from the WA page above, but altered to retrieve
# different elements.

WebAutomation.ExecuteJavascript             BrowserInstance: Browser \
                                            Javascript:"""
                                            function ExecuteScript()
                                            {
                                                var array = [];
                                                var items = document.querySelectorAll('h3[class^="s-item__title"]');
                                                items.forEach(function(element) {
                                                    array.push(element.innerText);
                                                });

                                                return array;
                                            }
                                            """ \
                                            Result=> Result

wait 2


Text.SplitWithDelimiter                     Text:  Result \
                                            CustomDelimiter: ',' \
                                            IsRegEx:False \
                                            Result=> TextList

Display.ShowMessage                         Title:'' Message:TextList \
                                            Icon:Display.Icon.None \
                                            Buttons:Display.Buttons.OK \
                                            DefaultButton:Display.DefaultButton.Button1 \
                                            IsTopMost:False ButtonPressed=> ButtonPressed

Console.Write                               Message: TextList[5]
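
One caveat with the comma split: listing titles often contain commas themselves, and a JavaScript array converted to text is joined with plain commas, so a single title can end up as two list items. The split step above suggests that is how Result arrives, though I have not confirmed how Robin serializes the array. Here is a minimal sketch of a workaround, runnable in Node or a browser console. The helper name `joinTitles`, the `|~|` delimiter and the sample titles are my own assumptions, not something from the WA page.

```javascript
// Hypothetical helper: join titles with a delimiter that should not occur in them.
var DELIMITER = '|~|'; // assumption: no listing title contains this sequence

function joinTitles(titles) {
    var cleaned = [];
    titles.forEach(function (title) {
        // Trim surrounding whitespace and skip empty entries.
        var text = String(title).trim();
        if (text !== '') {
            cleaned.push(text);
        }
    });
    return cleaned.join(DELIMITER);
}

// Sample data standing in for the innerText values; not real search results.
var sampleTitles = ['Looper Pedal, Mini', 'Loop Station  ', ''];

// Default array-to-text conversion joins with ',' so the first title breaks in two.
var naiveParts = String(sampleTitles).split(',');

// Joining with the custom delimiter keeps each title whole.
var safeParts = joinTitles(sampleTitles).split(DELIMITER);

console.log(naiveParts.length);
console.log(safeParts);
```

If this approach suits you, ExecuteScript could end with `return joinTitles(array);` instead of `return array;`, and CustomDelimiter in Text.SplitWithDelimiter would be set to the same '|~|' string.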

Truncated message (first 6 items): [screenshot]

Console output: [screenshot]

And a picture of our victim :grinning:: [image]

This is a very basic introduction to the topic. I hope it’ll be useful.

Regards,
burque505
