Web to Excel, quick and dirty - JS and file system

At times it might be easier and faster to use Javascript than to create an appmask for a website. You can also use the fact that Excel opens ‘.tsv’ files cleanly to your advantage, avoiding the need to write Excel cells all at once or separately.

The script below -

  • Opens the Dropcatch.com/auctions site
  • Uses JS to grab the table
  • Displays the results in the console
  • Writes the results to ‘dc.tsv’
  • Closes the browser
  • Launches Excel and opens ‘dc.tsv’

Script shot:

Code:

set myURL to "https://www.dropcatch.com/auctions"

WebAutomation.LaunchChrome      Url: myURL \
                                WindowState:WebAutomation.BrowserWindowState.Maximized \
                                ClearCache:False ClearCookies:False \
                                BrowserInstance=> Browser

wait 5

WebAutomation.ExecuteJavascript BrowserInstance:  Browser  \
                                Javascript:"""
                                function ExecuteScript() { 
                                    var items = document.querySelectorAll('div.table-list-item');
                                    var docStr = "";
                                    var tmp = "";
                                    var res = "";
                                    items.forEach(function(element){
                                       tmp = element.innerText;
                                       res = tmp.replace(/(\r\n|\n|\r)/gm, "\t");
                                        docStr += res + "\n";
                                        });
                                       
                                       return docStr;
                                }
                                """ \ 
                                Result=> Result
                                
Console.Write Message: Result

WebAutomation.CloseWebBrowser BrowserInstance: Browser

file.WriteText File: "C:\work\RobinTests\dc.tsv" \
               TextToWrite: Result \
               AppendNewLine:True \
               IfFileExists:File.IfFileExists.Overwrite \
               Encoding:File.FileEncoding.Unicode

wait 1

Excel.LaunchAndOpen Path: "C:\work\RobinTests\dc.tsv" \
                    Visible:True \
                    Password:'' \
                    ReadOnly:False \
                    LoadAddInsAndMacros:False \
                    Instance=> ExcelInstance

The Excel screenshot below ‘cheats’ a little - I manually auto-formatted column width. I also didn’t remove the ‘Bid’ Column, either manually or in code.

Regards,
burque505

1 Like

A person might want to know how to click the “Next Page” button on DropCatch auctions, as there are thousands of pages. :grinning:

This code works. It could be combined with the code from the previous post to grab multiple pages, if one were so inclined.

set myURL to "https://www.dropcatch.com/auctions"

WebAutomation.LaunchChrome      Url: myURL \
                                WindowState:WebAutomation.BrowserWindowState.Maximized \
                                ClearCache:False ClearCookies:False \
                                BrowserInstance=> Browser

wait 5

WebAutomation.ExecuteJavascript BrowserInstance:  Browser  \
                                Javascript:"""
                                function ExecuteScript() { 
                                btn = document.querySelectorAll("button[class*='mat-focus-indicator mat-paginator-navigation-next']");
                                btn[0].click();                                
                                return "clicked";
                                }
                                """ \ 
                                Result=> Result
                                
Console.Write Message: Result

Code screenshot:

On this particular page, where the button is the first of its type (probably the only of its type), this JS works also:

WebAutomation.ExecuteJavascript BrowserInstance:  Browser  \
                                Javascript:"""
                                function ExecuteScript() { 
                                btn = document.querySelector("button[class*='mat-focus-indicator mat-paginator-navigation-next']");
                                btn.click();                                
                                return "clicked";
                                }
                                """ \ 
                                Result=> Result

Remember, “class” is not the only attribute you can use. Another attribute of this element (i.e. “button”) is “aria-label”. So this works as well (and involves less typing). :grinning:

WebAutomation.ExecuteJavascript BrowserInstance:  Browser  \
                                Javascript:"""
                                function ExecuteScript() { 
                                btn = document.querySelectorAll("button[aria-label*='Next']");
                                btn[0].click();                                
                                return "clicked";
                                }
                                """ \ 
                                Result=> Result

Regards,
burque505