Beautiful Soup Python Supported?

SRQAZ13 · July 25, 2023, 2:01pm

Hi all,
I use the BeautifulSoup module in my Python code, but I doubt it will carry over to code written for Tidbyt. Is there something similar that is supported for scraping a web page and displaying the result on a Tidbyt? Thanks in advance.

Tavis_Gustafson · July 26, 2023, 2:58am

You’ll need to use the regular expressions (re) module to parse the data from your scraped web page. Check out my noaabuoy app for some examples:

        # data_string = xpath.loads(xml).query("/rss/channel/item/description")
        # continue parsing: build up the dict of regex patterns
        re_dict = dict()

        # coordinates, not used for anything yet
        re_dict["location"] = r"Location:</strong>\s+(.*)<b"

        # swell data
        re_dict["WVHT"] = r"Significant Wave Height:</strong> (\d+\.?\d+?) ft<br"
        re_dict["DPD"] = r"Dominant Wave Period:</strong> (\d+) sec"
        re_dict["MWD"] = r"Mean Wave Direction:</strong> ([ENSW]+ \(\d+)&#176;"

        # wind data
        re_dict["WSPD"] = r"Wind Speed:</strong>\s+(\d+\.?\d+?)\sknots"
        re_dict["GST"] = r"Wind Gust:</strong>\s+(\d+\.?\d+?)\sknots"
        re_dict["WDIR"] = r"Wind Direction:</strong> ([ENSW]+ \(\d+)&#176;"

        # temperatures
        re_dict["ATMP"] = r"Air Temperature:</strong> (\d+\.\d+?)&#176;F"
        re_dict["WTMP"] = r"Water Temperature:</strong> (\d+\.\d+?)&#176;F"

        # misc other data
        re_dict["DEW"] = r"Dew Point:</strong> (\d+\.\d+?)&#176;F"
        re_dict["VIS"] = r"Visibility:</strong> (\d\.?\d? nmi)"
        re_dict["TIDE"] = r"Tide:</strong> (-?\d+\.\d+?) ft"
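
To make the idea concrete, here is a minimal sketch in plain Python (not Starlark) of how patterns like these could be applied to a scraped description string. The sample text and the extract_fields helper are made up for illustration and are not taken from the noaabuoy app. Pixlet's re module has its own API, so the calls would need adapting there.

```python
import re

# Hypothetical sample of a buoy feed description (invented for illustration).
SAMPLE = (
    "<strong>Significant Wave Height:</strong> 3.6 ft<br />"
    "<strong>Dominant Wave Period:</strong> 12 sec<br />"
    "<strong>Wind Speed:</strong> 9.7 knots<br />"
    "<strong>Water Temperature:</strong> 68.4&#176;F<br />"
)

# A subset of the patterns from the post above, copied unchanged.
re_dict = {
    "WVHT": r"Significant Wave Height:</strong> (\d+\.?\d+?) ft<br",
    "DPD": r"Dominant Wave Period:</strong> (\d+) sec",
    "WSPD": r"Wind Speed:</strong>\s+(\d+\.?\d+?)\sknots",
    "WTMP": r"Water Temperature:</strong> (\d+\.\d+?)&#176;F",
    "TIDE": r"Tide:</strong> (-?\d+\.\d+?) ft",
}


def extract_fields(text, patterns):
    """Return {field: first captured group} for every pattern that matches."""
    found = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            found[field] = match.group(1)
    return found


data = extract_fields(SAMPLE, re_dict)
print(data)
```

Fields whose pattern does not match (TIDE in this sample) are simply left out of the result, so the display code can check which keys are present before drawing them.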

SRQAZ13 · July 26, 2023, 11:24am

Thank you! I will test this out on my side.

dinosaursrarr · August 2, 2023, 7:51pm

Someone has made a Beautiful Soup module for Starlark. It might also be possible to add that to Pixlet so it can be used. I’d certainly find it nicer to work with than the html module. It wouldn’t be a quick fix, though.

dinosaursrarr · August 3, 2023, 12:59pm

I’ve sent in a PR to make the Beautiful Soup-like API available.

DouweM · August 5, 2023, 6:19pm

Check out Pixbyt (see Announcing Pixbyt: a self-hosted app server for advanced apps), which comes with a html.xpath function that lets you use XPath to extract what you need from a web page!

dinosaursrarr · August 9, 2023, 6:17pm

The stock pixlet tool already has an xpath module, plus an html module that uses a jquery-like syntax.
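
For anyone coming from Python, the XPath approach can be sketched roughly like this using the standard library's ElementTree, which supports a limited XPath subset. The RSS snippet is invented for illustration, and this is a Python analogue rather than Pixlet code. The path mirrors the /rss/channel/item/description query mentioned earlier in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS feed with one item (invented for illustration).
RSS = """<rss><channel><item>
<title>Station 00000</title>
<description><![CDATA[<strong>Wind Speed:</strong> 9.7 knots<br />]]></description>
</item></channel></rss>"""

# The root element is <rss>, so the path is relative to it.
root = ET.fromstring(RSS)
description = root.findtext("channel/item/description")
print(description)
```

The extracted description is still an HTML fragment, so it would then be parsed further, for example with regular expressions as in the earlier reply.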