Hi all,
I used the BeautifulSoup module in my Python code, but I doubt it will carry over to code written for Tidbyt. Is there something similar that is supported for scraping a web page and displaying the result on Tidbyt? Thanks in advance.
You can use the regular expressions (re) module to parse the data from your scraped web page. Check out my noaabuoy app here for some examples:
#data_string = xpath.loads(xml).query("/rss/channel/item/description")
# continue with parsing: build up the dict of regex patterns, keyed by field name
re_dict = dict()
# coordinates, not used for anything yet
re_dict["location"] = r"Location:</strong>\s+(.*)<b"
# swell data
re_dict["WVHT"] = r"Significant Wave Height:</strong> (\d+(?:\.\d+)?) ft<br"
re_dict["DPD"] = r"Dominant Wave Period:</strong> (\d+) sec"
re_dict["MWD"] = r"Mean Wave Direction:</strong> ([ENSW]+ \(\d+)°"
# wind data
re_dict["WSPD"] = r"Wind Speed:</strong>\s+(\d+(?:\.\d+)?)\sknots"
re_dict["GST"] = r"Wind Gust:</strong>\s+(\d+(?:\.\d+)?)\sknots"
re_dict["WDIR"] = r"Wind Direction:</strong> ([ENSW]+ \(\d+)°"
# temperatures
re_dict["ATMP"] = r"Air Temperature:</strong> (\d+\.\d+?)°F"
re_dict["WTMP"] = r"Water Temperature:</strong> (\d+\.\d+?)°F"
# misc other data
re_dict["DEW"] = r"Dew Point:</strong> (\d+\.\d+?)°F"
re_dict["VIS"] = r"Visibility:</strong> (\d\.?\d? nmi)"
re_dict["TIDE"] = r"Tide:</strong> (-?\d+\.\d+?) ft"
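For anyone following along, here is a rough sketch in plain Python of how a pattern dict like this can be applied to the scraped text. The sample string and the extract_fields helper are made up for illustration and are not taken from the noaabuoy app. The re module in Starlark may use different function names and return shapes than Python's, so treat this as the general idea rather than code to paste into a Tidbyt app.

```python
import re

# A few of the patterns from the post above, copied as-is.
patterns = {
    "DPD": r"Dominant Wave Period:</strong> (\d+) sec",
    "TIDE": r"Tide:</strong> (-?\d+\.\d+?) ft",
    "VIS": r"Visibility:</strong> (\d\.?\d? nmi)",
}

# Hypothetical feed description, invented for this example only.
sample_description = (
    "<strong>Dominant Wave Period:</strong> 13 sec<br />\n"
    "<strong>Tide:</strong> -0.45 ft<br />\n"
)


def extract_fields(text, patterns):
    """Return {field: first captured group} for every pattern that matches."""
    results = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:  # fields missing from the text are simply skipped
            results[key] = match.group(1)
    return results


readings = extract_fields(sample_description, patterns)
print(readings)
```

Fields that the station does not report (VIS in this sample) are left out of the result, so the display code can check for a key before drawing it.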
Thank you! I will test this out on my side.
Someone has made a Beautiful Soup module for Starlark. It might also be possible to add that to pixlet so it can be used in apps. I’d certainly find it easier to work with than the html module. It wouldn’t be a quick fix, though.
I have sent in a PR to make the Beautiful Soup-like API available.
Check out Pixbyt (see Announcing Pixbyt: a self-hosted app server for advanced apps), which comes with an html.xpath function that lets you use XPath to extract what you need from a web page!
The stock pixlet tool already has an xpath module, plus an html module that uses a jQuery-like syntax.
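To show what the XPath route looks like, here is a small Python analogue of the /rss/channel/item/description query from the commented-out line earlier in the thread. It uses Python's standard xml.etree.ElementTree, not pixlet's xpath module, and the RSS snippet is invented for the example, so read it as a sketch of the approach only.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS document, invented for this example only.
sample_rss = """<rss version="2.0">
  <channel>
    <title>Example buoy feed</title>
    <item>
      <title>Station 00000</title>
      <description>&lt;strong&gt;Dominant Wave Period:&lt;/strong&gt; 13 sec&lt;br /&gt;</description>
    </item>
  </channel>
</rss>"""

# fromstring returns the root <rss> element, so the path starts below it.
root = ET.fromstring(sample_rss)

# Equivalent of the /rss/channel/item/description query: first matching node's text.
description = root.findtext("./channel/item/description")
print(description)
```

The description comes back as an HTML fragment with its entities decoded, which is the string the regex patterns above would then be run against.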