Read Webpage
Reads the text present in a URL
The Read Webpage module is a data module that can extract the text from a specific URL.
You can use this module to provide live contextual information from a specific web page to an AI model.
This module has multiple configurations that can enhance its capabilities.
- Depth of sub-links: The number of sub-links that will be read in each webpage
- Scroll the Webpage: When enabled, the model can smartly scroll the webpage to find more information
- Advanced Options:
  - Continue on Error: If enabled, in case the scraping fails, the output value of Text will contain the value <ERROR>, instead of failing the workflow
  - Parse HTML to Markdown: If disabled, the output value of Text will contain the raw HTML, instead of the interpreted markdown version of the content
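The Continue on Error behavior described above can be pictured with a minimal Python sketch. This is not the module's implementation: `read_text`, `fetch`, `failing_fetch`, and `ERROR_SENTINEL` are hypothetical names chosen for illustration, and the fetcher is passed in as a parameter so the example runs without network access.

```python
from typing import Callable

# Value the documentation says the Text output holds when scraping fails.
ERROR_SENTINEL = "<ERROR>"


def read_text(url: str, fetch: Callable[[str], str],
              continue_on_error: bool = False) -> str:
    """Return the page text, or the sentinel when scraping fails
    and continue_on_error is enabled."""
    try:
        return fetch(url)
    except Exception:
        if continue_on_error:
            return ERROR_SENTINEL
        raise  # option disabled: the failure propagates and stops the workflow


def failing_fetch(url: str) -> str:
    # Hypothetical fetcher that always fails, standing in for an unreachable page.
    raise ConnectionError(f"could not reach {url}")


text = read_text("https://example.com", failing_fetch, continue_on_error=True)
if text == ERROR_SENTINEL:
    # A downstream step can branch on the sentinel instead of crashing.
    text = "No live context available."
```

With the option enabled, any step that consumes the text should check for the sentinel before passing it to a model; otherwise the literal string <ERROR> ends up in the prompt.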
The Read Webpage module has one input and two outputs:
- Input: URL, the link to the webpage you want to scrape
- Output:
  - Pages, the text that was extracted from the webpage
  - Links Found, a list with all the links found on each page. If you want to transform each sequence of links in a page into a list, you can use the module Single value to list
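To make the shape of the two outputs concrete, here is a rough Python sketch that pulls the visible text and the links out of a single HTML string using only the standard library. It is an illustration under assumptions, not how the module works internally: `PageReader`, `read_page`, and the sample HTML are hypothetical, and the real module also handles fetching, sub-links, scrolling, and Markdown conversion.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class PageReader(HTMLParser):
    """Collects visible text fragments and href values from one HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.text_parts: List[str] = []
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every anchor tag that has an href attribute.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep only non-empty text fragments.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def read_page(html: str) -> Tuple[str, List[str]]:
    """Return (page text, links found) for one HTML document."""
    reader = PageReader()
    reader.feed(html)
    reader.close()
    return " ".join(reader.text_parts), reader.links


sample = '<h1>Docs</h1><p>See the <a href="https://example.com/new">new site</a></p>'
page_text, links_found = read_page(sample)
```

Here `page_text` plays the role of one entry in Pages and `links_found` the role of that page's entry in Links Found.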