URL Provider
Every Extractor needs a URL provider. The Extractor loads web pages one by one from the URLs that the provider supplies, then captures the data and saves it to a data storage.
When you create a new Extractor, you see the following dialog:
It offers several URL providers, including:
URL List
The Extractor processes a list of URLs. You can load the URLs from a CSV/Text or Excel file, paste them from the clipboard, or add them manually.
URL File
The URLs are stored in a separate CSV/Text or Excel file. The available settings depend on the file type. If you select a CSV/Text file, you can configure the column separator. If you choose an Excel file, you need to select the sheet that contains the URL column.
If the file contains more than one column, click the column that contains the URLs. The selected column is highlighted in green.
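To make the column and separator settings concrete, here is a minimal Python sketch of what reading one URL column from a delimited text file amounts to. It is only an illustration, not the Extractor's own code: the function name read_url_column, its parameters, and the sample data are assumptions made for this example.

```python
import csv
import io

def read_url_column(text, column_index=0, separator=",", has_header=False):
    """Return the non-empty values of one column from delimited text.

    Hypothetical helper: column_index plays the role of the column you
    click in the dialog, separator the role of the column separator.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if has_header:
        rows = rows[1:]  # skip the header row
    urls = []
    for row in rows:
        if column_index < len(row):
            value = row[column_index].strip()
            if value:
                urls.append(value)
    return urls

# Sample file content: two columns separated by semicolons.
sample = (
    "name;url\n"
    "First;https://demo.websundew.io/ecommerce/details?id=1\n"
    "Second;https://demo.websundew.io/ecommerce/details?id=2\n"
)
urls = read_url_column(sample, column_index=1, separator=";", has_header=True)
```

Here the second column (index 1) is the one that holds the URLs, so the result contains the two demo URLs and nothing from the name column.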
URL Generator
The target URLs may share the same structure and differ only in a steadily increasing parameter, for example:
https://demo.websundew.io/ecommerce/details?id=1
https://demo.websundew.io/ecommerce/details?id=2
...
https://demo.websundew.io/ecommerce/details?id=10000
The URL Generator can generate this sequence from a template and a specified range (from, to, and increment step).
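The idea of a template plus a range can be sketched in a few lines of Python. This is an illustration of the concept only: the {} placeholder, the function name generate_urls, and the choice to include the upper bound are assumptions for this example and may not match the template syntax used in the dialog.

```python
def generate_urls(template, start, stop, step=1):
    """Yield one URL per value in the range start..stop (inclusive).

    Assumption: the template marks the changing parameter with "{}".
    """
    if step <= 0:
        raise ValueError("step must be positive")
    for value in range(start, stop + 1, step):
        yield template.format(value)

template = "https://demo.websundew.io/ecommerce/details?id={}"

# From 1 to 10000 with step 1, as in the sequence listed above.
urls = list(generate_urls(template, 1, 10000))

# A larger step skips values: ids 1, 4, 7, 10.
stepped = list(generate_urls(template, 1, 10, step=3))
```

With these settings, urls starts at id=1 and ends at id=10000, while stepped contains only every third id.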
Local Files
The Extractor can be configured to extract data from web pages stored locally on your computer. Add one or more folders that contain such files. You can also configure a filter to include or exclude files based on comma-separated wildcard patterns. Enable Recursive to let the Extractor visit all files inside subfolders.
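As a rough illustration of how comma-separated wildcard patterns and the Recursive option interact, consider the Python sketch below. It is not the Extractor's implementation: the function names, the default patterns, and the rule that an exclude match overrides an include match are all assumptions made for this example.

```python
import fnmatch
import os

def parse_patterns(patterns):
    """Split a comma-separated pattern string into a clean list."""
    return [p.strip() for p in patterns.split(",") if p.strip()]

def matches_filter(name, include="*", exclude=""):
    """True if name matches an include pattern and no exclude pattern."""
    included = any(fnmatch.fnmatch(name, p) for p in parse_patterns(include))
    excluded = any(fnmatch.fnmatch(name, p) for p in parse_patterns(exclude))
    return included and not excluded

def find_files(folders, include="*.html,*.htm", exclude="", recursive=False):
    """Collect matching files from each folder, optionally with subfolders."""
    found = []
    for folder in folders:
        for root, dirs, files in os.walk(folder):
            for name in sorted(files):
                if matches_filter(name, include, exclude):
                    found.append(os.path.join(root, name))
            if not recursive:
                dirs.clear()  # prevents os.walk from descending further
    return found
```

For example, find_files(["pages"], include="*.html", exclude="draft*", recursive=True) would return every .html file under the hypothetical pages folder and its subfolders, except files whose names start with draft.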