URL Provider

Every Extractor needs a URL provider. The Extractor loads web pages sequentially from the provided URLs, then captures data and stores it in data storage.

When you create a new Extractor, the following dialog appears:

Extractor - URL Provider

The dialog offers several URL providers:

URL List

The Extractor processes a list of URLs. You can load URLs from a CSV/Text or Excel file, paste them from the clipboard, or add them manually.

Extractor - URL List provider

URL File

The URLs are stored in a separate CSV/Text or Excel file. The available settings depend on the file type. If you select a CSV/Text file, you can configure the column separator. If you choose an Excel file, you need to select the sheet that contains the URL column.

If the file contains more than one column, click the column that contains the URLs. The selected column is highlighted in green.

Extractor - URL File Provider
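
To make the separator and column settings concrete, here is a minimal Python sketch of the same idea outside the application. It is not how the Extractor reads files internally; the function name read_url_column, its parameters, and the sample data are assumptions made for illustration.

```python
import csv
import io


def read_url_column(text, separator=",", column=0, has_header=False):
    """Return the non-empty values of one column from delimited text.

    Hypothetical helper: 'separator' mirrors the column separator setting,
    'column' mirrors the column you would click in the dialog (0-based).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if has_header:
        rows = rows[1:]  # skip the header row
    urls = []
    for row in rows:
        # Ignore short rows and blank cells instead of failing on them.
        if column < len(row) and row[column].strip():
            urls.append(row[column].strip())
    return urls


# Sample data (made up): a semicolon-separated file with the URLs in the second column.
sample = (
    "name;url\n"
    "First;https://demo.websundew.io/ecommerce/details?id=1\n"
    "Second;https://demo.websundew.io/ecommerce/details?id=2\n"
)
urls = read_url_column(sample, separator=";", column=1, has_header=True)
```

With the sample above, urls holds the two demo URLs in file order. Choosing the wrong separator would leave each line as a single column, which is why the separator is set before the column is picked.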

URL Generator

The target URLs may share the same structure and differ only in a continuously increasing parameter, for example:

https://demo.websundew.io/ecommerce/details?id=1
https://demo.websundew.io/ecommerce/details?id=2
...
https://demo.websundew.io/ecommerce/details?id=10000

The URL Generator produces this sequence from a template and a specified range (from, to, and step).

Extractor - URL Generator
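
The template-and-range idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the {n} placeholder syntax and the generate_urls function are hypothetical and may not match the template format the URL Generator dialog actually uses.

```python
def generate_urls(template, start, stop, step=1):
    """Yield one URL per value in the inclusive range start..stop.

    Hypothetical helper: 'template' must contain the placeholder {n},
    which is replaced by each value of the range in turn.
    """
    if step <= 0:
        raise ValueError("step must be a positive integer")
    for value in range(start, stop + 1, step):  # +1 makes 'stop' inclusive
        yield template.format(n=value)


# Reproduce the sequence from the example above (ids 1 through 10000).
urls = list(
    generate_urls("https://demo.websundew.io/ecommerce/details?id={n}", 1, 10000)
)
```

A step larger than 1 skips values, so a range from 1 to 9 with step 2 would visit ids 1, 3, 5, 7, and 9.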

Local Files

The Extractor can be configured to extract data from web pages stored locally on your computer. Add one or more folders that contain such files. You can also configure a filter that includes or excludes files based on comma-separated wildcard patterns. Enable Recursive to let the Extractor visit files inside subfolders.

Extractor - Local Files
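
The include/exclude filter and the Recursive option can be pictured with the following Python sketch. It assumes shell-style wildcards (such as *.html) and that an exclude match overrides an include match; the function names and both assumptions are illustrative, not a description of the Extractor's actual matching rules.

```python
import fnmatch
from pathlib import Path


def parse_patterns(patterns):
    """Split a comma-separated pattern string into a clean list."""
    return [p.strip() for p in patterns.split(",") if p.strip()]


def is_selected(name, include="*", exclude=""):
    """Return True if the file name matches an include pattern and no exclude pattern."""
    included = any(fnmatch.fnmatch(name, p) for p in parse_patterns(include))
    excluded = any(fnmatch.fnmatch(name, p) for p in parse_patterns(exclude))
    return included and not excluded


def list_local_pages(folder, include="*", exclude="", recursive=False):
    """List matching files in a folder; descend into subfolders only if recursive."""
    root = Path(folder)
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and is_selected(p.name, include, exclude)
    )
```

For example, list_local_pages("pages", include="*.html, *.htm", exclude="draft*", recursive=True) would return every HTML file under a hypothetical pages folder, including subfolders, except those whose names start with draft.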