Generic Extractor

Generic Extractor is a Keboola component that acts like a customizable HTTP REST client. It can be configured to extract data from virtually any sane web API.

Because APIs in the wild vary so widely, Generic Extractor offers many configuration options.

You may opt to use the visual builder, which provides a convenient way to create and test a configuration. With it, you can build an entirely new extractor for Keboola in less than an hour.

To get started quickly, follow our Generic Extractor tutorial.

Generic Extractor Requirements

Generic Extractor allows you to extract data from an API into Keboola purely through configuration. No programming skills or additional tools are required. You just need to do two easy things before you start:

  • Become familiar with the JSON format.
  • Have the documentation of your chosen API at hand. The API should be RESTful and, more or less, follow the HTTP specification.
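
To give a rough idea of what these two prerequisites lead to, here is a minimal sketch in Python that assembles a configuration as a dictionary and prints it as JSON. The base URL, endpoint, and bucket name are placeholders, the build_config helper is hypothetical, and the key layout (parameters, api, config, jobs) is our understanding of the configuration format rather than a complete reference, so check it against the configuration documentation and your API's documentation before relying on it.

```python
import json


def build_config(base_url, endpoint, data_type, bucket):
    """Hypothetical helper: assemble a minimal configuration dictionary.

    The key names below are assumed from the Generic Extractor configuration
    format; all values are placeholders for illustration only.
    """
    return {
        "parameters": {
            # Where the API lives (taken from the API's documentation).
            "api": {"baseUrl": base_url},
            "config": {
                # Where the extracted tables should be stored.
                "outputBucket": bucket,
                # One job per API resource to extract.
                "jobs": [{"endpoint": endpoint, "dataType": data_type}],
            },
        }
    }


config = build_config("https://example.com/api/", "users", "users", "example-bucket")
print(json.dumps(config, indent=2))
```

The printed JSON is the kind of document you would edit when working with a configuration directly instead of through the visual builder.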

Configuration & Development

Again, if you are new to Generic Extractor, we strongly suggest you go through the Generic Extractor tutorial. It outlines the basic principles and the most important features.

With the new user interface, you can set up and test the connection in a few clicks, just as you would in other popular API development tools.

Features such as cURL import, request tests, an output mapping generator, and dynamic function templates and evaluation make the configuration process easier than ever.

If you intend to develop a more complicated configuration, check out how to run Generic Extractor locally. The documentation includes several examples that can also be run locally.

Publishing Generic Extractor Configuration

Each Generic Extractor configuration can be published as a new standalone component. However, for registration, configurations must be converted to templates.

Publishing your Generic Extractor configuration is not required. However, once published, it can easily be reused in multiple projects. A great advantage of templates is that they do not limit the configuration: you can always switch to free-form JSON configuration when necessary.

Also, templates can be used only with published components based on Generic Extractor configurations.

Generic Extractor Source

Like other Keboola components, the Generic Extractor connector is available on GitHub. Apart from the main repository, it relies on several key libraries, which partly define its capabilities:

  • Juicer — component responsible for processing HTTP JSON responses
  • CSV Map — library that converts JSON data into CSV tables
  • Filter — library that allows values to be matched against each other
  • JSON Parser — JSON parser which produces CSV tables while maintaining relations
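
To illustrate what "producing CSV tables while maintaining relations" can mean, here is a small Python sketch. It is not the code of the libraries listed above; it only shows the general idea under simple assumptions: nested arrays are moved to a child table, and each child row carries a parent_id pointing back to its parent row. The function names, the row_id and parent_id columns, and the sample data are all made up for this example, and nested objects (as opposed to arrays) are not handled.

```python
import csv
import io


def flatten(records, table, tables, parent_id=None):
    """Split nested records into flat tables linked by parent_id (illustrative only)."""
    for record in records:
        rows = tables.setdefault(table, [])
        row_id = f"{table}-{len(rows)}"
        row = {"row_id": row_id}
        if parent_id is not None:
            row["parent_id"] = parent_id
        rows.append(row)
        for key, value in record.items():
            if isinstance(value, list):
                # A nested array becomes its own table; scalars are wrapped in a dict.
                children = [v if isinstance(v, dict) else {"value": v} for v in value]
                flatten(children, f"{table}_{key}", tables, row_id)
            else:
                row[key] = value
    return tables


def to_csv(rows):
    """Render a list of row dictionaries as CSV text."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up API response: each user has a nested list of contacts.
response = [
    {"id": 1, "name": "Ann", "contacts": [{"type": "email", "value": "ann@example.com"}]},
    {"id": 2, "name": "Bob", "contacts": []},
]

tables = flatten(response, "users", {})
for name, rows in tables.items():
    print(f"{name}.csv")
    print(to_csv(rows))
```

In this sketch, the users table holds one row per user, and the users_contacts table holds one row per contact, with parent_id linking each contact to its user.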