Configurations are an important part of a Keboola project. Most operations are available in the UI. Use the API if you want to manipulate the configurations programmatically.
Configurations represent component instances in a project. Each Keboola component has different configuration options and requirements, which must be respected. As such, Keboola configurations provide a general framework for configuring components, while the specific implementation details are left to the components themselves.
When working with the Component Configurations API,
you need to know the componentId
of the component being configured.
You can see a list of public components in the Developer Portal, or you can get
a list of all available components with the API index call.
See our example.
It will give you something like this:
From here, you can see all available information about a particular component. In the following examples, we will use keboola.ex-aws-s3, the AWS S3 extractor.
Component configurations are largely dependent on the actual component being configured. This makes creating configurations manually a bit tricky. Rather than starting from scratch, we recommend creating a configuration through the UI and then modifying it once you understand its structure.
To obtain an existing configuration, you can either use the list of configurations above or
the Configuration Detail
API call. See an example for obtaining a
configuration of the keboola.ex-aws-s3
component. You will receive a response similar to this:
The actual component configuration is split into three parts:
- configuration node, containing an arbitrary component configuration
- state node, containing a component state file
- rows node, containing iterations of configuration and state
The important part is the ID of the configuration you want to work with. In the following examples, we will use 364479526.
The configuration node maps to the configuration file. It can contain the storage, parameters, processors and authorization child nodes (the image_parameters and action nodes found in the config file are injected at runtime and are not stored in the configuration). The authorization node is set in the configuration only when credentials injection should be used; otherwise, it is also set at runtime. The processors node defines the processors and their configuration.
The most common sub-nodes stored in the configuration node are therefore parameters (containing an arbitrary component configuration) and storage (containing input and output mapping). Both are transferred to the configuration file without modification; that means that the storage configuration is directly usable in the configuration node. The parameters node is fully dependent on the component and has no universal specification or rules.
In the above example, the configuration
node contains the following:
That means that the component uses neither input mapping nor output mapping. The allowed contents of parameters are described in the AWS S3 extractor code documentation.
The rows node contains iterations of the configuration. The interpretation of configuration rows is again dependent on the component implementation. In the presented case of the keboola.ex-aws-s3 component, each row corresponds to a single extracted table. When the rows node is non-empty, the component behavior is slightly modified. It behaves as if it were executed as many times as there are rows. For each row, the configuration node from the root and the configuration node from the row are merged, with the latter overwriting the former in the case of conflict.
Given the above configuration, the effective configuration passed to the component configuration file will be as follows:
The first two parameters (accessKeyId and #secretAccessKey) are taken from the root configuration; the other parameters are taken from the configuration of the first row. The processors node is never passed to the configuration file.
With the above configuration, the component will be executed only once, because there is one row. If there are no rows, the
component will still be executed once. If there were two rows, the component would be executed twice.
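The merge and the run count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's actual implementation: it assumes the merge is recursive for nested objects and that any other value (including lists) from the row simply replaces the root value. The key names accessKeyId and #secretAccessKey come from the text; the bucket parameter and all values are hypothetical placeholders.

```python
from copy import deepcopy


def merge(root, row):
    """Recursively merge `row` into a copy of `root`; row values win on conflict."""
    result = deepcopy(root)
    for key, value in row.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            # Assumption: non-dict values (including lists) are replaced wholesale.
            result[key] = deepcopy(value)
    return result


def effective_configurations(detail):
    """Return one effective configuration per run, in the order of the rows array."""
    root = detail.get("configuration", {})
    rows = detail.get("rows", [])
    if not rows:
        # No rows: the component is still executed once with the root configuration.
        merged = [deepcopy(root)]
    else:
        merged = [merge(root, row.get("configuration", {})) for row in rows]
    # The processors node is never passed to the configuration file.
    for config in merged:
        config.pop("processors", None)
    return merged


# Hypothetical configuration detail with placeholder values.
detail = {
    "configuration": {
        "parameters": {"accessKeyId": "root-key", "#secretAccessKey": "root-secret"}
    },
    "rows": [
        {
            "configuration": {
                "parameters": {"bucket": "hypothetical-bucket"},
                "processors": {"after": []},
            }
        }
    ],
}
effective = effective_configurations(detail)
```

With one row, the list has a single entry whose parameters combine the root credentials with the row's own parameters; with two rows, it would have two entries, one per execution.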
If the component is executed more than once, the operations are executed in the following order:
All of these are executed in a single job. However, even though multiple rows are executed in a single job, the actual executions are still completely isolated. That is, there is no way to share anything between the rows (apart from the common configuration). It also means that the outputs of the first row are available in the Keboola project before the second row starts, and the inputs for the second row are read only after the first row finishes processing. The order of rows, i.e., what counts as ‘first’ and ‘second’, is defined by the order of items in the rows array. See below for an example of modifying the row order.
Theoretically, configuration rows are supported for every component as long as the effective configuration matches what the component expects. Configuration rows can be used to split the configuration into a common part (typically credentials) and an iterable part which is repeated many times. Keep in mind that configurations heavily modified through the API might not be supported in the UI.
The state node contains the content of the state file. The state is read from the state file after a run and then supplied in the state file on the next run. In the above configuration, the state is:
State is considered an internal property of a component and you should avoid modifying it. The only reasonable modification of state is to delete it; in that case, the configuration will run as if it were run for the first time. To delete the state, set it to {}.
If configuration rows are used, then the state is stored separately for each row and the state node in the configuration root is not used.
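To make the distinction between root state and row state concrete, here is a small Python sketch that empties whichever state is actually in use. It works on a local copy of a configuration detail and does not call the API; the shape of the detail (a rows list whose items carry id and state) and the state key shown are assumptions made for illustration.

```python
from copy import deepcopy


def clear_state(detail):
    """Return a copy of a configuration detail with the relevant state emptied."""
    cleared = deepcopy(detail)
    rows = cleared.get("rows", [])
    if rows:
        # With rows, each row keeps its own state; the root state is not used.
        for row in rows:
            row["state"] = {}
    else:
        cleared["state"] = {}
    return cleared


# Hypothetical detail; the state key name is a placeholder, not a documented field.
detail = {
    "state": {},
    "rows": [{"id": "row-1", "state": {"hypothetical-key": "value"}}],
}
fresh = clear_state(detail)
```

After this, a run would behave as if each row were being executed for the first time, while the original detail stays untouched.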
The following examples describe the most common operations with configurations. Feel free to go through the API reference for a full authoritative list of configuration features.
To obtain configuration details, use the List Configs call, which returns all the configuration details. This means:
- the configuration node: a section on configuration follows;
- the rows node: additional data of the configuration; and
- the state node: the component state.
Please note that the contents of the configuration, rows and state sections depend purely on the component itself. See an example.
A sample result for the AWS S3 extractor looks like this:
Note: Configurations modified through the API might not be editable in the Keboola UI. They can be run or used in an orchestration without any problems.
Modifying a configuration means that a new version of that configuration is created. To modify a configuration, use the Update Configuration API call. See an example in which the configuration is modified as follows to set new credentials:
Notice that the configuration must be sent in the form field configuration as the endpoint does not accept pure JSON (yet). Take great care to pass only the contents of the configuration node as in the above example. The configuration must not be wrapped in an additional configuration node; otherwise, the component will not receive the configuration it expects. Also take care to properly escape the JSON using URL encoding; otherwise, it may be misinterpreted. The request, expressed as a curl command, should look similar to this:
curl --request PUT \
  --url https://connection.keboola.com/v2/storage/components/keboola.ex-aws-s3/configs/364479526 \
  --header 'Content-Type: application/json' \
  --header 'X-StorageAPI-Token: ' \
  --data-binary '{
    "configuration": {
      "parameters": {
        "accessKeyId": "a",
        "#secretAccessKey": "b"
      }
    }
  }'
Also note that the entire configuration must always be sent; there is no way to patch only part of it.
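For readers scripting this, the following Python sketch builds a request that mirrors the curl command above (same URL, headers and JSON body). It uses only the standard library and deliberately stops short of sending anything. The function name build_update_request and the token value are hypothetical; the URL pattern and header names are taken from the curl command.

```python
import json
import urllib.request

BASE_URL = "https://connection.keboola.com/v2/storage"


def build_update_request(component_id, config_id, token, configuration):
    """Build (but do not send) a PUT request that replaces the whole configuration."""
    url = f"{BASE_URL}/components/{component_id}/configs/{config_id}"
    # Pass only the contents of the configuration node as the value of the
    # "configuration" field; do not wrap them in a second "configuration" node.
    body = json.dumps({"configuration": configuration}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "X-StorageAPI-Token": token},
    )


request = build_update_request(
    "keboola.ex-aws-s3",
    "364479526",
    "your-token",  # placeholder, not a real token
    {"parameters": {"accessKeyId": "a", "#secretAccessKey": "b"}},
)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```

Because the whole configuration is replaced, a script would typically fetch the current configuration first, change the values it needs, and send the complete result back.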
Other properties can be modified in the same way as the configuration. For example, you may want to reset state by setting it to {}, or you can change the order of the configuration rows by setting the rowsSortOrder property. The rowsSortOrder is an array of row IDs; see an example (Set Row order of S3 extractor) for the exact request.
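As a rough illustration of those two updates, the Python sketch below builds request bodies for resetting the state and for reordering rows. It assumes these properties are sent as top-level fields of the same request body that carries configuration; check the linked example for the exact request before relying on it. The helper name build_property_update and the row IDs are hypothetical.

```python
import json

# Only the properties named in the text; the real endpoint may accept others.
ALLOWED_PROPERTIES = {"configuration", "state", "rowsSortOrder"}


def build_property_update(**properties):
    """Return a JSON body that carries only the named top-level properties."""
    unknown = set(properties) - ALLOWED_PROPERTIES
    if unknown:
        raise ValueError(f"unsupported properties: {sorted(unknown)}")
    return json.dumps(properties)


# Reset the state so the next run behaves like a first run.
reset_state_body = build_property_update(state={})

# Put the (hypothetical) row "row-2" before "row-1".
reorder_body = build_property_update(rowsSortOrder=["row-2", "row-1"])
```

The order of IDs in rowsSortOrder then determines which row counts as first when the component runs.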
As with modifying a configuration, modifying a configuration row means that a new version of the entire configuration is created. To modify a configuration row, use the Update Row API call.
See an example in which the configuration row is modified to:
The rules for updating a configuration row are the same as for updating a configuration. Also note that a configuration row is never evaluated alone; it is always merged with the root configuration. If the same properties are defined in the root configuration and the row configuration, the values from the row are used. There is also an example of how to reset the row state by setting state to {}.
When you update a configuration, a new configuration version is created. The above calls return only the last (active/published) version of the configuration. To obtain a list of all recorded versions, use the List Versions API call. See this example, which gives you an output similar to the one below:
The version field represents the version_id in the following API example.
After choosing a particular version, you can revert to that version by rolling back, i.e., making a new version identical to the chosen one. See an example of how to roll back the configuration 364479526 of the keboola.ex-aws-s3 component to version 3.
It will create a new version of the configuration and return the ID of the version:
After choosing a particular version, you can create a new independent configuration copy of it. See an example of how to create a new configuration called test-copy from version 3 of the 364479526 configuration for the keboola.ex-aws-s3 component.
It will return the ID of the newly created configuration: