Note: This is a preview feature and as such may change considerably in the future.
Variables are placeholders used in configurations. Their value is resolved at job runtime.
Important: Make sure you’re familiar with the Configuration API and the Job API before reading on.
See Tutorial for step-by-step example.
When using variables, the configuration is treated as a Moustache template.
You can enter variables anywhere in the JSON of the configuration body. The configuration body is the contents of
the configuration
node when you retrieve a configuration.
This means that you can’t use variables in a name or in a configuration description.
Variables are entered using the Moustache syntax,
i.e., {{ variableName }}
. To work with variables, three things are needed:
keboola.variables
component.To enable replacement of variables, the main configuration has to reference the variable configuration.
If there is no variable configuration referenced, no replacement is made (the main configuration is completely
static). Variables can be used in any place of any configuration except legacy transformations (the component with
the ID transformation
; it can still be used in a specific transformation – e.g., keboola.python-transformation-v2
or keboola.snowflake-transformation
, etc.), and an orchestrator (see below).
A variable configuration is a standard configuration tied to a special dedicated keboola.variables
component.
The variable configuration defines names of variables to be replaced in the main configuration. You can create
the configuration using the
Create Configuration API call.
This is an example of the contents of such a configuration:
Note that type
is always string
.
When you create a variable configuration, you’ll obtain an ID of the configuration - e.g., 807940806
.
In the main configuration, you have to reference the variable configuration ID using the variables_id
node.
Then you can use the variables in the configuration body:
You can either store the variable values as configuration rows of the variable configuration and provide the row ID of the stored values at run time, or you can provide the variable values directly at run time. There are three options how you can provide values to the variables:
variables_values_id
property in the main configuration (default values).variablesValuesId
property in job parameters.variableValuesData
property in job parameters.The structure of variable values, regardless of whether it is stored in configuration or provided at runtime, is as follows:
The default variable delimiter is {{
and }}
. If the delimiter interferes with your code, it can
be changed as per the Moustache docs. For example the
following piece of code
will be interpreted as (assuming the variables alias=batman
and size=big
are defined):
In this example, we will configure a Python transformation using variables. Note that we are using the new transformations and bypassing the current Transformation Service. Such configurations are not supported in the UI yet.
Use the Create Configuration API call
for the keboola.variables
component with the following content:
See an example.
Note that this step is optional – you can use variables without default values.
In the previous step, you obtained an ID of the variable configuration. Use the
Create Configuration Row API call.
Use the ID of the variable configuration and keboola.variables
as a component. Use the following body:
See an example
Now it is time to create the actual configuration which will contain a Python transformation.
Use the following configuration body. The storage
section describes the standard input
and output mapping.
The variables_id
property contains the ID of the variable configuration - e.g., 807968875
. The
variables_values_id
property is optional and contains the ID of the row with default values - e.g., 807952812
.
The parameters
section contains a script with the following Python code:
The script reads a file given by the alias, modifies the two columns COUNTRY and CARS, and prints the contents of the configuration file to output.
See an example.
There are three options for providing variable values when running a job:
variablesValuesId
property in job parametersvariableValuesData
property in job parametersFollowing the rules for running a job, you always have to provide values for the defined variables.
Note that it is important which variables are defined in the variable configuration, not which
variables you actually use in the main configuration. For example, the main configuration references a variable
configuration with firstVar and secondVar variables, but you’re using {{ firstVar }}
and
{{ thirdVar }}
in the configuration code. Then you have to provide values at least for firstVar
and secondVar variables. If you provide values for all firstVar, secondVar, and thirdVar, all of them will
be replaced. If you omit thirdVar, it will be replaced by an empty string. If you omit one of firstVar,
secondVar, an error will be raised.
The second rule is that the three options of passing values are mutually exclusive. If you provide values using
variablesValuesId
or variableValuesData
, it overrides the default values (if provided). You can’t use
variablesValuesId
and variableValuesData
together in a single call. If you do that, an error will be raised.
If no default values are set and none of the variablesValuesId
or variableValuesData
is provided, an error
will be raised.
If you created the default values, you can now directly run the job. Use the Create Job API call with the following body:
The config
property contains the ID of the main configuration.
Before executing the API call, you have to create the source table. Unless you modified the mapping in the
example, you have to create a bucket named
variable-testing in the in stage. Then create a table called batman with columns COUNTRY
and CARS. You can use this sample CSV file.
After you create the input table, you can run the job. See an example. It will create a new table in Storage – out.c-variable-testing.cars. The tables should contain the default values, e.g.:
COUNTRY | CARS |
---|---|
Belgiumbatman | 629378142 |
Finlandbatman | 335823242 |
Italybatman | 4139387742 |
Romaniabatman | 654126042 |
The events of the job will contain the contents of the configuration file where you can verify that the variables were replaced.
Similarly to the default values, you can store another set of values. Let’s add another configuration row to the existing variable configuration:
See an example. You will obtain an ID of the row. Then create a table called watman with columns COUNTRY and CARS. You can use this sample CSV file.
Run a job with parameters and provide the ID of the main configuration in the config
property and
the ID of the value row in variableValuesId
:
See an example. The output table now contains:
COUNTRY | CARS |
---|---|
BelgiumWATMAN | 62937814200 |
FinlandWATMAN | 33582324200 |
ItalyWATMAN | 413938774200 |
The last option to provide the values for variables is to enter them directly when running a job.
Variable values are entered in the variableValuesData
property:
See an example.
The output table will contain:
COUNTRY | CARS |
---|---|
Belgiumbatman | 6293781scatman |
Finlandbatman | 3358232scatman |
Italybatman | 41393877scatman |
Variables in a configuration interact with an orchestrator in two ways:
Entering variable values in task configurations allows the orchestration to run configurations with variables.
Variable values are entered in the actionParameters
property. The parameters are identical to
running a job.
When running an orchestration, you can also provide variable values for an entire orchestration. In that case, the provided values will override those set in individual orchestration tasks. The parameters are identical to running a job.
You have to use the
Create Configuration API call
to create a configuration of the keboola.orchestrator
component.
You can use the following data in the configuration:
The contents of the task
property are identical to the body
of the run job API call. Here, the value 807968875
refers to the ID
of the main configuration, and 807952812
refers to the ID of the configuration row with variable values.
You can use the variableValuesData
field in the same manner.
Creating the above configuration will return a response containing the configuration ID, e.g., 807969959
.
See an example.
When running an orchestration which contains configurations referencing variables, you have to provide their values. You can either rely on the stored values (either at the component configuration or in the orchestration task) or you can provide the values at runtime.
Use the Run Job API call to run an orchestration. In its simplest form, the request body needs to contain just the ID of the orchestration (obtained in the previous step):
As long as the variable values can be found somewhere, this is sufficient. See an example.
Use the Run Job API call
to run an orchestration. Additionally, you can use the variableValuesId
or variableValuesData
property
to override variable values set to individual tasks. The calling convention is the same as shown in the
basic job run. The same rules also apply, notably that you can’t
use variableValuesId
and variableValuesData
together.
A sample request body:
See an example.
There is a number of places where variable values can be provided (either as a reference to an existing row with
values or as an array of values
):
task
setting of the orchestrationvariables_values_id
property)The following diagram shows the parameters mentioned on this page and to what they refer to:
In a nutshell, variableValuesId
always refers to the row of the variable configuration associated with the
main configuration. The main configuration is referenced in the config
parameter. From another point of view,
the config
parameter represents the configuration (either a component or an orchestration) to be run.
Note that in stored configurations snake_case is used instead of camelCase.
The following rules describe the evaluation sequence:
task
.task
override values stored in the component configuration.variableValuesData
and variableValuesId
can’t be used together, so neither of them takes precedence. A reference to stored values can’t be mixed with providing the values inline.Related to variables is the Shared Code feature. Shared code allows to share parts of configuration code. In a
configuration it is also replaced using the Moustache syntax. Shared code
is referenced using shared_code_id
and shared_code_row_ids
configuration nodes. Unlike variables, shared code can’t
be overridden at runtime (so there are no parameters to set when running a job or in orchestration).
Shared code can, however, contain its own variables which need to be merged to those of the main configuration.
Shared code pieces is stored as configuration rows of a dedicated component keboola.shared-code
. Before creating a
piece of a shared code, you first have to create a configuration. Notice that the UI uses certain configurations for
certain components so you might want to check the existing configurations of keboola.shared-code
component before
crating a new configuration.
To create a configuration, use the create configuration API call. The configuration content is ignored, i.e all you need to provide is name:
Let’s assume that the created configuration ID is 618884794
.
Next step is to create the shared code piece itself. To do this create a configuration row of the above configuration
with the configuration row content containing a piece of share code, for example:
It is advisable to set a reasonable rowId
of the row, because it will be used later to reference the shared code:
The above example creates a piece of shared python code named dumpfiles
which contains the
following python code:
To use a piece of shared code, you have to reference it in a configuration using shared_code_id
which is the ID of the shared code configuration and shared_code_row_ids
which is an array of IDS of shared code pieces. With the above example you need to add the following nodes to the configuration:
With that all moustache references to {{ dumpfiles }}
will be replaced by the shared code piece. All other
moustache references will be kept untouched and be treated like variables. E.g: the following configuration:
Will be modified to:
The variables then need to contain someOtherPlaceholder
variable in order to produce a fully functional configuration.
The same way if the shared code piece contains any variables, they have to be set when running the configuration.
Important: The replacement of the shared code piece occurs only within an array of the configuration JSON. In the above code, the shared code reference is "script": ["{{ someOtherPlaceholder }}"]
which is the only valid form of a Shared Code reference. For example
"script": ["some code {{ someOtherPlaceholder }} some other code"]
or "script": "{{ someOtherPlaceholder }}"
are invalid Shared Code references which may not be replaced the way you intend.
Important: The replacement of the shared code piece merges the code_content
array containing the shared code definition with the array containing the shared code reference. With a shared code reference in form "script": ["a", "{{ someOtherPlaceholder }}", "b"]
and shared code definition in form "code_content": ["c", "d"]
the resulting replacement would be "script": ["a", "c", "d", "b"]
.