Docker Runner is a core KBC component that provides an interface for
running other KBC components. Every component in Keboola Connection is represented by a Docker image.
Developing functionality in Docker allows you to focus only on the application logic; all communication
with the Storage API is handled by Docker Runner. You can encapsulate any application in a Docker image
by following a set of rules that allow you to integrate the application into Keboola Connection.
There is a predefined interface with Docker Runner consisting mainly of a
folder structure and a serialized configuration file.
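To make this interface more concrete, here is a minimal sketch of what a folder structure plus a serialized configuration file might look like. The subfolder names (`in/tables`, `in/files`, `out/tables`, `out/files`), the file name `config.json`, and the `parameters` key are assumptions made for illustration; none of them is specified in the text above, so check the Docker Runner documentation for the actual layout.

```python
import json
import tempfile
from pathlib import Path

# Assumed layout, for illustration only; not taken from the text above.
ASSUMED_SUBFOLDERS = ["in/tables", "in/files", "out/tables", "out/files"]


def prepare_data_folder(root: Path, parameters: dict) -> Path:
    """Create the assumed folder structure and write a JSON configuration file."""
    for sub in ASSUMED_SUBFOLDERS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    config_path = root / "config.json"  # assumed file name
    config_path.write_text(json.dumps({"parameters": parameters}, indent=2))
    return config_path


def read_parameters(root: Path) -> dict:
    """The application's side of the interface: load the configuration."""
    config = json.loads((root / "config.json").read_text())
    return config.get("parameters", {})


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prepare_data_folder(Path(tmp), {"multiplier": 2})
        print(read_parameters(Path(tmp)))
```

The point of the sketch is the division of labor: one side prepares folders and a configuration file, and the application only ever reads from and writes to that local structure.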
Custom Science, Docker Extensions, and
R and Python Transformations are all dockerized applications that are run using Docker Runner.
The Docker Runner functionality can be described in a few steps:
- Download and build the specified Docker image.
- Download all tables and files specified in the input mapping from Storage.
- Create a configuration file.
- Run before processors if there are any.
- Run the Docker image (create a Docker container).
- Run after processors if there are any.
- Upload all tables and files in the output mapping to Storage.
- Delete the container and all temporary files.
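The order of the steps above can be sketched as a small orchestration function. This is not Docker Runner's actual code: `run_component`, its parameters, and the log messages are hypothetical, the image, download, and upload steps are only recorded rather than performed, and skipping the after processors and the upload when the container exits with a non-zero code is an assumption of this sketch.

```python
from typing import Callable, List


def run_component(
    run_container: Callable[[], int],
    before: List[Callable[[], None]],
    after: List[Callable[[], None]],
    log: List[str],
) -> int:
    """Record the steps in the listed order and return the container exit code."""
    log.append("download and build image")
    log.append("download input mapping")
    log.append("create configuration file")
    try:
        for processor in before:
            processor()
        log.append("before processors done")
        exit_code = run_container()
        log.append(f"container exit code {exit_code}")
        # Assumption: results are only post-processed and uploaded on success.
        if exit_code == 0:
            for processor in after:
                processor()
            log.append("after processors done")
            log.append("upload output mapping")
    finally:
        # Cleanup happens whether or not the run succeeded.
        log.append("delete container and temporary files")
    return exit_code


if __name__ == "__main__":
    steps: List[str] = []
    run_component(lambda: 0, [], [], steps)
    print("\n".join(steps))
```

Placing the cleanup in a `finally` block mirrors the last step of the list: the container and temporary files are removed regardless of how the run ended.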
When the application execution is finished, Docker Runner automatically collects the exit code and the content of STDOUT and STDERR.
The following schema illustrates the workflow of running a dockerized component.
The application is responsible for these processes:
- Reading the configuration, the source tables in CSV format, and any files (if specified).
- Writing the results to the predefined folders and files.
- Proper handling of success/error results by setting an appropriate exit code.
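The three responsibilities above could look roughly like this in a Python application. Treat it as a sketch under stated assumptions: the paths (`config.json`, `in/tables/source.csv`, `out/tables/result.csv`), the `column` parameter, the `UserError` class, and the exit-code convention (0 for success, 1 for an error the user can fix, 2 for an unexpected failure) are all hypothetical. The text above only requires that an appropriate exit code be set.

```python
import csv
import json
import sys
from pathlib import Path


class UserError(Exception):
    """Hypothetical: a problem the end-user can fix, such as a bad configuration."""


def run(data_dir: Path) -> None:
    # 1. Read the configuration (assumed file name and structure).
    config = json.loads((data_dir / "config.json").read_text())
    column = config.get("parameters", {}).get("column")
    if not column:
        raise UserError("Parameter 'column' is required.")

    # 2. Read the source table and write the result to the predefined folder.
    source = data_dir / "in" / "tables" / "source.csv"
    target = data_dir / "out" / "tables" / "result.csv"
    target.parent.mkdir(parents=True, exist_ok=True)
    with source.open(newline="") as src, target.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        if column not in fieldnames:
            raise UserError(f"Column '{column}' not found in source table.")
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            row[column] = row[column].upper()  # the "application logic"
            writer.writerow(row)


def main(data_dir: Path) -> int:
    # 3. Translate the outcome into an exit code (assumed convention).
    try:
        run(data_dir)
        return 0
    except UserError as err:
        print(err, file=sys.stderr)
        return 1
    except Exception as err:
        print(f"Unexpected error: {err}", file=sys.stderr)
        return 2
```

A real entry point would pass the data folder to `main` and hand its return value to `sys.exit`, so that the exit code and anything printed to STDERR are visible to whatever runs the container.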
Docker Runner is responsible for the following processes:
- Authentication: Docker Runner makes sure the application is run by authorized users/tokens.
It is not possible to run an extension anonymously. The extension does not have access to the KBC token
itself; it receives only limited information about the project and end-user.
- Starting and stopping the extension: Docker Runner will boot a Docker container which contains the
extension. This ensures the extensions run in a precisely defined environment which is guaranteed to
be the same for each extension run. No application state is preserved (with the exception of the
- Reading and writing data to KBC Storage: Docker Runner ensures a custom extension
cannot access arbitrary data in the project. It will only receive the input mapping defined by the end-user;
and only those outputs defined in the output mapping by the end-user will be written to the project.
- Application isolation: Each extension is run in its own Docker container which is isolated from other
containers; the application cannot be affected by other running applications. It may also be limited
to have no network access.
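The rule that only outputs defined in the output mapping are written to the project can be illustrated with a short filter. This is a sketch, not Docker Runner's implementation: the function `select_uploads` and the mapping keys `source` and `destination` are assumptions chosen for the example.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple


def select_uploads(
    out_dir: Path, output_mapping: List[Dict[str, str]]
) -> List[Tuple[Path, str]]:
    """Return (file, destination) pairs for files named in the output mapping.

    Files the application produced but that are not listed in the mapping
    are never returned, so they would never be uploaded.
    """
    uploads = []
    for entry in output_mapping:
        candidate = out_dir / entry["source"]  # assumed key name
        if candidate.is_file():
            uploads.append((candidate, entry["destination"]))  # assumed key name
    return uploads


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        (out_dir / "result.csv").write_text("id\n1\n")
        (out_dir / "scratch.csv").write_text("id\n2\n")  # not in the mapping
        mapping = [{"source": "result.csv", "destination": "out.c-main.result"}]
        for path, destination in select_uploads(out_dir, mapping):
            print(path.name, "->", destination)
```

Driving the selection from the mapping rather than from the folder contents is what keeps the application from writing arbitrary data: anything it leaves behind that the end-user did not list is simply ignored.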
For Custom Science, Docker Runner also creates the Docker image from the
specified git repository on the fly.
The Docker Runner API is described on Apiary.io. Docker Runner
has API calls to
Extensions executed by Docker Runner store their configurations in
Storage API Components Configurations.
When creating the configuration, use
this JSON schema
to validate the configuration before storing it.