Creating Sandbox

To run and debug KBC dockerized applications (including Custom Science, R and Python Transformations) on your own computer, you need to manually supply the application with a data folder and a configuration file.

To create a sample data folder, use the Docker Runner API. There are three calls available:

  • Create sandbox — Use this call to obtain a sample environment configuration when starting development of a new Docker extension or Custom Science extension (no registration required).
  • Input data — Use this API call to obtain an environment configuration for a registered Docker extension (without encryption) or Transformations.
  • Dry run — This API call will do everything except the output mapping; use it to debug existing applications in production without modifying files and tables in your KBC project.

The API calls resolve and validate the input mapping and create a configuration file. Then they archive the whole /data/ folder and upload it to your KBC project. None of these API calls write any tables or files other than the archive, so they are very safe to run.

The body structure of the first two API calls is the same. Before you start, you need a KBC project. We recommend that you use Apiary or Postman to call the API. A collection of examples of the Sandbox API calls is available in Postman Docs.

Create Sandbox API Call

This call can be used for developing new extensions. Registration is not required.

Prepare

Create a table in KBC Storage which contains a column named number. You can use our sample table.

In the following example, the table is stored in the in.c-main bucket and is called sample. The table ID is therefore in.c-main.sample.

Storage Screenshot

You also need a Storage API token.

Send API Request

In the collection of sample requests, there is an Introduction example with the following JSON in its body:

{
    "configData": {
        "storage": {
            "input": {
                "tables": [
                    {
                        "source": "in.c-main.sample",
                        "destination": "source.csv"
                    }
                ]
            }
        },
        "parameters": {
            "multiplier": 4
        }
    }
}

The node configData.storage.input.tables.source refers to the existing table ID (the table created in the previous step) in Storage. The configData.storage.input.tables.destination node refers to the destination to which the table will be downloaded for the application; it will therefore be the source for the application.

For registered components with a UI, the entire configData.storage node is generated by the UI. The node parameters contains arbitrary parameters which are passed to the application.

The URL of the request is https://syrup.keboola.com/docker/sandbox. The request body is in JSON. Enter your Storage API token into the X-StorageAPI-Token header and run the request.
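If you prefer scripting the call over Postman, the request above can be sketched in Python with the standard library. This is a minimal sketch, not an official client; the `build_sandbox_request` helper is a name made up for this example, while the URL, header, and body shape come from the text above:

```python
import json
import urllib.request

def build_sandbox_request(token, table_id, parameters):
    """Hypothetical helper: assemble the Create Sandbox request without sending it."""
    body = {
        "configData": {
            "storage": {
                "input": {
                    "tables": [
                        {"source": table_id, "destination": "source.csv"}
                    ]
                }
            },
            "parameters": parameters,
        }
    }
    return urllib.request.Request(
        "https://syrup.keboola.com/docker/sandbox",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-StorageAPI-Token": token,  # your Storage API token goes here
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_sandbox_request("YOUR-STORAGE-API-TOKEN", "in.c-main.sample", {"multiplier": 4})
# To actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Separating "build the request" from "send it" lets you inspect the payload before it touches your KBC project.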

Getting Result

When running the request with valid parameters, you should receive a response similar to this:

{
    "id": "176883685",
    "url": "https://syrup.keboola.com/queue/job/176883685",
    "status": "waiting"
}

This means an asynchronous job which will prepare the sandbox has been created. If curious, view the job progress under Jobs in KBC:

Job progress screenshot

The job will usually execute very quickly, so you might as well go straight to the Files section of Storage in KBC. There you will find a data.zip file with a sample data folder. You can now use this folder with your Docker extension or Custom Science extension.
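When scripting, you can also poll the job URL returned by the sandbox call instead of watching the UI. A minimal sketch follows; the response above only shows the "waiting" status, so treating "success" and "error" as the terminal statuses is an assumption:

```python
import json
import time
import urllib.request

def job_finished(job):
    """Assumption: 'success' and 'error' are the terminal job statuses."""
    return job.get("status") in ("success", "error")

def poll_job(job_url, token, interval=2.0, timeout=60.0):
    """Poll the job detail URL (from the sandbox response) until the job finishes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(job_url, headers={"X-StorageAPI-Token": token})
        with urllib.request.urlopen(req) as resp:
            job = json.load(resp)
        if job_finished(job):
            return job
        time.sleep(interval)
    raise TimeoutError(f"Job at {job_url} did not finish within {timeout} s")

# Usage (not run here):
# job = poll_job("https://syrup.keboola.com/queue/job/176883685", "YOUR-STORAGE-API-TOKEN")
```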

Input Data API Call

The input API call differs in that it must be used with an existing component; it requires the componentId obtained during component registration. The call can be used both with existing configurations and with ad-hoc configurations (as in the sandbox request above).

Prepare

We assume you have the same in.c-main.sample source table as in the previous request. You can then create a configuration for our sample component keboola.docs-docker-example-parameters by visiting the following URL:

https://connection.keboola.com/admin/projects/{projectId}/applications/keboola.docs-docker-example-parameters

Where you replace {projectId} with the ID of your project in KBC (you can find it in the URL). Then create the configuration. The equivalent of what we used in the Sandbox call above looks as follows:

Configuration screenshot

Run the API Request

When you created the configuration, it was assigned a configuration ID (328831433 in our example). You can reference this ID instead of crafting the whole configData request body manually.

You can see an introduction sample request in our collection of requests.

The following is the request body:

{
    "config": "328831433"
}

Where you need to replace 328831433 with your own configuration ID. The request URL is as follows:

https://syrup.keboola.com/docker/keboola.docs-docker-example-parameters/input/

Where keboola.docs-docker-example-parameters is the component ID (you can replace that with your own component if you like). Again, do not forget to enter your Storage API token into the X-StorageAPI-Token header.

As with the sandbox call, running the API call creates a job which will execute and produce a data.zip file in the Files section of Storage.
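The input call can be scripted the same way as the sandbox call. Again a minimal sketch with a made-up helper name; the URL pattern, body, and header are the ones described above:

```python
import json
import urllib.request

def build_input_request(component_id, config_id, token):
    """Hypothetical helper: assemble the Input Data request for a registered component."""
    url = f"https://syrup.keboola.com/docker/{component_id}/input/"
    return urllib.request.Request(
        url,
        # Referencing an existing configuration by its ID:
        data=json.dumps({"config": config_id}).encode("utf-8"),
        headers={
            "X-StorageAPI-Token": token,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_input_request(
    "keboola.docs-docker-example-parameters", "328831433", "YOUR-STORAGE-API-TOKEN"
)
# Send with: urllib.request.urlopen(req)
```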

Important: If you actually want to run the above 328831433 configuration, you also need to set the output mapping from destination.csv to a table.

Summary

  • For unregistered components, use the sandbox call:
    • The whole configuration must be passed as the body (configData node) of the API call in JSON format.
    • The source data is limited to 50 rows.
  • For registered components, use the input data call:
    • The configuration can be either passed as the body (configData node), or it can refer to an existing configuration (config node).
    • The source data is exported in full, which can lead to large data folders!
  • Both the sandbox and input calls create a job (executed automatically) which produces a data.zip file in your Storage File Uploads:
    • The data.zip archive can be extracted and mapped to your dockerized application.
    • The data.zip contains the input tables and files, their manifests, and the configuration file.
    • The data.zip does not contain any data in the out folder; your application has to produce it.
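Unpacking the archive into a local data folder can also be scripted. A sketch under the assumptions stated in the summary: `extract_data_folder` is a made-up helper, and pre-creating out/tables and out/files assumes your application expects those subfolders to exist before it writes its results:

```python
import json
import tempfile
import zipfile
from pathlib import Path

def extract_data_folder(archive, dest):
    """Unpack data.zip into a local folder that can be mapped into the container."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # data.zip ships no out data; create the folders the application will write into.
    (dest / "out" / "tables").mkdir(parents=True, exist_ok=True)
    (dest / "out" / "files").mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in dest.iterdir())

# Demo with a stand-in archive (a real data.zip comes from your Storage File Uploads):
tmp = Path(tempfile.mkdtemp())
demo = tmp / "data.zip"
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("config.json", json.dumps({"parameters": {"multiplier": 4}}))
entries = extract_data_folder(demo, tmp / "data")
```

The resulting folder can then be supplied to your application as its data folder.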