This tutorial shows how to work with artifacts. The examples use the Python Transformation component to produce and consume artifacts, but the same principles apply to any component.
In the examples, we use the curl console tool to interact with our APIs.
Note: The artifacts feature needs to be enabled in your project. Please contact support@keboola.com to enable it.
Note 2: For now, an artifacts configuration can be created or edited only via the Configuration API.
Each example needs a Storage API token to make the API calls.
Store the token and the API URL in environment variables:
export STORAGE_API_HOST="https://connection.keboola.com"
export TOKEN="..."
This is a very simple example. We will create a Python Transformation that writes a file to the artifacts “upload” folder. This file will then be uploaded to File Storage as an “artifact”.
import os
with open("/data/artifacts/out/current/myartifact1", "w") as file:
    file.write("this is my artifact file content")
Run the transformation. It should upload the file to File Storage as an “artifact”.
The file is now visible in File Storage with the appropriate tags.
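The upload step from this example can also be wrapped in a small helper. This is a minimal sketch: the function name `write_artifact` and its `base_dir` parameter are illustrative assumptions, not part of any Keboola API. Only the `/data/artifacts/out/current` folder comes from the example itself. Creating the folder with `os.makedirs` is a defensive assumption so the snippet can also be tried outside a transformation.

```python
import os

# Folder used in the example above; files written here are uploaded as artifacts.
ARTIFACTS_OUT_CURRENT = "/data/artifacts/out/current"


def write_artifact(name, content, base_dir=ARTIFACTS_OUT_CURRENT):
    """Write `content` to `base_dir/name` and return the full path.

    Hypothetical helper: `base_dir` is a parameter only so the function
    can be exercised locally against a temporary folder.
    """
    # Assumption: the folder may not exist when running outside the platform.
    os.makedirs(base_dir, exist_ok=True)
    path = os.path.join(base_dir, name)
    with open(path, "w") as file:
        file.write(content)
    return path
```

Inside the transformation, the call would be `write_artifact("myartifact1", "this is my artifact file content")`.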
To consume (download) artifacts in a component, we need to enable and configure artifact download in the component's configuration.
We will create another configuration of the Python Transformation via the API.
The artifacts part of the configuration looks like this. It enables download of artifacts of type runs with limit 5, which means it downloads the artifacts created by the last 5 runs of the same component configuration:
{
  "artifacts": {
    "runs": {
      "enabled": true,
      "filter": {
        "limit": 5
      }
    }
  }
}
The transformation script looks as follows. It prints the files found under /data/artifacts/in/runs/*/* to the output; these are the downloaded artifact files. The script also generates a new artifact and writes it to /data/artifacts/out/current/myartifact1, as in the previous example.
```python
import os
import glob
# Download
print(glob.glob("/data/artifacts/in/runs/*/*"))
# Upload
with open("/data/artifacts/out/current/myartifact1", "w") as file:
    file.write("value1")
```

Run this curl command to create the configuration:
```shell
curl -X POST "$STORAGE_API_HOST/v2/storage/branch/default/components/keboola.python-transformation-v2/configs" \
-H "X-StorageApi-Token: $TOKEN" \
-H 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'configuration={"parameters":{"blocks":[{"name":"Block 1","codes":[{"name":"artifacts","script":["import os\nimport glob\n\n# Download\nprint(glob.glob(\"/data/artifacts/in/runs/*/*\")) \n\n# Upload\nwith open(\"/data/artifacts/out/current/myartifact1\", \"w\") as file:\n file.write(\"value1\")"]}]}]},"artifacts":{"runs":{"enabled":true,"filter":{"limit":5}}}}' \
--data-urlencode 'name=Artifacts upload & download' \
--data-urlencode 'description=Test Artifacts upload & download'
```
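For readers who prefer to avoid hand-escaping JSON inside a shell string, the same request can be assembled in Python. The sketch below only builds the request object and does not send it. The helper name `build_create_config_request` is hypothetical; the URL, headers, and form fields are copied from the curl command above, and the environment variable names are the ones exported at the start of this tutorial.

```python
import json
import os
import urllib.parse
import urllib.request

# The same script as in the curl command, written with four-space indentation.
SCRIPT = (
    "import os\nimport glob\n\n"
    "# Download\n"
    'print(glob.glob("/data/artifacts/in/runs/*/*"))\n\n'
    "# Upload\n"
    'with open("/data/artifacts/out/current/myartifact1", "w") as file:\n'
    '    file.write("value1")'
)


def build_create_config_request(host, token, component_id, name, description, configuration):
    """Build (but do not send) the POST request issued by the curl command."""
    url = f"{host}/v2/storage/branch/default/components/{component_id}/configs"
    # urlencode takes care of escaping, including the "&" in the name.
    body = urllib.parse.urlencode({
        "configuration": json.dumps(configuration),
        "name": name,
        "description": description,
    }).encode("utf-8")
    headers = {
        "X-StorageApi-Token": token,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


configuration = {
    "parameters": {
        "blocks": [
            {"name": "Block 1", "codes": [{"name": "artifacts", "script": [SCRIPT]}]}
        ]
    },
    "artifacts": {"runs": {"enabled": True, "filter": {"limit": 5}}},
}

request = build_create_config_request(
    host=os.environ.get("STORAGE_API_HOST", "https://connection.keboola.com"),
    token=os.environ.get("TOKEN", ""),
    component_id="keboola.python-transformation-v2",
    name="Artifacts upload & download",
    description="Test Artifacts upload & download",
    configuration=configuration,
)
# To actually send it, call urllib.request.urlopen(request).
```

Building the payload with `json.dumps` keeps the script readable and leaves the escaping to the library.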
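As in the previous example, we will create a configuration of the Python Transformation component. This time, however, it will download the artifacts produced by the configuration from Example 2. Store the ID of that configuration in an environment variable, then create the new configuration: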
export CONFIG_ID="..."
curl -X POST "$STORAGE_API_HOST/v2/storage/branch/default/components/keboola.python-transformation-v2/configs" \
-H "X-StorageApi-Token: $TOKEN" \
-H 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'configuration={"parameters":{"blocks":[{"name":"Block 1","codes":[{"name":"artifacts","script":["import os\nimport glob\n\n# Download\nprint(glob.glob(\"/data/artifacts/in/custom/*/*\"))"]}]}]},"artifacts":{"custom":{"enabled":true,"component_id":"keboola.python-transformation","config_id":"'"$CONFIG_ID"'","branch_id":"default","filter":{"limit":5}}}}' \
--data-urlencode 'name=Artifacts upload & download' \
--data-urlencode 'description=Test Artifacts upload & download'
The whole configuration now looks like this:
{
  "parameters": {
    "blocks": [
      {
        "name": "Block 1",
        "codes": [
          {
            "name": "artifacts",
            "script": [
              "import os\nimport glob\n\n# Download\nprint(glob.glob(\"/data/artifacts/in/custom/*/*\"))"
            ]
          }
        ]
      }
    ]
  },
  "artifacts": {
    "custom": {
      "enabled": true,
      "component_id": "keboola.python-transformation",
      "config_id": "$CONFIG_ID",
      "branch_id": "default",
      "filter": {
        "limit": 5
      }
    }
  }
}
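The `artifacts` section above can also be generated in Python, so that the configuration ID is substituted into the JSON by code rather than by shell quoting. This is a sketch under stated assumptions: `custom_artifacts_config` is a hypothetical helper name, the keys and default values are copied from the configuration shown above, and the `"..."` fallback mirrors the placeholder used for `CONFIG_ID` in this tutorial.

```python
import json
import os


def custom_artifacts_config(component_id, config_id, branch_id="default", limit=5):
    """Return the "artifacts" section for downloading another configuration's artifacts."""
    return {
        "custom": {
            "enabled": True,
            "component_id": component_id,
            "config_id": config_id,
            "branch_id": branch_id,
            "filter": {"limit": limit},
        }
    }


# CONFIG_ID is read from the environment; "..." is only a placeholder fallback.
config_id = os.environ.get("CONFIG_ID", "...")
artifacts = custom_artifacts_config("keboola.python-transformation", config_id)
print(json.dumps({"artifacts": artifacts}, indent=2))
```

The printed JSON can be merged with the `parameters` section and passed as the `configuration` form field.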
This example shows how to share artifacts within an orchestration. We will create two configurations of the Python Transformation component: one produces a shared artifact and the other consumes it. Both configurations need to be in the same orchestration, and the producing configuration needs to be in a phase that precedes the consuming one.
Create the “Producer” configuration. The Python code writes a file into the shared folder:
import os
path = "/data/artifacts/out/shared"
with open(path + "/myartifact3", "w") as file:
    file.write("value1")
Run this curl command to create the configuration:
curl -X POST "$STORAGE_API_HOST/v2/storage/branch/default/components/keboola.python-transformation-v2/configs" \
-H "X-StorageApi-Token: $TOKEN" \
-H 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'configuration={"parameters":{"blocks":[{"name":"Block 1","codes":[{"name":"Upload shared","script":["import os\npath = \"/data/artifacts/out/shared\"\nwith open(path+\"/myartifact3\", \"w\") as file:\n file.write(\"value1\")"]}]}]},"artifacts":{"runs":{"enabled":true,"filter":{"limit":5}}}}' \
--data-urlencode 'name=Artifacts shared Producer' \
--data-urlencode 'description=Artifacts upload shared'
Create “Consumer” configuration
The artifacts configuration:
{
  "artifacts": {
    "shared": {
      "enabled": true
    }
  }
}
The Python script:
import os
import glob
print(glob.glob("/data/artifacts/in/shared/*/*"))
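The consumer script above only lists the downloaded paths. A sketch that also reads the file contents might look like the following. It assumes the artifacts are small text files, and `read_artifacts` is a hypothetical helper name; the `/data/artifacts/in/shared/*/*` pattern is the one used in the script above. The `base_dir` parameter exists so the same function can be pointed at the `runs` or `custom` folders, or at a local test folder.

```python
import glob
import os


def read_artifacts(base_dir="/data/artifacts/in/shared"):
    """Return {relative_path: text} for every file matching base_dir/*/*."""
    contents = {}
    for path in sorted(glob.glob(os.path.join(base_dir, "*", "*"))):
        # Skip anything that is not a regular file (for example nested folders).
        if os.path.isfile(path):
            with open(path) as file:
                contents[os.path.relpath(path, base_dir)] = file.read()
    return contents


# Prints nothing when the folder does not exist or is empty.
for name, text in read_artifacts().items():
    print(name, "->", text)
```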
Run this curl command to create the configuration:
curl -X POST "$STORAGE_API_HOST/v2/storage/branch/default/components/keboola.python-transformation-v2/configs" \
-H "X-StorageApi-Token: $TOKEN" \
-H 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'configuration={"parameters":{"blocks":[{"name":"Block 1","codes":[{"name":"Download shared","script":["import os\nimport glob\n\nprint(glob.glob(\"/data/artifacts/in/shared/*/*\")) "]}]}]},"artifacts":{"shared":{"enabled":true}}}' \
--data-urlencode 'name=Artifacts shared Consumer' \
--data-urlencode 'description=Artifacts download shared'
Now put both configurations into an orchestration: “Artifacts shared Producer” into phase 1 and “Artifacts shared Consumer” into phase 2.