Keboola Table Storage (Tables) and Keboola File Storage (File Uploads) are closely connected.
Keboola File Storage is technically a layer on top of the Amazon S3 service, and Keboola Table
Storage is a layer on top of a database backend.
To upload a table, take the following steps:

1. Request a file upload from Keboola File Storage. You will be given a destination for the uploaded file on an S3 server.
2. Upload the file there. When the upload is finished, the data file will be available in the File Uploads section.
3. Initiate an asynchronous table import from the uploaded file (use it as the dataFileId parameter) into the destination table. The import is asynchronous, so the request only creates a job and you need to poll for its results.

The imported files must conform to the RFC 4180 specification.
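For example, a minimal RFC 4180 compliant file has an optional header row, CRLF line ends, and double-quoted fields wherever a value contains a comma, a quote, or a line break:

```
id,name,price
1,"Widget, large",10.00
2,"Widget ""special"" edition",5.50
```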
Exporting a table from Storage is analogous to importing it. First, data is asynchronously
exported from
Table Storage into File Uploads. Then you can request to download
the file, which will give you
access to an S3 server for the actual file download.
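The first step might look like this (a sketch, assuming an asynchronous table export endpoint analogous to the import call shown later; check the API documentation for the exact call and parameters):

```bash
curl --request POST \
  --header "X-StorageApi-Token: storage-token" \
  "https://connection.keboola.com/v2/storage/tables/in.c-main.new-table/export-async"
```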
Manually Uploading a File
To upload a file to Keboola File Storage, follow the instructions outlined in the
API documentation.
First create a file resource; to create a new file called
new-file.csv with 52 bytes, call:
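The call might look like this (a sketch: the connection.keboola.com endpoint and storage-token are placeholders, and federationToken=1 requests temporary AWS credentials usable with an S3 client; adjust the call to your project and token):

```bash
curl --request POST \
  --header "X-StorageApi-Token: storage-token" \
  --form "name=new-file.csv" \
  --form "sizeBytes=52" \
  --form "federationToken=1" \
  "https://connection.keboola.com/v2/storage/files/prepare"
```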
This will return a response similar to the following:
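(Abbreviated and with placeholder values; only the parts discussed below are shown.)

```json
{
  "id": 192726698,
  "name": "new-file.csv",
  "uploadParams": {
    "key": "exp-30/.../192726698.new-file.csv",
    "bucket": "kbc-sapi-files",
    "acl": "private",
    "credentials": {
      "AccessKeyId": "...",
      "SecretAccessKey": "...",
      "SessionToken": "...",
      "Expiration": "..."
    }
  }
}
```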
The important parts are the id of the file (which will be needed later), the uploadParams.credentials node, which gives you AWS S3 credentials for uploading your file, and the key and bucket nodes, which define the target S3 destination as s3://bucket/key.
To upload the file to S3, you need an S3 client. There are many clients available; for example, you can use the
AWS S3 command line client.
Before using it, pass the credentials
by executing, for instance, the following commands
on *nix systems:
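(Placeholder values; substitute the AccessKeyId, SecretAccessKey, and SessionToken from the uploadParams.credentials node of the response above.)

```bash
export AWS_ACCESS_KEY_ID=AccessKeyId-from-the-response
export AWS_SECRET_ACCESS_KEY=SecretAccessKey-from-the-response
export AWS_SESSION_TOKEN=SessionToken-from-the-response
```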
or on Windows:
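(The same placeholder values apply.)

```
SET AWS_ACCESS_KEY_ID=AccessKeyId-from-the-response
SET AWS_SECRET_ACCESS_KEY=SecretAccessKey-from-the-response
SET AWS_SESSION_TOKEN=SessionToken-from-the-response
```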
Then you can actually upload the new-file.csv file by executing the AWS S3 CLI cp command:
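For instance (the destination is a placeholder; use the s3://bucket/key from the prepare response):

```bash
aws s3 cp new-file.csv "s3://kbc-sapi-files/exp-30/.../192726698.new-file.csv"
```

When the upload finishes, initiate the table import. A minimal sketch, assuming the asynchronous import endpoint and an existing in.c-main.new-table destination table (check the API documentation for the exact call):

```bash
curl --request POST \
  --header "X-StorageApi-Token: storage-token" \
  --form "dataFileId=192726698" \
  "https://connection.keboola.com/v2/storage/tables/in.c-main.new-table/import-async"
```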
This will create an asynchronous job, importing data from the 192726698 file into the new-table destination table in the in.c-main bucket.
Then poll for the job results, or review its status in the UI.
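A polling call might look like this (a sketch; the job id and its URL are returned in the response to the import request):

```bash
curl --header "X-StorageApi-Token: storage-token" \
  "https://connection.keboola.com/v2/storage/jobs/{job_id}"
```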
Python Example
The above process is implemented in the following example script in Python. This script uses the
Requests library for sending HTTP requests and
the Boto 3 library for working with Amazon S3. Both libraries can be
installed using pip:
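```bash
pip install requests boto3
```

The script below is a minimal sketch of the whole workflow. The endpoint paths and response field names (uploadParams, credentials, dataFileId, the job url and status) follow the structure shown above; the token environment variable, file name, and destination table are assumptions you will want to adapt, and the current API documentation remains the authoritative reference.

```python
import os
import time

import boto3
import requests

STORAGE_API_URL = "https://connection.keboola.com/v2/storage"
# The STORAGE_API_TOKEN environment variable is an assumption of this sketch.
HEADERS = {"X-StorageApi-Token": os.environ["STORAGE_API_TOKEN"]}

file_path = "new-file.csv"                 # local file to upload
destination_table = "in.c-main.new-table"  # existing destination table

# Step 1: create a file resource (prepare the upload, requesting AWS credentials)
prepare = requests.post(
    STORAGE_API_URL + "/files/prepare",
    headers=HEADERS,
    data={
        "name": os.path.basename(file_path),
        "sizeBytes": os.path.getsize(file_path),
        "federationToken": 1,
    },
)
prepare.raise_for_status()
file_info = prepare.json()
upload_params = file_info["uploadParams"]

# Step 2: upload the file to the S3 destination using the temporary credentials
s3 = boto3.client(
    "s3",
    aws_access_key_id=upload_params["credentials"]["AccessKeyId"],
    aws_secret_access_key=upload_params["credentials"]["SecretAccessKey"],
    aws_session_token=upload_params["credentials"]["SessionToken"],
)
with open(file_path, "rb") as f:
    s3.put_object(
        Bucket=upload_params["bucket"],
        Key=upload_params["key"],
        ACL=upload_params["acl"],
        Body=f,
    )

# Step 3: start the asynchronous import of the uploaded file into the table
import_response = requests.post(
    STORAGE_API_URL + "/tables/" + destination_table + "/import-async",
    headers=HEADERS,
    data={"dataFileId": file_info["id"]},
)
import_response.raise_for_status()
job = import_response.json()

# Step 4: poll the job until it finishes
while job["status"] not in ("success", "error"):
    time.sleep(2)
    job = requests.get(job["url"], headers=HEADERS).json()

print("Import finished with status:", job["status"])
```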
Upload Files Using Storage API Importer
For a production setup, we recommend the approach outlined above with a direct upload to S3,
as it is more reliable and universal.
If you need to avoid using an S3 client, it is also possible to upload the
file with a simple HTTP request to the Storage API Importer service.
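A sketch of such a request, assuming the import.keboola.com endpoint of the Importer service (verify the exact URL and parameter names in its documentation):

```bash
curl --request POST \
  --header "X-StorageApi-Token: storage-token" \
  --form "tableId=in.c-main.new-table" \
  --form "data=@new-file.csv" \
  "https://import.keboola.com/write-table"
```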
Depending on the backend and table size, the data file may be sliced into chunks.
Requirements for uploading sliced files are described in the respective part of the
API documentation.
When you attempt to download a sliced file, you will instead obtain its manifest
listing the individual parts. Download the parts individually and join them
together. For a reference implementation of this process, see
our TableExporter class.
Important: When exporting a table through the Table — Export UI, the file will already
be merged and listed in the File Uploads section with the storage-merged-export tag.
If you want to download a sliced file, first obtain credentials for downloading it
from AWS S3. Assuming that the file ID is 192611596, for example, call:
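(A sketch; storage-token is a placeholder, and federationToken=1 requests temporary AWS credentials for the download.)

```bash
curl --header "X-StorageApi-Token: storage-token" \
  "https://connection.keboola.com/v2/storage/files/192611596?federationToken=1"
```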
This will return a response similar to the following:
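(Abbreviated and with placeholder values; only the parts discussed below are shown.)

```json
{
  "id": 192611596,
  "name": "in.c-main.new-table.csv",
  "isSliced": true,
  "url": "https://kbc-sapi-files.s3.amazonaws.com/exp-2/.../192611596.csv.manifest?...",
  "credentials": {
    "AccessKeyId": "...",
    "SecretAccessKey": "...",
    "SessionToken": "...",
    "Expiration": "..."
  }
}
```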
The field url contains the URL to the file manifest. Upon downloading it, you will get a JSON file with contents
similar to this:
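(Illustrative contents; the slice URLs will point to your project's location in S3.)

```json
{
  "entries": [
    {"url": "s3://kbc-sapi-files/exp-2/.../192611596.csv.part_0"},
    {"url": "s3://kbc-sapi-files/exp-2/.../192611596.csv.part_1"}
  ]
}
```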
Now you can download the actual data file slices. URLs are provided in the manifest file, and credentials to them
are returned as part of the previous file info call. To download the files from S3, you need an S3 client.
There are many clients available; for example, you can use the
AWS S3 command line client. Before
using it, pass the credentials
by executing, for instance, the following commands
on *nix systems:
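(Placeholder values; substitute the AccessKeyId, SecretAccessKey, and SessionToken from the credentials node of the file info response.)

```bash
export AWS_ACCESS_KEY_ID=AccessKeyId-from-the-response
export AWS_SECRET_ACCESS_KEY=SecretAccessKey-from-the-response
export AWS_SESSION_TOKEN=SessionToken-from-the-response
```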
or on Windows:
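(The same placeholder values apply.)

```
SET AWS_ACCESS_KEY_ID=AccessKeyId-from-the-response
SET AWS_SECRET_ACCESS_KEY=SecretAccessKey-from-the-response
SET AWS_SESSION_TOKEN=SessionToken-from-the-response
```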
Then you can actually download the files by executing the AWS S3 CLI cp command:
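For instance, with the slice URLs taken from the manifest above (placeholders shown, local file names are hypothetical):

```bash
aws s3 cp "s3://kbc-sapi-files/exp-2/.../192611596.csv.part_0" slice_0.csv
aws s3 cp "s3://kbc-sapi-files/exp-2/.../192611596.csv.part_1" slice_1.csv
```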
After that, merge the files together by executing the following commands:
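For example, on *nix systems (using the hypothetical slice names from the previous step):

```bash
cat slice_0.csv slice_1.csv > merged.csv
```

or on Windows:

```
copy /b slice_0.csv+slice_1.csv merged.csv
```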