We recommend using the Rocker version-stable images. The R base image does not keep older R versions, so upgrades are not under your control. If you want the same environment as in transformations, use our image.
We recommend that you follow the guidelines for the R transformation. The standard R functions for CSV files work without problems.
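As a minimal sketch of reading and writing CSV files with the standard functions, the example below uses only base R. The file names and the temporary directory are placeholders chosen for illustration, not paths prescribed by the platform.

```r
# Create a small input file in a temporary directory so the sketch is self-contained.
work_dir <- tempfile("csv_example")
dir.create(work_dir)
in_file <- file.path(work_dir, "source.csv")
writeLines(c("id,value", "1,10", "2,20"), in_file)

# Read the input table with the standard CSV reader.
data <- read.csv(in_file, stringsAsFactors = FALSE)

# Do some work on the data.
data$double_value <- data$value * 2

# Write the result; row.names = FALSE keeps R's row index out of the file.
out_file <- file.path(work_dir, "result.csv")
write.csv(data, file = out_file, row.names = FALSE)
```

Passing row.names = FALSE matters here: without it, write.csv adds an extra unnamed column with the row index.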
You can also use the write_csv function from the readr package, which is faster.
KBC’s R component package provides functions to
The library is a standard R package that is available by default in the production environment. It is published on GitHub and can be installed locally with devtools::install_github('keboola/r-docker-application', ref = 'master').
Use the library to read a user-supplied configuration parameter, myParameter.
The library contains a single RC class, DockerApplication; the parameter of the constructor is the path to the data directory. After creating the object, call readConfig() to actually read and parse the configuration file, and then read the myParameter parameter from the user-supplied configuration.
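The steps above can be sketched as follows. This is an illustration under assumptions, not an authoritative listing: the temporary data directory and the sample config.json are made up for the example, and getParameters() is the accessor assumed here for the user-supplied parameters, so verify it against the package's inline help. The library calls are skipped if the package is not installed.

```r
# Prepare a throwaway data directory with a minimal configuration file.
# (Sample content invented for this sketch.)
data_dir <- tempfile("data")
dir.create(data_dir)
writeLines(
  '{"parameters": {"myParameter": "hello"}}',
  file.path(data_dir, "config.json")
)

my_parameter <- NULL
if (requireNamespace("keboola.r.docker.application", quietly = TRUE)) {
  # The constructor takes the path to the data directory.
  app <- keboola.r.docker.application::DockerApplication$new(data_dir)

  # Read and parse the configuration file.
  app$readConfig()

  # Assumed accessor: returns the user-supplied parameters as a list.
  my_parameter <- app$getParameters()$myParameter
}
```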
When the application is initialized with app <- keboola.r.docker.application::DockerApplication$new(), it reads the configuration file from the directory given in the constructor argument. If no argument is provided, the KBC_DATADIR environment variable is used.
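The fallback order can be illustrated with a small base R helper. The function resolve_data_dir is hypothetical and written only for this sketch; it mirrors the described behavior (explicit argument first, then KBC_DATADIR) and is not the library's actual implementation.

```r
# Hypothetical helper: pick the data directory from an explicit argument,
# falling back to the KBC_DATADIR environment variable.
resolve_data_dir <- function(arg = NULL) {
  if (!is.null(arg) && nzchar(arg)) {
    return(arg)
  }
  env_dir <- Sys.getenv("KBC_DATADIR", unset = "")
  if (nzchar(env_dir)) {
    return(env_dir)
  }
  stop("No data directory given and KBC_DATADIR is not set.")
}

# Usage: with the variable set, a call without arguments uses it,
# while an explicit argument still takes precedence.
Sys.setenv(KBC_DATADIR = "/data/")
from_env <- resolve_data_dir()
from_arg <- resolve_data_dir("/tmp/my-data/")
```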
You can obtain inline help and the list of library functions by running the
In our tutorial, we show components that have the names of their input/output tables hard-coded. The following example shows how to read the input and output mapping specified by the end user, which is accessible in the configuration file. It demonstrates how to read and write tables and table manifests. File manifests are handled the same way. For a full authoritative list of items returned in the table list and manifest contents, see the specification.
Note that the destination label in the script refers to the destination from the mapper perspective. The input mapper takes source tables from the user's storage and produces destination tables that become the input of the component. The output tables of the component are consumed by the output mapper, whose destinations are the resulting tables in Storage.
To test the code, set an arbitrary number of input/output mapping tables. Make sure to set the same number of inputs and outputs. The names of the CSV files are arbitrary.
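The pairing of inputs and outputs can be sketched in base R as shown below. In a real component, the two listings would come from the library; here, hand-built data frames stand in for them. The column names destination and full_path, the sample table names, and the processed_ prefix are assumptions made for this illustration. Manifests are left out of the sketch.

```r
work_dir <- tempfile("mapping_example")
dir.create(work_dir)

# Hypothetical stand-ins for the input and output table listings.
input_tables <- data.frame(
  destination = c("first.csv", "second.csv"),
  full_path = file.path(work_dir, c("first.csv", "second.csv")),
  stringsAsFactors = FALSE
)
output_tables <- data.frame(
  destination = c("out.c-main.first", "out.c-main.second"),
  full_path = file.path(work_dir, c("first_out.csv", "second_out.csv")),
  stringsAsFactors = FALSE
)

# Create sample input files so the sketch can run on its own.
for (path in input_tables$full_path) {
  write.csv(data.frame(id = 1:3, value = c(10, 20, 30)), path, row.names = FALSE)
}

# The i-th input is written to the i-th output, so the counts must match.
stopifnot(nrow(input_tables) == nrow(output_tables))

for (i in seq_len(nrow(input_tables))) {
  # Read the input table from its full path.
  data <- read.csv(input_tables$full_path[i])

  # Placeholder processing step: prefix every column name.
  names(data) <- paste0("processed_", names(data))

  # Write the result where the output mapping expects it.
  write.csv(data, output_tables$full_path[i], row.names = FALSE)
  message("Wrote ", output_tables$destination[i])
}
```

Because the tables are matched by position, an unequal number of inputs and outputs would leave some table without a counterpart, which is why the sketch checks the counts before the loop.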
In R components, the outputs printed in rapid succession are sometimes joined into a single event; this is a known behavior of R and it has no workaround. See a dedicated article if you want to implement a GELF logger.