This tutorial guides you through the process of creating a simple Custom Science Application. The application logic is trivial: it takes a table with numbers as an input, and creates another table with an extra column containing those numbers multiplied by two. A test in KBC is included. The application is then extended to accept a parameter from the end-user.
You need a KBC project in which you can test your code.
In the root of your repository, create the main application file
main.R. (In Python Custom Science App, the analogous file would be called
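As a rough sketch of what main.R could contain, the doubling logic described in the introduction fits in a few lines of base R. The file locations (in/tables/source.csv and out/tables/result.csv under /data) and the column names number and double_number are taken from the extended script later in this tutorial. Wrapping the logic in a function called `run_app` with a `data_dir` argument is an assumption made here (both names are hypothetical) so that the same code can also be tried outside KBC.

```r
# Sketch of main.R: read the input table, add a doubled column, write the result.
# `run_app` and `data_dir` are illustrative names, not part of any Keboola API.
run_app <- function(data_dir = "/data") {
  in_file <- file.path(data_dir, "in", "tables", "source.csv")
  out_file <- file.path(data_dir, "out", "tables", "result.csv")

  # read input
  data <- read.csv(in_file)

  # do something: add a column with the numbers multiplied by two
  data["double_number"] <- data["number"] * 2

  # write output; create the output folder first in case it is missing (local runs)
  dir.create(dirname(out_file), recursive = TRUE, showWarnings = FALSE)
  write.csv(data, file = out_file, row.names = FALSE)
  invisible(data)
}

# Inside the container the script would end with a call such as:
# run_app("/data")
```

For a local trial, point `run_app` at any folder that contains in/tables/source.csv with a number column.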
Create a source table in Storage. The table needs a column named number, because that is the column the sample script reads.
The name of the table in Storage is not important; let’s name it in.c-main.custom-science-example. For instructions on how to create a table, go to the KBC Tutorial. The output bucket and table will be created automatically.
Go to Applications – New Application – Custom Science R and press Add configuration. In the new configuration, set the input and output mapping and the repository as explained below.
To test the application, use the in.c-main.custom-science-example sample table as input. Make sure to set the input mapping name to source.csv – that is what we expect in the sample script.
The same goes for output mapping: make sure to map from result.csv (the file written by the sample script) to whatever output table you want, for example out.c-main.custom-science-example.
Leave File input mapping empty.
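Because the script depends on the input mapping name, a mistyped name otherwise surfaces only as a generic file-not-found error from read.csv. The following is a minimal sketch of a check that fails early with a clearer message. `require_input` is a hypothetical helper, not part of any Keboola library, and the in/tables folder layout is assumed from the paths used in the sample script.

```r
# Fail early with a readable message when the expected input file is missing.
# `require_input` is a hypothetical helper, not part of any Keboola library.
require_input <- function(data_dir, name = "source.csv") {
  path <- file.path(data_dir, "in", "tables", name)
  if (!file.exists(path)) {
    stop(sprintf(
      "Input file '%s' not found; check that the input mapping name is '%s'.",
      path, name
    ), call. = FALSE)
  }
  path
}
```

The script could then read its input with `read.csv(require_input("/data"))`.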
Leave parameters empty for now. In the Runtime section, enter the configuration of the repository.
By running the above configuration, you should obtain the table out.c-main.custom-science-example, which contains the source data plus an extra column with each number multiplied by two.
You can pass the application an arbitrary set of parameters. As an example, we will extend the application from the previous tutorial by allowing the user to specify the multiplier.
```r
# initialize application
library('keboola.r.docker.application')
app <- DockerApplication$new('/data/')
app$readConfig()

# read input
data <- read.csv("/data/in/tables/source.csv")

# do something
data['double_number'] <- data['number'] * app$getParameters()$multiplier

# write output
write.csv(data, file = "/data/out/tables/result.csv", row.names = FALSE)
```
Commit the code and don’t forget to create a new tag in the repository.
Enter the configuration in the parameters field:
Enter the repository in the runtime section:
Note that the configuration format is arbitrary and there is no validation. Implement parameter validation in your script, otherwise the end-user may receive confusing error messages.
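A minimal sketch of such validation for the multiplier parameter could look as follows. `validate_multiplier` is an illustrative name, and `params` stands for the list returned by `app$getParameters()` in the script above. The exact checks and messages are assumptions, so adapt them to whatever your application accepts.

```r
# Validate the user-supplied parameters before using them.
# `validate_multiplier` is a hypothetical helper; `params` stands for the list
# returned by app$getParameters().
validate_multiplier <- function(params) {
  m <- params$multiplier
  if (is.null(m)) {
    stop("Missing parameter 'multiplier'.", call. = FALSE)
  }
  if (!is.numeric(m) || length(m) != 1 || is.na(m)) {
    stop("Parameter 'multiplier' must be a single number.", call. = FALSE)
  }
  m
}
```

In the script, the multiplication would then use `validate_multiplier(app$getParameters())` instead of reading `multiplier` directly, so a missing or non-numeric value stops the run with a message the end-user can act on.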
The following screenshot summarizes all the necessary end-user configuration:
In the above example, we used static input/output mapping, which means that the names of the CSV files are hard-coded in the application script. There are two potential problems with this:
Depending on your use case, this may or may not be a problem. If you want to use dynamic input mapping, consult the development guide. If your application is more complex, go to Docker extensions.
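To illustrate the difference from hard-coded file names, here is a rough sketch that processes every CSV file it finds in the input folder and writes a result under the same name. This is only one possible approach and not necessarily what the development guide describes. `process_all_tables` and its arguments are hypothetical names, and the folder layout is assumed from the paths used in the sample script.

```r
# Process every CSV file found in the input folder instead of one hard-coded name.
# `process_all_tables` and its arguments are hypothetical names used for illustration.
process_all_tables <- function(data_dir, multiplier = 2) {
  in_dir <- file.path(data_dir, "in", "tables")
  out_dir <- file.path(data_dir, "out", "tables")
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

  files <- list.files(in_dir, pattern = "\\.csv$")
  for (name in files) {
    data <- read.csv(file.path(in_dir, name))
    # skip tables that do not have the expected column
    if (!"number" %in% names(data)) next
    data["double_number"] <- data["number"] * multiplier
    write.csv(data, file = file.path(out_dir, name), row.names = FALSE)
  }
  invisible(files)
}
```

With this shape, the end-user can choose any input mapping names, at the cost of the script having to decide what to do with tables it does not recognize (here they are silently skipped).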