Native Types provide a structured way for components to define their handling of data types, referred to as “Native Types.”
The level of type handling is specified by the dataTypeSupport property, which can take one of three values:
This design overcomes limitations in current settings, where all components automatically produce typed tables when a project switches to Native Types. For instance, Data Sources that output unreliable type hints (e.g., an int column containing values like N/A) can now explicitly signal their limitations, reducing downstream issues.
An output table manifest sets options for transferring a table to Storage. The following examples list available
manifest fields; all of them are optional. The destination
field overrides the table name generated
from the file name; it can (and commonly is) overridden by the end-user configuration.
The table_metadata
fields allow you to set
Metadata for the table.
The table_metadata
field corresponds to the Table Metadata API call.
The key
and value
of the object are passed directly to the API; the provider
value is
filled by the Id of the running component (e.g., keboola.ex-db-snowflake
).
Additionally, the following options will cause the specified rows to be deleted from the source table before the new table is imported. See an example. Using this option makes sense only with incremental loads.
The schema
[optional] field allow you to create a table with Native Data Types columns.
Each object in the schema
array represents one column:
name
[required] field specifies the column name.data_type
[optional] field defines the data type for different storage backends, referred to as “Native Types”.
base
[required] type is always required and can have values specified in the Base Types documentation.nullable
[optional] field indicates if the column can be null.primary_key
[optional] field specifies if the column is a primary key.description
[optional] field provides a description of the column.metadata
[optional] field allows setting additional metadata for the column.Source data types are mapped to a destination using a base type. The current base types are
STRING
, INTEGER
, NUMERIC
, FLOAT
, BOOLEAN
,
DATE
, and TIMESTAMP
. This means that, for example, a MySQL extractor
may store the value BIGINT
as a type of column; that type maps to the INTEGER
base type. When the Snowflake writer consumes this value, it will
read the base type INTEGER
and choose a corresponding type for Snowflake, which happens to be also INTEGER
.
This ensures high interoperability between the components. Please take a look at the conversion table below.
View the extracted data types in the storage tables detail:
You can also override the data type:
When you use the table (e.g., in the Snowflake writer), you’ll see the data type you have configured:
The data is converted only when writing or copying (e.g., to a transformation or a writer). That means that you can extract an integer column, mark it as a timestamp in storage and write it as an integer into a target database (though you’ll be offered to write it as a timestamp).
You access both the source and base data type through the corresponding API.
Nullable conversion, which transforms an empty string originating from data into a null value, refers to the process where a textual value consisting solely of an empty string ""
is replaced with the value null.
As described above, the source data type is converted to a base data type stored in metadata storage. The base type is then converted to the target data type. The following tables show mappings for each base type. The mapping
causes possible information loss (e.g., assigning SMALLINT
to INTEGER
). To minimize this, we also keep track of the data type
size and transfer that if possible. For example, a SMALLINT
column would be stored as base type INTEGER
with size 2
. If the target database supports integer sizes, you will be offered to set the type in the target database as INTEGER(2)
.
Base type STRING
represents any textual type; both CHARACTER VARYING
(or VARCHAR
) and TEXT
types are included.
Also, the string base type is used for any other unrecognized type on input. It means that the
source type column is not an exhaustive list in the following table. It’s a list of suitable string types converted to a string. All
other unknown types are converted to a string as well.
Source | Source Type | Base Type | Target Type | Target |
---|---|---|---|---|
Generic | char | STRING | N/A | |
character varying | ||||
text | ||||
varchar | ||||
STRING | Hive | |||
STRING | Impala | |||
TEXT | MS SQL Server | |||
MySQL | CHAR | VARCHAR | MySQL | |
TEXT | ||||
VARCHAR | ||||
VARCHAR2 | Oracle | |||
VARCHAR | PostgreSQL | |||
Redshift | BPCHAR | VARCHAR | Redshift | |
CHAR | ||||
CHARACTER | ||||
CHARACTER VARYING | ||||
NCHAR | ||||
NVARCHAR | ||||
TEXT | ||||
VARCHAR | ||||
VARCHAR | SiSense | |||
Snowflake | BINARY | VARCHAR | Snowflake | |
CHAR | ||||
CHARACTER | ||||
STRING | ||||
TEXT | ||||
VARBINARY | ||||
VARCHAR | ||||
Synapse | BINARY | NVARCHAR | Synapse | |
CHAR | ||||
NCHAR | ||||
NVARCHAR | ||||
VARBINARY | ||||
VARCHAR | ||||
VARCHAR | Thoughtspot | |||
Source | Source Type | Base Type | Target Type | Target |
The INTEGER
base type represents data types for whole numbers.
Source | Source Type | Base Type | Target Type | Target |
---|---|---|---|---|
Generic | bigint | INTEGER | ||
bigserial | ||||
mediumint | ||||
smallint | ||||
int | ||||
int2 | ||||
int4 | ||||
int64 | ||||
int8 | ||||
integer | ||||
serial8 | ||||
tinyint | ||||
INT | Hive | |||
INT | Impala | |||
BIGINT | MS SQL | |||
MySQL | BIGINT | INTEGER | MySQL | |
INT | ||||
INTEGER | ||||
MEDIUMINT | ||||
SMALLINT | ||||
TINYINT | ||||
N/A | Oracle | |||
INTEGER | Postgres | |||
Redshift | BIGINT | INTEGER | Redshift | |
INT | ||||
INT2 | ||||
INT4 | ||||
INT8 | ||||
INTEGER | ||||
SMALLINT | ||||
BIGINT | SiSense | |||
Snowflake | BIGINT | INTEGER | Snowflake | |
BYTEINT | ||||
INT | ||||
INTEGER | ||||
SMALLINT | ||||
TINYINT | ||||
Synapse | BIGINT | INT | Synapse | |
INT | ||||
SMALLINT | ||||
TINYINT | ||||
INT | Thoughtspot | |||
Source | Source Type | Base Type | Target Type | Target |
The NUMERIC
base type represents fixed-point fractional numbers
(real
, numeric
or decimal
data types).
Source | Source Type | Base Type | Target Type | Target |
---|---|---|---|---|
Generic | dec | NUMERIC | ||
decimal | ||||
fixed | ||||
money | ||||
number | ||||
numeric | ||||
smallmoney | ||||
DECIMAL | Hive | |||
DECIMAL | Impala | |||
DECIMAL | MS SQL Server | |||
MySQL | DEC | NUMERIC | MySQL | |
DECIMAL | ||||
FIXED | ||||
NUMERIC | ||||
NUMBER | Oracle | |||
NUMERIC | PostgreSQL | |||
Redshift | DECIMAL | NUMERIC | Redshift | |
NUMERIC | ||||
DECIMAL | SiSense | |||
Snowflake | DECIMAL | NUMBER | Snowflake | |
NUMBER | ||||
NUMERIC | ||||
Synapse | NUMERIC | NUMERIC | Synapse | |
DECIMAL | ||||
N/A | Thoughtspot | |||
Source | Source Type | Base Type | Target Type | Target |
The FLOAT
base type represents floating-point fractional numbers
(float
or double
data types).
Source | Source Type | Base Type | Target Type | Target |
---|---|---|---|---|
Generic | binary_double | FLOAT | ||
binary_float | ||||
double | ||||
double precision | ||||
d_float | ||||
float | ||||
float4 | ||||
float8 | ||||
quad | ||||
real | ||||
FLOAT | Hive | |||
FLOAT | Impala | |||
FLOAT | MS SQL Server | |||
MySQL | DOUBLE | FLOAT | MySQL | |
DOUBLE PRECISION | ||||
FLOAT | ||||
REAL | ||||
N/A | Oracle | |||
REAL | PostgreSQL | |||
Redshift | DOUBLE PRECISION | FLOAT | Redshift | |
FLOAT | ||||
FLOAT4 | ||||
FLOAT8 | ||||
REAL | ||||
FLOAT | SiSense | |||
Snowflake | FLOAT | DOUBLE | Snowflake | |
DOUBLE PRECISION | ||||
FLOAT | ||||
FLOAT4 | ||||
FLOAT8 | ||||
REAL | ||||
Synapse | FLOAT | FLOAT | Synapse | |
REAL | ||||
FLOAT | Thoughtspot | |||
Source | Source Type | Base Type | Target Type | Target |
The BOOLEAN
base type represents a true or false value.
Source | Source Type | Base Type | Target Type | Target |
---|---|---|---|---|
Generic | bool | BOOLEAN | ||
boolean | ||||
BOOLEAN | Hive | |||
BOOLEAN | Impala | |||
BIT | MS SQL Server | |||
N/A | MySQL | |||
N/A | Oracle | |||
BOOLEAN | PostgreSQL | |||
Redshift | BOOL | BOOLEAN | Redshift | |
BOOLEAN | ||||
BIT | SiSense | |||
Snowflake | BOOLEAN | BOOLEAN | Snowflake | |
Synapse | BIT | BIT | Synapse | |
BOOL | Thoughtspot | |||
Source | Source Type | Base Type | Target Type | Target |
The DATE
base type represents a date value without a time portion.
Source | Source Type | Base Type | Target Type | Target | |
---|---|---|---|---|---|
Generic | date | DATE | DATE | ||
N/A | Hive | ||||
N/A | Impala | ||||
DATE | MS SQL Server | ||||
MySQL | DATE | DATE | MySQL | ||
DATE | Oracle | ||||
DATE | PostgreSQL | ||||
Redshift | DATE | DATE | Redshift | ||
DATE | SiSense | ||||
Snowflake | DATE | DATE | Snowflake | ||
Synapse | DATE | DATE | Synapse | ||
DATE | Thoughtspot | ||||
Source | Source Type | Base Type | Target Type | Target |
The TIMESTAMP
base type represents a date value with a time portion.
Source | Source Type | Base Type | Target Type | Target |
---|---|---|---|---|
Generic | datetime | TIMESTAMP | ||
datetime2 | ||||
datetimeoffset | ||||
smalldatetime | ||||
timestamp | ||||
timestamptz | ||||
timestamp_LTZ | ||||
timestamp_NTZ | ||||
TIMESTAMP_TZ | ||||
timestamp with local time zone | ||||
timestamp with time zone | ||||
timestamp without time zone | ||||
TIMESTAMP | Hive | |||
TIMESTAMP | Impala | |||
DATETIME2 | MS SQL Server | |||
MySQL | DATETIME | TIMESTAMP | MySQL | |
TIMESTAMP | ||||
TIMESTAMP | Oracle | |||
TIMESTAMP | PostgreSQL | |||
Redshift | TIMESTAMP | TIMESTAMP | Redshift | |
TIMESTAMPTZ | ||||
TIMESTAMP WITH TIME ZONE | ||||
TIMESTAMP WITHOUT TIME ZONE | ||||
N/A | SiSense | |||
Snowflake | DATETIME | TIMESTAMP | Snowflake | |
TIMESTAMP | ||||
TIMESTAMP_NTZ | ||||
TIMESTAMP_LTZ | ||||
TIMESTAMP_TZ | ||||
Synapse | DATETIMEOFFSET | DATETIMEOFFSET | Synapse | |
DATETIME | ||||
DATETIME2 | ||||
SMALLDATETIME | ||||
TIME | ||||
TIMESTAMP | Thoughtspot | |||
Source | Source Type | Base Type | Target Type | Target |