- 2 Minutes to read
- Print
- DarkLight
- PDF
Snowflake Columns Mapping
- 2 Minutes to read
- Print
- DarkLight
- PDF
Type Mapping
We match Snowflake data types to Rivery data types while extracting your data. If a data type is not supported, we will map it as a String type.
The following table shows the mapping of Snowflake data types to Rivery-compatible types:
Snowflake Type | Rivery Type |
---|---|
VARCHAR | STRING |
INT | NUMBER |
BIGINT | NUMBER |
SMALLINT | NUMBER |
TIMESTAMP | TIMESTAMP |
DATE | TIMESTAMP |
INTEGER | NUMBER |
NUMBER | NUMBER |
DATETIME | TIMESTAMP |
SMALLINT | smallint |
BOOLEAN | BOOLEAN |
FLOAT | FLOAT |
BOOL | BOOLEAN |
ARRAY | STRING |
OBJECT | VARIANT |
TIME | TIME |
TIMESTAMP_LTZ | TIMESTAMP |
TIMESTAMP_NTZ | TIMESTAMP |
TIMESTAMP_TZ | TIMESTAMP |
STRING | STRING |
DECIMAL | NUMBER |
TEXT | STRING |
NUMERIC | FLOAT |
DOUBLE | FLOAT |
DOUBLE PRECISION | FLOAT |
REAL | FLOAT |
RECORD | VARIANT |
VARIANT | VARIANT |
FIXED | NUMBER |
Record Data Type - Splitting into columns per key
In order to make querying of complex and schemaless data more fluent and easy, Rivery tries to split subkeys in JSON/object sources into separated columns. That means every JSON object data, which is not in REPEATED (array) mode, will be converted to a separate column in the target by all of its subkeys. The key names are combined using underscore ( _ ) as a separator.
For example , the JSON object data of
{"user": {"first_name": "Namy", "last_name": "Lasty"}}
Will be split into 2 columns in the target:
- user_first_name
* user_last_name
*Variant Data Type
Variant data type in Snowflake gives the ability to query complex JSON objects or arrays, using the Semi-Structured Query Mechanism. In that case, the user may choose to split the data by his own needs.
Array in Rivery will automatically and by default be loaded as variant columns, in order to keep the granularity of the row.
Variant Data Type is used when a JSON/object/Array is combined from schemaless keys, which may change between row to row, or has special chars/blanks in the keys. In these cases, we recommend to retype the whole column as a VARIANT column and query the wanted columns in Snowflake console, or using a Logic river .
Cluster Keys
Defining the table cluster key is supported in Rivery using the Cluster Key checkbox in the columns mapping. Checking this box will also define the order of the complex cluster key, by the number in the checkbox's right side.
Expressions
A column may also be a target of expression value. In that case, use the Expression definition on the right side of the mapping table. When an expression is used, the source column is ignored.
The expression can use other columns in the target section at the mapping table, and any valid expression that can be used in Snowflake SQL clause.
For example, the next expression calculates the table loading date by rivery, and will be targeted under ods_update_date columns:
Object (Table) Naming
Rivery supports Snowflake's standard unquoted object identifiers:
- Start with a letter (A-Z, a-z) or an underscore (“_”).
- Contain only letters, underscores, decimal digits (0-9), and dollar signs (“$”).
- Are stored and resolved as uppercase characters (e.g. id is stored and resolved as ID).
Quoted objects are not being supported by Rivery since in that case the identifier is preserved when storing and resolving the identifier (e.g. "id" is stored and resolved as id), and also since these identifiers are not considered as Snowflake's best practice. full Snowflake refrence about the quoted and unquoted objects can be found here.