Snowflake Columns Mapping
  • 2 Minutes to read
  • Dark
    Light
  • PDF

Snowflake Columns Mapping

  • Dark
    Light
  • PDF

Supported Data Types

Source Datatype Snowflake Datatype Mode Is Split
VARCHAR STRING Nullable x
TEXT STRING Nullable x
INT INTEGER Nullable x
SMALLINT INTEGER Nullable x
BIGINT INTEGER Nullable x
DATE TIMESTAMP Nullable x
TIME TIME Nullable x
CHAR STRING Nullable x
JSON RECORD/VARIANT* Nullable v/x*
ARRAY VARIANT Repeated x
DATETIME TIMESTAMP Nullable x
TINYINT INTEGER Nullable x
BOOLEAN BOOLEAN Nullable x
BIT BOOLEAN Nullable x
REAL FLOAT Nullable x
FLOAT FLOAT Nullable x
NUMBER INTEGER Nullable x
DOUBLE FLOAT Nullable x
OBJECT VARIANT Nullable x
TIMESTAMP_LTZ TIMESTAMP Nullable x
TIMESATMP_NTZ TIMESTAMP Nullable x
TIMESTAMP_TZ TIMESTAMP Nullable x
STRING STRING Nullable x
DECIMAL FLOAT Nullable x

Record Data Type - Splitting into columns per key

In order to make querying of complex and schemaless data more fluent and easy, Rivery tries to split subkeys in JSON/object sources into separated columns. That means every JSON object data, which is not in REPEATED (array) mode, will be converted to a separate column in the target by all of its subkeys. The key names are combined using underscore ( _ ) as a separator.

For example , the JSON object data of

{"user": {"first_name": "Namy", "last_name": "Lasty"}}

Will be split into 2 columns in the target:

  • user_first_name
    * user_last_name

*Variant Data Type

Variant data type in Snowflake gives the ability to query complex JSON objects or arrays, using the Semi-Structured Query Mechanism. In that case, the user may choose to split the data by his own needs.

Array in Rivery will automatically and by default be loaded as variant columns, in order to keep the granularity of the row.

Variant Data Type is used when a JSON/object/Array is combined from schemaless keys, which may change between row to row, or has special chars/blanks in the keys. In these cases, we recommend to retype the whole column as a VARIANT column and query the wanted columns in Snowflake console, or using a Logic river .

Cluster Keys

Defining the table cluster key is supported in Rivery using the Cluster Key checkbox in the columns mapping. Checking this box will also define the order of the complex cluster key, by the number in the checkbox's right side.

snowflake-columns-mapping_mceclip0.png

Expressions

A column may also be a target of expression value. In that case, use the Expression definition on the right side of the mapping table.  When an expression is used, the source column is ignored.

The expression can use other columns in the target section at the mapping table, and any valid expression that can be used in Snowflake SQL clause.

For example, the next expression calculates the table loading date by rivery, and will be targeted under ods_update_date columns:
snowflake-columns-mapping_mceclip1.png

Object (Table) Naming

Rivery supports Snowflake's standard unquoted object identifiers:

  • Start with a letter (A-Z, a-z) or an underscore (“_”).
  • Contain only letters, underscores, decimal digits (0-9), and dollar signs (“$”).
  • Are stored and resolved as uppercase characters (e.g. id is stored and resolved as ID).

Quoted objects are not being supported by Rivery since in that case the identifier is preserved when storing and resolving the identifier (e.g. "id" is stored and resolved as id), and also since these identifiers are not considered as Snowflake's best practice. full Snowflake refrence about the quoted and unquoted objects can be found here.


Was this article helpful?