Snowflake Columns Mapping
  • 2 Minutes to read
  • Dark
    Light
  • PDF

Snowflake Columns Mapping

  • Dark
    Light
  • PDF

Article Summary

Type Mapping

We match Snowflake data types to Rivery data types while extracting your data. If a data type is not supported, we will map it as a String type.

The following table shows the mapping of Snowflake data types to Rivery-compatible types:

Snowflake TypeRivery Type
VARCHARSTRING
INTNUMBER
BIGINTNUMBER
SMALLINTNUMBER
TIMESTAMPTIMESTAMP
DATETIMESTAMP
INTEGERNUMBER
NUMBERNUMBER
DATETIMETIMESTAMP
SMALLINTsmallint
BOOLEANBOOLEAN
FLOATFLOAT
BOOLBOOLEAN
ARRAYSTRING
OBJECTVARIANT
TIMETIME
TIMESTAMP_LTZTIMESTAMP
TIMESTAMP_NTZTIMESTAMP
TIMESTAMP_TZTIMESTAMP
STRINGSTRING
DECIMALNUMBER
TEXTSTRING
NUMERICFLOAT
DOUBLEFLOAT
DOUBLE PRECISIONFLOAT
REALFLOAT
RECORDVARIANT
VARIANTVARIANT
FIXEDNUMBER

Record Data Type - Splitting into columns per key

In order to make querying of complex and schemaless data more fluent and easy, Rivery tries to split subkeys in JSON/object sources into separated columns. That means every JSON object data, which is not in REPEATED (array) mode, will be converted to a separate column in the target by all of its subkeys. The key names are combined using underscore ( _ ) as a separator.

For example , the JSON object data of

{"user": {"first_name": "Namy", "last_name": "Lasty"}}

Will be split into 2 columns in the target:

  • user_first_name
    * user_last_name

*Variant Data Type

Variant data type in Snowflake gives the ability to query complex JSON objects or arrays, using the Semi-Structured Query Mechanism. In that case, the user may choose to split the data by his own needs.

Array in Rivery will automatically and by default be loaded as variant columns, in order to keep the granularity of the row.

Variant Data Type is used when a JSON/object/Array is combined from schemaless keys, which may change between row to row, or has special chars/blanks in the keys. In these cases, we recommend to retype the whole column as a VARIANT column and query the wanted columns in Snowflake console, or using a Logic river .

Cluster Keys

Defining the table cluster key is supported in Rivery using the Cluster Key checkbox in the columns mapping. Checking this box will also define the order of the complex cluster key, by the number in the checkbox's right side.

snowflake-columns-mapping_mceclip0.png

Expressions

A column may also be a target of expression value. In that case, use the Expression definition on the right side of the mapping table.  When an expression is used, the source column is ignored.

The expression can use other columns in the target section at the mapping table, and any valid expression that can be used in Snowflake SQL clause.

For example, the next expression calculates the table loading date by rivery, and will be targeted under ods_update_date columns:
snowflake-columns-mapping_mceclip1.png

Object (Table) Naming

Rivery supports Snowflake's standard unquoted object identifiers:

  • Start with a letter (A-Z, a-z) or an underscore (“_”).
  • Contain only letters, underscores, decimal digits (0-9), and dollar signs (“$”).
  • Are stored and resolved as uppercase characters (e.g. id is stored and resolved as ID).

Quoted objects are not being supported by Rivery since in that case the identifier is preserved when storing and resolving the identifier (e.g. "id" is stored and resolved as id), and also since these identifiers are not considered as Snowflake's best practice. full Snowflake refrence about the quoted and unquoted objects can be found here.


Was this article helpful?