Useful info - types and layers
Last updated
Last updated
Depending on where the dataset is consumed from data types have slight variations in naming and interpretation.
Numbers up to 38 digits, with an optional precision and scale:
Precision - Total number of digits allowed.
Scale - Number of digits allowed to the right of the decimal point.
By default, precision is 38
and scale is 0
(i.e. NUMBER(38, 0)
). Note that precision limits the range of values that can be inserted into (or cast to) columns of a given type. For example, the value 999
fits into NUMBER(38,0)
but not into NUMBER(2,0)
.
Snowflake uses double-precision (64 bit) IEEE 754 floating-point numbers.
Precision is approximately 15 digits. For example, for integers, the range is from -9007199254740991 to +9007199254740991 (-253 + 1 to +253 - 1). Floating-point values can range from approximately 10-308 to 10+308. (More extreme values between approximately 10-324 and 10-308 can be represented with less precision.)
VARCHAR holds Unicode UTF-8 characters. For convenience reasons, the maximum number of characters has not been specified in the definition of the columns.
BOOLEAN can have TRUE or FALSE values. BOOLEAN can also have an “unknown” value, which is represented by NULL. Boolean columns can be used in expressions (e.g. SELECT list), as well as predicates (e.g. WHERE clause).
The BOOLEAN data type enables support for Ternary Logic.
TIMESTAMP in Snowflake is a user-specified alias associated with one of the TIMESTAMP_* variations. In all operations where TIMESTAMP is used, the associated TIMESTAMP_* variation is automatically used. All the tables are using TIMESTAMP_NTZ. TIMESTAMP_NTZ internally stores “wallclock” time with a specified precision. All operations are performed without taking any time zone into account. All timestamps in the dataset are using UTC times.
For JSON representation, our datasets are using VARIANT data type (it can contain any other data type). A VARIANT can store a value of any other type, including OBJECT and ARRAY.The maximum length of a VARIANT is 16 MB.
RAW
Data that has been sourced from the node with minimal transformations
DECODED
Data that has been transformed via any kind of decoding process
TECH
Data that has been created by internal ETL processes to simplify querying (this data is defined internally by Token Flow and has not meaning outside of the data set)