A schema transforms ingested data to a format compatible with kdb+ datatypes of the destination database on kdb Insights Enterprise. Schemas are used in pipelines and during data import, and are associated with Assemblies. Schemas can be created independently or as part of the data assembly wizard builder. Free Trial users already have table schemas defined for the sample data sets used by
insights-demo; simply deploy the
insights-demo assembly to make use of them.
A schema is made up of one or more tables. The table schema of kdb Insights Enterprise has a defined structure, but each table in a schema must have at least one data column and a
timestamp column for data partitioning.
A schema consists of a table name, a list of data columns of defined kdb+ types, and a defined partitioned timestamp column
Tables can be added or removed from a schema, but schemas can only be changed when the associated Assembly is torn down (from the Overview page or right-click the assembly in the entity-tree) and associated resources generated by the deployed assembly are cleaned (this will be prompted during Teardown).
Click the [+] next to the Schema option in the left-hand entity tree.
Name the schema
Start by renaming the table from the default
table1, or add a new table with (+).
To the table add data columns corresponding to ingested data; define the name and kdb type for each column of data to be converted; a column description is optional. A
Timestampdata column must be included in the table. Some of the more common kdb types for ingested data include:
item column type time column Date, Datetime, Time, Timestamp category (independent data) Symbol, String numeric (dependent data) Float, Integer
Define Table properties; i.e. the
Timestampcolumn from the data. Optional is to set the data sort. This can be done for interval, real-time and historical data; typically a sort is done using the aforementioned
Code view allows users to define schemas using JSON. Schemas built using the code editor must conform with the requirements outlined in the set up.
A JSON schema code editor view
The code editor will do a validation check, preventing submission of invalid code. Details of error(s) will appear on a hover-over of the underlined problem in the code. In addition, when editing a property, a pop-up of supported attributes will be offered:
Suggested JSON attributes for selected schema property.
Give the schema a name before adding data columns.
|Schema Name||Name the schema to be created|
|Advanced||Toggle additional properties for the schema if required. Note, all schemas are of type
|Code View||Opens a JSON editor for an alternative method to build table schemas.|
|Submit||Submit a completed schema|
Give the table a name, this will appear in the tab menu; avoid using reserved keywords for SQL or q ; e.g.
table as this will return an error when queried. To add a new table, click [+].
Details of the first column can be edited inside the table, additional columns can be added with the button.
Schemas are a collection of separate tables; add a table by clicking the plus sign, then give it a name
To remove a column, click on the and select delete. From the latter menu it's also possible to duplicate a column.
Columns can be repositioned on click-and-drag.
Select the expander for additional properties:
On-Disk Attributes and
Foreign. Expanded properties are marked by a
* in the table item list below:
Sample columns to add to a schema table and suggested Types for each of the common data types.
|Name||Assign a name to the column in the table.|
|Type||Select the kdb+ datatype conversion for the data column;
|Description||Optional description of the column purpose.|
|Primary Key*||Check if a Primary Key column.|
|On-Disk Attribute (Ordinal Partitioning)*||Column attribute when stored on disk. Select from
|On-Disk Attribute (Temporal Partitioning)*||Column attribute when stored on disk with an ordinal partition scheme. Select from
|Foreign*||A foreign key into another table of the assembly, of the form
All Tables Require a Timestamp Partition Column
Each table of a schema have a time data column. In addition, the table is typically partitioned and sorted (interval, historic and/or real-time) by this time column. See table properties
Avoid using reserved keywords for SQL or q when naming tables as this can return an error.
Every table in a schema has a set of properties. Every table of a schema must have a column defined for the
Timestamp Column and the
Partition Column. Advanced schema properties are marked by an
* in the table item list.
Table properties which includes a partition column that must be a timestamp, a defined timestamp column, and optional sort columns for each RDB, IDB and HDB tier.
|Primary Keys||A list of primary key column names.|
|Description||Description of the table purpose.|
|Partition Column||Select from dropdown options; defaults to
|Partitions*||Set number of partitions; an integer|
|Block Size*||Define number of records to keep in memory before writing to disk; an integer|
|Timestamp Column (required)||Select from dropdown options; defaults to
|Real-time Sort||Real-time (rdb) data sort column; defaults to
|Interval Sort||Interval (idb) data sort column; defaults to
|Historical Sort||Historical (hdb) data sort column; defaults to