
Assembly

The easiest way to build an assembly is to use the assembly wizard. The following describes how a custom Assembly can be built manually using schemas, databases, pipelines and streams.

Overview of Assembly view with options to name, add schema, add database, add pipelines, add streams and set labels for an Assembly.

Set up

  1. Name the Assembly to create, e.g. expassembly.

  2. Define a schema to convert data from its import format into a format compatible with the KX Insights Platform, and add it to the Assembly.

  3. Configure the database to store and access our data, and add to the Assembly.

  4. Build a stream to push data to the database, and add to the Assembly.

  5. Assign label name(s) and associated value(s). Labels allow users to query data stored across multiple Assemblies that share a label name. By default, a database label name matching the name given to the Assembly is assigned. An assembly must have at least one label before it can be saved and deployed.

  6. Save the Assembly.

  7. Create a pipeline to get data from source to the database. An assembly does not require a pipeline to be deployed. However, an assembly does need to be saved for its database to be available when building pipelines.

  8. If an Assembly is using too many resources; e.g. CPU and/or memory, you can tear it down to free up those resources.

Assembly Name

Assembly name must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character.
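As a quick sanity check before creating an Assembly, the naming rule can be expressed as a regular expression. The pattern below is our own sketch of the stated rule, not an official validator:

```python
import re

# Lowercase alphanumerics plus '-' or '.', starting and ending with an
# alphanumeric character -- a sketch of the stated rule, not an official validator.
VALID_NAME = re.compile(r"^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$")

def is_valid_assembly_name(name: str) -> bool:
    return bool(VALID_NAME.fullmatch(name))

print(is_valid_assembly_name("expassembly"))   # valid
print(is_valid_assembly_name("ExpAssembly"))   # invalid: upper-case characters
```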

Sample Expression Assembly

An Assembly to build a table generated from a kdb expression. The table will have two columns: one for timestamps and a second for random numbers.

Schema

  • Open a schema document.

  • Set the table name to exptable and the schema name to expschema.

  • Our table has two columns with the following properties:

Name* Type*
date Timestamp
cnt Integer

Timestamp data column required for each table in a Schema.

This simple table has only two columns, but all tables in a schema must have a timestamp column for data to be partitioned inside KX Insights Platform.

  • For Table Properties remove the primary key and set the following properties:
Primary Keys: (blank)
Description: (blank)
Type*: partitioned
Partition column: date
Block size*: (blank)
Partitions*: (blank)
Timestamp column: date
Real-time Sort: date
Interval Sort: date
Historical Sort: date

Advanced properties

Type, Block size, and Partitions are Advanced properties.

Schema properties required to define partitioning of your data; focus on defining Partition Type, Partition Column, Timestamp Column, Real-time Sort, Interval Sort and Historical Sort. Typically, partitioning is defined by the required timestamp column in your data.

  • Click Submit.

Code View

Code view allows schemas to be defined using a code editor:

Code view uses JSON to define schema properties; useful for data tables with large numbers of columns that would be difficult to define directly in the UI
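As an illustration, a schema like the one above might be expressed in code view roughly as follows; the field names here are assumptions and may differ from the JSON the editor actually generates:

```json
{
  "name": "expschema",
  "tables": {
    "exptable": {
      "type": "partitioned",
      "prtnCol": "date",
      "columns": [
        { "name": "date", "type": "timestamp" },
        { "name": "cnt", "type": "int" }
      ]
    }
  }
}
```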

Database

For this, we will be building a Time Series database, consisting of tiers to access and store real-time, interval and historic data.

  • Create a database by clicking [+] next to Database name in the entity tree, and select Time Series.

  • Name the database expdatabase and click Submit; this makes additional configuration options available.

  • In Advanced mode, update the following sections - tab between each section for details:

The first section requires no change.

  • mount name: rdb
  • source: expdatabase

Update Database with the Advanced option enabled; the majority of properties can be left unchanged, but do define the rdb source as expdatabase.

  • source: expdatabase

Update Storage scrolled near the end of the property definitions; set source to expdatabase and leave other options unchanged.

  • Other database parameters, e.g. Mount, use pre-configured values.

  • Submit the expdatabase database.

Stream

Create a stream to push data to our database.

stream name: expdatabase
sub topic:
external facing: false

Labels

At least one label is required before an Assembly can be saved.

label value
databasename expdatabase

Save

Only saved Assemblies and their associated schema tables are available to write data to when creating pipelines. Click the Save button to save the Assembly.

Pipeline

Next, add an expression pipeline to the assembly before deploying. This pipeline will ingest data generated by a kdb expression and write it to our assembly database.

  • Open a Pipeline workspace; this can be done by clicking [+] next to Pipeline Templates in the left-hand entity tree.

  • Click-and-drag into the workspace an Expression Reader. Add the following to the editor, then Apply.

([] date:200?(reverse .z.p-1+til 10); cnt:200?10)
  • Click-and-drag into the workspace the Apply Schema Transform node. Set the Data Format from the dropdown, and then click the add schema button to load the schema and table:
data format: Table
schema: expschema
table: exptable
  • Check the Parse box of the Apply Schema Transform node for the Timestamp date column.

  • Click-and-drag into the pipeline workspace KX Insights Database Writer and define the table to store the results:

database: expdatabase
table: exptable
deduplicate stream: true

Configure KX Insights Database Node by defining the database name, table and enable deduplicate stream.

  • Apply the properties.

  • Connect the nodes, Save the pipeline as exppipeline.

Saving a pipeline requires defining the Pipeline name.

A finished pipeline with a sequence of nodes, here including a writer data ingest node, a transformation node to define the table schema, and a writer node to send data to our database.
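As a rough illustration of what the Expression Reader produces, here is a Python analogue of the q expression used above (an approximation only: q's .z.p carries nanosecond precision, which Python's datetime approximates with microseconds; the column names follow the schema):

```python
import random
from datetime import datetime, timedelta

# Analogue of the q expression: 200 rows drawn at random from the 10
# timestamps just before "now" (q: reverse .z.p-1+til 10), paired with
# 200 random integers in 0..9 (q: 200?10).
now = datetime.now()
recent = [now - timedelta(microseconds=i) for i in range(1, 11)][::-1]
rows = [
    {"date": random.choice(recent), "cnt": random.randrange(10)}
    for _ in range(200)
]
```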

Configured Assembly

With all components created, the assembly should look like this:

Name: expassembly
Schema Configuration: expschema
Database Configuration: expdatabase
Pipelines: exppipeline
Streams: expdatabase
  • Save.
  • Deploy.

The assembly should change from INACTIVE state (grey circle) to ACTIVE state (green circle with a tick).

A completed assembly can be saved, then deployed; a successfully deployed and active assembly will show a green circle with a tick inside it.

Query Data

  • Once the assembly has entered the active state (a green solid circle with a tick) open a Query document; this can be done by clicking [+] next to Queries in the left-hand entity tree or from the Document bar along the top.

  • Open the SQL or Q tab of the Query document.

  • From the Assembly dropdown menu, select expassembly, and choose hdb.

  • Query your data. In the SQL tab:

SELECT * FROM exptable

In the Q tab, the equivalent q query is simply the table name:

exptable

  • Define the Output variable; this can be exptable.

  • Click Get Data.

  • Results will appear in the Console.

Teardown

When you deploy an assembly, it consumes memory and CPU resources, as does any data written to its database. To free these resources, tear down the assembly from either the tab menu with the Teardown button or from the assembly entity-tree menu:

Teardown a deployed assembly to return used memory and CPU resources.

Assemblies deployed via the command line

You can also deploy assemblies using the command line. These are listed under Assemblies in the entity tree, but you cannot interact with them in the UI; they can only be managed (e.g. torn down) using the command line.
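For reference, command-line deployments take an assembly specification file. A minimal sketch of such a file for this example is shown below; the field names are illustrative assumptions only, not the authoritative specification:

```yaml
# Illustrative only - field names are assumptions; check the product's assembly spec.
name: expassembly
labels:
  databasename: expdatabase
tables:
  exptable:
    type: partitioned
    prtnCol: date
    columns:
      - name: date
        type: timestamp
      - name: cnt
        type: int
```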

Before the teardown starts, there is an option to Clean up resources after tearing down. If checked, the teardown also removes any data stored in the assembly. If left unchecked, any data written to the assembly database remains available.

When selecting Teardown Assembly, an option to Clean up resources after tearing down is available. When checked, any data associated with the assembly will be deleted too.

Troubleshooting

Should an assembly fail to deploy, check diagnostics for reported errors. Ensure your assembly has at least a database, a schema, and one or more streams. An assembly does not require a pipeline to be deployed.