The easiest way to build an assembly is to use the assembly wizard. The following describes how a custom Assembly can be built manually using schemas, databases, pipelines and streams.
Overview of Assembly view with options to name, add schema, add database, add pipelines, add streams and set labels for an Assembly.
Name the Assembly to create, e.g. expassembly.
Define a schema describing how data will be converted from its import format to a format compatible with KX Insights Platform, and add it to the Assembly.
Configure the database to store and access our data, and add to the Assembly.
Build a stream to push data to the database, and add to the Assembly.
Assign label name(s) and associated value(s). Labels allow users to query data stored on multiple Assemblies with a shared label name. By default, a database label name matching the name given to the Assembly is assigned. An assembly must have at least one label before it can be saved and deployed.
Save the Assembly.
Create a pipeline to get data from source to the database. An assembly does not require a pipeline to be deployed. However, an assembly does need to be saved for its database to be available when building pipelines.
If an Assembly is using too many resources, e.g. CPU and/or memory, you can tear it down to free up those resources.
Assembly name must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character.
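This naming rule matches the common Kubernetes object-name convention. As a quick check, the rule can be expressed as a regular expression; note the expression below is inferred from the stated rule, not taken from the platform itself:

```python
import re

# Inferred from the stated rule: lower-case alphanumerics, '-' or '.',
# starting and ending with an alphanumeric character.
NAME_RE = re.compile(r"^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$")

def is_valid_assembly_name(name: str) -> bool:
    return NAME_RE.fullmatch(name) is not None

print(is_valid_assembly_name("expassembly"))   # True
print(is_valid_assembly_name("exp-assembly"))  # True
print(is_valid_assembly_name("ExpAssembly"))   # False: upper case not allowed
print(is_valid_assembly_name("-exp"))          # False: must start alphanumeric
```

Names that fail this check are rejected when the Assembly is saved.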
Sample Expression Assembly
An Assembly to build a table generated from a kdb expression. The table will have two columns: one for time and a second of random numbers.
Open a schema document.
Set the table name to exptable and the schema name to expschema.
Our table will have two columns with the following properties:

- column name: time, type: timestamp
- column name: cnt, type: long
Timestamp data column required for each table in a Schema.
This simple table has only two columns, but all tables in a schema must have a timestamp column for data to be partitioned inside KX Insights Platform.
In Table Properties, remove the primary key and set the following properties:
- Primary Keys: (blank)
- Description: (blank)
- Type*: partitioned
- Partition column: date
- Block size*: (blank)
- Partitions*: (blank)
- Timestamp column: date
- Real-time Sort: date
- Interval Sort: date
- Historical Sort: date

Block size and Partitions are Advanced properties.
Schema properties required to define partitioning of your data; focus on defining Partition Type, Partition Column, Timestamp Column, Real-time Sort, Interval Sort and Historical Sort. Typically, partitioning is defined by the required timestamp column in your data.
- Click Submit.
Code view allows schemas to be defined using a code editor:
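No example is shown here, so the following is only an illustrative sketch of what a schema might look like in code view; the exact field names and layout depend on your version of the platform and should be checked against the editor itself:

```yaml
# Illustrative sketch only - exact keys may differ by platform version.
name: expschema
tables:
  exptable:
    columns:
      - name: time
        type: timestamp
      - name: cnt
        type: long
```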
For this, we will be building a Time Series database, consisting of tiers to access and store real-time, interval and historic data.
Create a database by clicking [+] next to Database name in the entity tree. Name the database expdatabase and Submit; this will make additional configuration options available.
In Advanced mode, update the following sections; tab between each section for details. Most properties need no change.

Under Mounts, define the rdb mount:

- mount name: rdb
- source: expdatabase

Update Database with the Advanced option enabled; the majority of properties can be left unchanged, but do define the rdb source as expdatabase.

Under Storage, near the end of the property definitions, set source to expdatabase and leave other options unchanged.

Other database parameters, e.g. Mount, use pre-configured values.
Create a stream to push data for our database.
- stream name: expdatabase
- sub topic: (blank)
- external facing: false
At least one label is required before an Assembly can be saved.
Only saved Assemblies and their associated schema tables are available to write data to when creating pipelines. Click to save the Assembly.
Next, add an expression pipeline to the assembly before deploying. This pipeline will ingest data generated by a kdb expression and write it to our assembly database.
Open a Pipeline workspace; this can be done by clicking [+] next to Pipeline Templates in the left-hand entity tree.
Click-and-drag into the workspace an Expression Reader. Add the following to the editor, then Apply.
([] time:200?(reverse .z.p-1+til 10); cnt:200?10)
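As a rough illustration of what this q table expression generates, here is a Python sketch; Python is used here only to mirror the q semantics (q subtracts nanoseconds from the current timestamp, while Python's finest timedelta unit is microseconds):

```python
import random
from datetime import datetime, timedelta

# Mirror of the q expression: build 10 timestamps just before 'now'
# (ascending, like q's 'reverse'), then draw 200 random rows from them.
now = datetime.now()
pool = [now - timedelta(microseconds=i) for i in range(10, 0, -1)]
rows = [{"time": random.choice(pool), "cnt": random.randrange(10)}
        for _ in range(200)]

print(len(rows))  # 200
```

The result is 200 rows whose time values repeat from a pool of 10 recent timestamps and whose cnt values are random integers 0-9, matching the exptable schema.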
- Click-and-drag into the workspace the Apply Schema Transform node. Set the Data Format from the dropdown, and then click to load the schema and table:
- data format: Table
- schema: expschema
- table: exptable
Set the Parse box of the Apply Schema Transform node for the Timestamp column.
Click-and-drag into the pipeline workspace KX Insights Database Writer and define the table to store the results:
- database: expdatabase
- table: exptable
- deduplicate stream: true
Configure KX Insights Database Node by defining the database name, table and enable deduplicate stream.
Apply the properties.
Connect the nodes, then Save the pipeline as exppipeline.
Saving a pipeline requires defining the Pipeline name.
A finished pipeline with a sequence of nodes: a reader node to ingest data, a transformation node to apply the table schema, and a writer node to send data to our database.
With all components created, the assembly should look like this:
- Name: expassembly
- Schema Configuration: expschema
- Database Configuration: expdatabase
- Pipelines: exppipeline
- Streams: expdatabase
Deploy the assembly; it should change from INACTIVE state (grey circle) to ACTIVE state (green circle with a tick).
A completed assembly can be saved, then deployed; a successfully deployed and active assembly will show a green circle with a tick inside it.
Once the assembly has entered the active state (a green solid circle with a tick) open a Query document; this can be done by clicking [+] next to Queries in the left-hand entity tree or from the Document bar along the top.
Open the SQL tab of the Query document.
From the Assembly dropdown menu, select expassembly.
Query your data with:
SELECT * FROM exptable
Define an Output variable; this can be exptable.
Click Get Data.
Results will appear in the Console.
When you deploy an assembly, it consumes memory and CPU resources, as does any data written to an assembly database. To return these resources, tear down the assembly from either the tab menu with the Teardown button or from the assembly entity-tree menu:
Teardown a deployed assembly to return used memory and CPU resources.
Assemblies deployed via the command line
You can also deploy assemblies using the command line. These are listed under Assemblies in the entity-tree. You cannot interact with them in the UI; they can only be managed (torn down) using the command line.
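As a sketch, command-line management typically looks like the following; the command and flag names are assumptions based on the kdb Insights Enterprise kxi CLI and may differ between versions, so verify them against your installation (the file name expassembly.yaml is illustrative):

```
# Deploy an assembly from its YAML specification (flags assumed; verify locally)
kxi assembly deploy --filepath expassembly.yaml

# List deployed assemblies
kxi assembly list

# Tear down a command-line-deployed assembly
kxi assembly teardown --name expassembly
```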
Before starting the teardown, an option to Clean up resources after tearing down is available. If checked, the teardown also removes any data stored in the assembly; if left unchecked, any data written to the assembly database will remain available.
Should an assembly fail to deploy, check diagnostics for reported errors. Ensure your assembly has at least a database, schema and one or more streams. An assembly does not require a pipeline to be deployed.