Skip to content

Database

A database stores data and empowers users to run query data for insights. KX Insights Platform databases allow users to store and access data from tiered data access using faster access mechanisms for the most recent data as may be used in real-time, and slower, cheaper access for older, historic data. The secret to the speed of the KX Insights database is its ability to run user driven analytics directly on the data - leveraging the timeseries, columnar performance capabilities of kdb+.

Quickstart

A database can be created by clicking [+] next to Database name in the entity tree.

Click the plus icon next to the Database name in the entity tree to create an individual database.
Click the plus icon next to the Database name in the entity tree to create an individual database.

KX Insights databases are designed to scale independently for data write-down and querying. Depending on the intended workloads, compute settings can be tuned for each component. If your workload has a large amount of data being ingested, consider increasing the Writedown compute settings. If your workload has high frequency, or complex queries, consider increasing the Query Setting compute settings.

Once the database is created, it must be part of an assembly and be deployed to be activated. An active database is used to store data and is a requirement for working with data on the platform outside of testing. An active assembly will show a green circle with a tick next to its name in the left-hand entity tree listed under Assemblies.

Set up

  1. Click the [+] icon from the entity tree to create a Database.

  2. Select between Time-series or Custom database; it's recommended users start with a Time-series database.

  3. Define database name; Submit.

  4. Custom databases have tabbed options to define Mounts, Data Access and Storage properties. When creating a Custom database a Source name for Storage is required; this is pre-assigned as south for a Time-series database.

  5. Submit.

  6. To use the database, open up an assembly listed under Assemblies in the entity tree and add the database.

  7. Save and deploy the assembly.

Beta Database Builder

An updated version for configuring your databases is now available; see the new beta documentation for more information.

Quickstart - build an assembly with a database

Time-series

A quick start option for creating a database with predefined RDB, IDB and HDB mounts, and recommended for new users to KX Insights Platform. Required is a Database Name, and has a predefined Storage streaming data source, but other changes are optional. Advanced properties are available from the slider.

A time series database featuring customizable properties for Mounts, Data Access and Storage.  A time series database can be used 'out-of-the-box' without changes to default settings.
A time series database featuring customizable properties for Mounts, Data Access and Storage. A time series database can be used 'out-of-the-box' without changes to default settings.

Database Name Name of database.

Time series databases come with predefined RDB, IDB and HDB mounts. Additional mounts can be added but are not required. Advanced options are marked with an * in the table item list.

item description
Name Name of mount, e.g. rdb.
Type Select from stream or local; defaults to stream for rdb, and local for idb and hdb.
Partition Partition scheme for the mount, e.g. none, ordinal, date, month, year; defaults to none for an rdb, ordinal for idb and date for hdb.
Dependencies List of mounts required to be mounted as a dependency of this mount; defaults to idb for rdb and hdb mounts, but is undefined for an idb.
BaseURI* Uniform Resource Identifier (URI) representing where data can be mounted by other services; set to none for an rdb, and file:///data/db for the respective idb and hdb.
Description* A text description of the purpose of the mount.
Volume Size Define the disk size limit for how much data the database can store; defaults to no value set.

Define how data will be accessed from the database. Check Enable Query Environment to allow data querying for the database from the Query tabs in the User Interface. Advanced options are marked with an * in the tables.

item description
Replicas The number of query environment replicas; default to 1

For time series database, RDB, IDB and HDB instances are created by default, but additional instances can be added if required. Each instance offers:

item description
Mount Name Name of data access mount, e.g. rdb.
Stream Logs volume Set requested PV disk size, e.g. 20Gi, but no value is set by default.
Replicas* Size of the Data Access deployment; defaults to 1
Source Sequence Bus to subscribe to; defaults to south

Options to define CPU and Memory Size required to handle data access during querying.

CPU of the tier

item description
Minimum CPU Minimum CPU required - 1 is for one core, 0.5 is half time for one core - must be greater than 0.1; defaults to 0.1.
Maximum CPU Maximum CPU available and must be less than 12. Process is throttled when maximum CPU usage is reached; defaults to 1.

Memory Size

item description
Minimum Memory Minimum memory to allocate to the database writer and always available, must be greater than 1MB; defaults to 128 MB
Maximum Memory Maximum memory to allocate to the database writer; once reached, an out-of-memory error will return, must be less than 20,480MB; defaults to 512 MB.

Environment Variables

Advanced only and adding of environment variables is optionable.

item description
variable value

Required is a definition of the source for database storage. The source is either an URI directed at a static data source, or is the name of an entity for streaming data. For a time series database, the source is preconfigured for streaming data and is named south. Advanced options are marked with an * in the tables.

The storage source is required for a time series database defaults to south.
The storage source is required for a time series database defaults to south.

item description
Enforce Schema* Check (enable) to enforce table schema when persisting - used for debugging, may incur performance penalty; default is off.
EOD Peach Level* Level at which EOD peaches to parallelize HDB table processing; select between part or table. No default set.
Replicas (Beta)* Number of storage replica deployments; defaults to 1.
Sort Limit GB* Memory limit when sorting splayed tables or partitions on disk in GB. No default set.
Source (required) Either a URI pointed at a static data source of the name of an entry in bus from which to obtain streaming data. Defaults to south.

Tiers

Data is stored by tiers; when certain storage conditions are met, data rolls into another tier. For a time series database there are preconfigured tiers for RDB, IDB and HDB, but users can add HDB tiers if required.

item description
Mount The name of the corresponding mount at which data in this tier can be accessed, e.g. rdb.
Name Reference to a particular tier; defaults to streaming for rdb, interval for idb and historical for hdb

Compression (Advanced only)

Define how stored data is compressed.

item description
Algorithm Define how the data is to be compressed; set to default.
Block Define block size for compression; numeric value.
Level Set compression level; numeric value.

Retain and Schedule

Toggle the dropdown for additional Tier options. Retain determines how much data to store in the current tier before rolling into the next tier, and schedule determines when the rollover should occur.

item description
Rows* Rollover occurs for data beyond the number of rows set by this variable.
Size* Threshold of tier size before rollover occurs. Number in byte size followed by unit, e.g. 2 TB.
Size Pct* Threshold of tier size before rollover occurs, expressed as a percentage of total storage for the corresponding mount; value between 1 and 100.
Time A timespan consisting of a number followed by a time unit (Seconds, Minutes, Hours, Days, Weeks, Months or Years); e.g. 2 Years. Rollover occurs for data whch has been stored for this length of time; defaults to no value.
Frequency How often ahould the tier roll data into the next tier; defaults to no value for an rdb, 0D00:10:00 for an idb, 1D00:00:00 for an hdb. Set using a 24h clock. In order to rollover data from a hdb, it's necessary to create an hdb sub-tier; click Add Tier.
Snap At what whole multiples of time should rollovers be scheduled; defaults to 00:00:00.

CPU

Allocated CPU resources for the database writer.

item description
Minimum CPU Minimum CPU required - 1 is for one core, 0.5 is half time for one core - must be greater than 0.1; defaults to 0.1.
Maximum CPU Maximum CPU available and must be less than 12. Process is throttled when maximum CPU usage is reached; defaults to 1.

Memory Size

Allocated memory resouces for the database writer.

item description
Minimum Memory Minimum memory to allocate to the database writer and always available, must be greater than 1MB; defaults to 128 MB
Maximum Memory Maximum memory to allocate to the database writer; once reached, an out-of-memory error will return, must be less than 20,480MB; defaults to 512 MB.

Stream Logs Volume

Define the disk size for requested logs. Defaults to 20Gi.

Environment Variables (Advanced only)

Adding of environment variables is optionable.

item description
variable value

Custom

A standard database with a single RDB mount, additional mounts can be added as required. In addition to giving the database a name there is a requirement to define a Storage Source to create the database. Advanced properties are available from the slider.

Define the 'Source' for the storage of a custom database is required.
Define the 'Source' for the storage of a custom database is required.

Database Name
Name of database.

A custom database comes with just an RDB mount, but additional IDB and HDB mounts can be added if required. Advanced options are marked with an * in the table item list.

item description
Name Name of mount, e.g. rdb.
Type Select from stream or local; defaults to stream for thr rdb.
Partition Partition scheme for the mount, e.g. none, ordinal, date, month, year; defaults to none for an rdb.
Dependencies List of mounts required to be mounted as a dependency of this mount; defaults to idb for rdb.
BaseURI* Uniform Resource Identifier (URI) representing where data can be mounted by other services; set to none for an rdb.
Description* A text description of the purpose of the mount.
Volume Size Define the disk size limit for how much data the database can store; defaults to no value set.

Define how data will be accessed from the database. Check Enable Query Environment to allow data querying for the database, enabled by default. Advanced options are marked with an * in the tables.

item description
Replicas The number of query environment replicas; default to 1

A single RDB instance is provided, additional IDB and HDB instances can be added to the database if required. Each instance offers:

item description
Mount Name Name of data access mount, e.g. rdb.
Stream Logs volume Set requested PV disk size, e.g. 20Gi, but no value is assigned by default.
Replicas* Size of the Data Access deployment; defaults to 1
Source Sequence Bus to subscribe to. Not set by default and not a requirement for the database, but can share the name of the corresponding mount Source used by Storage.

Options to define CPU and Memory Size required to handle data access during querying.

CPU of the tier

item description
Minimum CPU Minimum CPU required - 1 is for one core, 0.5 is half time for one core - must be greater than 0.1; defaults to 0.1.
Maximum CPU Maximum CPU available and must be less than 12. Process is throttled when maximum CPU usage is reached; defaults to 1.

Memory Size

item description
Minimum Memory Minimum memory to allocate to the database writer and always available, must be greater than 1MB; defaults to 128 MB
Maximum Memory Maximum memory to allocate to the database writer; once reached, an out-of-memory error will return, must be less than 20,480MB; defaults to 512 MB.

Environment Variables

Advanced only and adding of environment variables is optionable.

item description
variable value

Required is a definition of the source for database storage. The source is either an URI directed at a static data source, or is the name of an entity for streaming data. Advanced options are marked with an * in the tables.

The storage source is required for a time series database defaults to south.
The storage source is required for a time series database defaults to south.

item description
Enforce Schema* Check (enable) to enforce table schema when persisting - used for debugging, may incur performance penalty; default is off.
EOD Peach Level* Level at which EOD peaches to parallelize HDB table processing; select between part or table. No default set.
Replicas (Beta)* Number of storage replica deployments; defaults to 1.
Sort Limit GB* Memory limit when sorting splayed tables or partitions on disk in GB. No default set.
Source (required) Either a URI pointed at a static data source of the name of an entry in bus from which to obtain streaming data. No default is set

Tiers

Data is stored by tiers; when certain storage conditions are met, data rolls into another tier. A custom database comes with a single RBD tier, but addition IDB and/or HDB tiers can be added if required.

item description
Mount The name of the corresponding mount at which data in this tier can be accessed, e.g. rdb.
Name Reference to a particular tier; defaults to tier1 for the provided rdb mount.

Retain and Schedule

Toggle the dropdown for additional Tier options. Retain determines how much data to store in the current tier before rolling into the next tier, and schedule determines when the rollover should occur.

item description
Retain A timespan consisting of a number followed by a time unit (Seconds, Minutes, Hours, Days, Weeks, Months or Years); e.g. 2 Years. Rollover occurs for data whch has been stored for this length of time; defaults to no value.
Frequency How often ahould the tier roll data into the next tier; defaults to no value for an rdb, 0D00:10:00 for an idb, 1D00:00:00 for an hdb. Set using a 24h clock. In order to rollover data from a hdb, it's necessary to create an hdb sub-tier; click Add Tier.
Snap At what whole multiples of time should rollovers be scheduled; defaults to 00:00:00.

CPU

Allocated CPU resources for the database writer.

item description
Minimum CPU Minimum CPU required - 1 is for one core, 0.5 is half time for one core - must be greater than 0.1; defaults to 0.1.
Maximum CPU Maximum CPU available and must be less than 12. Process is throttled when maximum CPU usage is reached; defaults to 1.

Memory Size

Allocated memory resouces for the database writer.

item description
Minimum Memory Minimum memory to allocate to the database writer and always available, must be greater than 1MB; defaults to 128 MB
Maximum Memory Maximum memory to allocate to the database writer; once reached, an out-of-memory error will return, must be less than 20,480MB; defaults to 512 MB.

Stream Logs Volume

Define the disk size for requested logs. Defaults to 20Gi

Environment Variables (Advanced only)

Adding of environment variables is optionable.

item description
variable value