Assembly configuration comes in two flavors. The first is the format consumed by KXI components. The second is a version embedded in a Kubernetes custom resource document (CRD), which contains KXI-Operator-specific keys.
This document describes the assembly configuration in the format consumed by KXI components. Information about the assembly CRD can be found here.
An assembly configuration is a machine-readable description of the structure of a dataset, its life cycle, and the services that operate upon it. This description is used by KXI services to self-configure and coordinate amongst themselves, and also provides room for user extension.
KXI services typically load their assembly configuration from the file specified by the KXI_ASSEMBLY_FILE environment variable. This file is written in YAML, which allows for hierarchically structured data, future extension, and inline comments.
An assembly has the following top-level structure:
name         short name for this assembly (required)
description  purpose of the assembly (optional)
labels       user defined keys and values used for representing the purview of the assembly (optional)
tables       schemas for the tables operated upon within the assembly (dictionary)
mounts       mount points for stored data (dictionary)
bus          configuration of the message bus used for coordination between elements (dictionary)
elements     services that should run within the assembly, and any configuration they each require (dictionary)
This document focuses on the top-level sections listed above. The components that would typically appear under elements are described in their respective documentation.
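As a rough illustration, a minimal assembly file might be laid out as below. The top-level keys are the ones listed above; every value, including the assembly name and the label, is a made-up placeholder, and the empty dictionaries stand in for the sections described in the rest of this document.

```yaml
# Hypothetical assembly skeleton; all values are placeholders.
name: fx-demo                                   # short name (required)
description: Example assembly holding FX data   # optional
labels:                                         # optional, user-defined keys and values
  region: amer
tables: {}      # table schemas
mounts: {}      # mount points for stored data
bus: {}         # message bus entries
elements: {}    # per-service configuration
```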
Labels define the purview of the DA services contained within the assembly, that is, the data to which they grant access. If using the KX Insights Service Gateway, these are the values reported as the DAP's purview (see the Service Gateway page).
Below are some examples.
Example 1 - Provides FX data for America.
Example 2 - Provides electrical, weekly billing for residential customers.
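The two examples above might be written as follows. Label keys and values are user defined, so the names used here (region, assetClass, sector, client, billing) are illustrative choices rather than required names.

```yaml
# Example 1: FX data for America (hypothetical label names)
labels:
  region: amer
  assetClass: fx
---
# Example 2: electrical, weekly billing for residential customers
labels:
  sector: electrical
  client: residential
  billing: weekly
```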
Table schemas describe the metadata and columns of tables.
A table schema has the following structure:
|purpose of the table
|old name of the table, for automatic renaming
|names of primary key columns
|column to be used for storage partitioning
|column to hash for sharding
|block size for memory/disk manipulation
|name of the arrival timestamp column
|names of sort columns (in-memory)
|names of sort columns (on-disk IDB)
|names of sort columns (on-disk HDB)
|column schemas (see below)
A column schema has the following structure:
|name of the column
|purpose of the column
|q type name
|old name of the column, for automatic renaming
|foreign key into another table in this assembly in the form table.column
|column attribute when stored in memory
|column attribute when stored on disk
|column attribute when stored on disk with an ordinal partition scheme
|column attribute when stored in object store (e.g. S3)
The list of supported column type values is:
boolean guid byte short int long real float char symbol
timestamp month date datetime timespan minute second time
booleans guids bytes shorts ints longs reals floats string symbols
timestamps months dates datetimes timespans minutes seconds times
or leave blank for a mixed type.
The list of supported column attribute values is:
grouped parted sorted unique
Use the grouped attribute for an in-memory column with a lot of repeated values. Use the parted attribute for an on-disk column where common values are adjacent. Use the sorted attribute for an in-memory column with ascending values, typically a time. Use the unique attribute for a column where all items are distinct, typically a primary key.
Attributes are metadata applied to table columns whose values have a special form (for example, sorted or all distinct), and they are often used to speed up query response times. See here for more information.
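To make the type and attribute guidance concrete, here is a sketch of a single table schema. Only prtnCol, the type values, and the attribute values are taken from the text above. The other key names (description, columns, name, type, attrMem, attrDisk) are assumptions made for illustration, as are the table and column names, so check them against the schema reference before use.

```yaml
tables:
  trade:                         # hypothetical table name
    description: FX trades       # assumed key name
    prtnCol: time                # column used for storage partitioning
    columns:                     # assumed key name
      - name: time               # assumed key names: name, type, attrMem, attrDisk
        type: timestamp
        attrMem: sorted          # ascending values in memory, typically a time
      - name: sym
        type: symbol
        attrMem: grouped         # in-memory column with many repeated values
        attrDisk: parted         # on-disk column where common values are adjacent
      - name: price
        type: float
      - name: meta
        type:                    # left blank for a mixed type
```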
Assemblies store data in multiple places. The KXI Storage Manager (SM) component migrates data between a hierarchy of "tiers", each with its own locality, segmentation format, and rollover configuration. Other components might use entries in this section to coordinate other forms of data storage and access.
The Mounts section is a dictionary mapping user-defined names of storage locations to dictionaries with the following fields:
|base URI where that data can be mounted by other services
|partitioning scheme for this mount
|(object storage only) a file:// URI or object storage URI path to a
|(object storage only) a file:// URI or object storage URI to a
|(object storage only) an object storage URI that points to a kdb+ database
The full URI for mounting the local on-disk data is <mount_root>/current (where current is a symbolic link pointing to a loadable kdb+ database).
none do not partition; store in arrival order
ordinal partition by a numeric virtual column which increments according to
a corresponding storage tier's schedule and resets
when the subsequent tier (if any) rolls over
date partition by each table's prtnCol column, interpreted as a date
- A mount of type stream must have partition
- A mount of type local must have partition date, and its URI must be of the form <mount_root>/current, where the <mount_root> directory is managed by the Storage Manager
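A local mount that follows the rule above might look like the sketch below. The partition key and its date value, the local mount type, and the /current suffix come from the text. The key names type and baseURI are assumptions, and the mount name and path are placeholders.

```yaml
mounts:
  hdb:                                      # user-defined mount name (hypothetical)
    type: local                             # assumed key name for the mount type
    baseURI: file:///data/db/hdb/current    # assumed key name; <mount_root> is /data/db/hdb here
    partition: date                         # partition by each table's prtnCol, read as a date
```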
The bus section provides information about the EMS-like messaging system (or systems) available to elements within this assembly for communication.
The bus section consists of a dictionary of bus entries. The names internal and external are suggested for buses used for communication within the assembly and for communication with the outside world (perhaps other assemblies), respectively, but assemblies may contain further entries for user-defined purposes.
Each bus entry provides:
|protocol of the messaging system
|subset of messages in this stream that consumers are interested in
|connection strings to machines or services which can be used for subscribing to this bus
rt use Insights Reliable Transport (RT)
custom use a custom solution that complies with the RT interface. A custom q code module
should be loaded from the path given by the environment variable `KXI_RT_LIB`.
For this protocol, the `nodes` list should contain a single `hostname:port`.
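A bus entry using the custom protocol could be sketched as below. The nodes key and the custom value come from the text above. The key name protocol is an assumption, and the host and port are placeholders.

```yaml
bus:
  external:                     # suggested name for communication with the outside world
    protocol: custom            # assumed key name; the q module path comes from KXI_RT_LIB
    nodes:
      - tp.example.com:5010     # a single hostname:port (placeholder address)
```

The KXI_RT_LIB environment variable is set in the process environment, not in the assembly file.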
Assemblies coordinate a number of components or processes, which we call elements of the assembly. The elements section provides configuration details that are relevant only to specific services.
When the processes comprising the elements of an assembly are initialized, each has access to the assembly configuration. Furthermore, every process knows its element type, a short name describing its purpose. (KXI components such as dap are element types, but an assembly might contain an open-ended collection of user-defined types.)
If processes are started with the environment variable KXI_NAME, they can search for the configuration details of an element with this element name. Otherwise, they should look for configuration details for their element type. It is a fatal error for a process to launch as part of an assembly if its type is not listed in elements, or if the entry matching its name has the wrong type.
Each element entry has the following structure:
|the purpose of the element
|maps instance names to options dictionaries
|describes container name used for this element
The image dictionary contains the following:
|a URL indicating the image repository
|name of the image
|name/version of the image
Each element may have its own specific set of key/value configurations. Refer to the documentation of the individual KXI components for information on element-specific configuration.
Any other keys in the element entry will be applied to each item in instances unless overridden.
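The inheritance rule can be illustrated with the sketch below. The dap element type and the instances key come from the text above. The instance names and the logLevel and port options are hypothetical and are not documented keys of any KXI component.

```yaml
elements:
  dap:                        # element type
    logLevel: info            # hypothetical key: applies to every instance unless overridden
    instances:
      first:                  # hypothetical instance name; inherits logLevel: info
        port: 5080            # hypothetical instance option
      second:
        logLevel: debug       # overrides the element-level value for this instance
        port: 5081
```

Under this layout, a process started with KXI_NAME set to second would pick up the debug setting, while first would keep the element-level default.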
An example can be found here.
A mount of type object is currently not supported by SM. It is mainly intended for scenarios where SM is not part of the installation.