Assembly Custom Resource

The Assembly CRD is used to define a scalable collection of data capture components and how they should coordinate amongst themselves.

Defining your own custom resource of kind: Assembly

The top level of the CR specifies the apiVersion and the kind.

The Assembly name is set within metadata.

apiVersion: insights.kx.com/v1
kind: Assembly
metadata:
  name: market-data
  labels:
    env: dev

The labels key allows additional custom labels to be set on all resources created from this Assembly file.

CR fields are validated when the CR is applied.

kubectl apply -f marketdata.yaml

Once your Assembly has been applied successfully, you can view it by calling get on the assemblies resource.

kubectl get assemblies
NAME         DESCRIPTION      READY   STATUS   AGE
market-data  A KXI Assembly   True             24m

All resources under your Assembly can be listed using the shared label insights.kx.com/app=<APP_NAME>

Where APP_NAME is your Assembly name

kubectl get sts,pods,svc -l insights.kx.com/app=market-data
NAME                                    READY   AGE
statefulset.apps/market-data-dap-hdb    3/3     100s
statefulset.apps/market-data-dap-idb    3/3     110s
statefulset.apps/market-data-dap-rdb    3/3     109s
statefulset.apps/market-data-sm         1/1     111s
statefulset.apps/rt-market-data-north   1/1     117s
statefulset.apps/rt-market-data-south   1/1     117s

NAME                         READY   STATUS    RESTARTS   AGE
pod/market-data-dap-hdb-0    2/2     Running   0          103s
pod/market-data-dap-hdb-1    2/2     Running   0          60s
pod/market-data-dap-hdb-2    2/2     Running   0          43s
pod/market-data-dap-idb-0    2/2     Running   0          113s
pod/market-data-dap-idb-1    2/2     Running   0          70s
pod/market-data-dap-idb-2    2/2     Running   0          51s
pod/market-data-dap-rdb-0    2/2     Running   0          112s
pod/market-data-dap-rdb-1    2/2     Running   0          53s
pod/market-data-dap-rdb-2    2/2     Running   0          22s
pod/market-data-sm-0         5/5     Running   0          114s
pod/rt-market-data-north-0   1/1     Running   0          2m
pod/rt-market-data-south-0   1/1     Running   0          2m

NAME                             TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE
service/market-data-dap-hdb      ClusterIP   10.0.33.80    <none>        8080/TCP,5080/TCP            108s
service/market-data-dap-idb      ClusterIP   10.0.35.111   <none>        8080/TCP,5080/TCP            117s
service/market-data-dap-rdb      ClusterIP   10.0.32.127   <none>        8080/TCP,5080/TCP            117s
service/market-data-sm           ClusterIP   10.0.34.170   <none>        8080/TCP,10001/TCP           2m3s
service/rt-market-data-north-0   ClusterIP   10.0.34.180   <none>        8080/TCP,5001/TCP,5002/TCP   2m9s
service/rt-market-data-south-0   ClusterIP   10.0.35.187   <none>        8080/TCP,5001/TCP,5002/TCP   2m4s

Assembly Configuration

Assembly components and resources are configured under the spec key of the Assembly CR.

Some fields are required; you will be alerted if any are missing when you apply your Assembly.

Optional fields are defaulted by the Operator, but may be overridden as part of your Assembly.

spec:
  description: "My Sample Assembly"
  attach: true
  env:
  - name: CUSTOM_ENV_VAR
    value: "customValue"
  - name: ANOTHER_ENV_VAR
    value: "123"

spec.description

The description key is an optional string field, allowing you to give a brief description of your Assembly.

This field is used to populate the description of the assembly config file read by each of the components.

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| description | string | false | Assembly description | "A KXI Assembly" | Any string |

spec.env

The optional env key allows for Environment Variables to be set for all components within the Assembly.

Custom Environment Variables are listed below the key and will be appended to existing Environment Variables for each of the components created by the Assembly.

| Key | Type | Required | Description |
|---|---|---|---|
| env | list | false | List of Environment Variables |

Expected structure for each item:

- name: ENV_NAME
  value: "A value"

spec.volumes

The optional volumes key allows for standard Kubernetes definitions of volumes to be applied to the Assembly.

Existing volumes may be added from outside of the Assembly CR.

| Key | Type | Required | Description |
|---|---|---|---|
| volumes | list | false | List of Kubernetes volumes |
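
As a rough illustration, the fragment below assumes the list entries follow the standard Kubernetes volume schema, as the description above states. The volume names and the claim name are hypothetical, not values the Operator provides.

```yaml
spec:
  volumes:
    # Scratch space that lives only as long as the pod (standard Kubernetes emptyDir)
    - name: scratch
      emptyDir: {}
    # Hypothetical PVC created outside of the Assembly CR
    - name: shared-data
      persistentVolumeClaim:
        claimName: shared-data-pvc
```

A volume declared here is only mounted into a container once an element references it by name in its volumeMounts list.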

spec.attach

The optional attach key is a boolean field.

When set to true, users can attach to a container: tty and stdin are set to true on each of the containers created by the Assembly.

| Key | Type | Required | Description | Default |
|---|---|---|---|---|
| attach | boolean | false | Enable tty and stdin | false |

spec.tables

The tables key is an optional map field, allowing you to define each of the schemas used within the Assembly.

Under the tables key, each schema is defined under its own key, which represents the schema name.

spec:
...
  tables:
    trace:
      description: Manufacturing trace data
      type: partitioned
      blockSize: 10000
      prtnCol: updateTS
      sortColsOrd: [sensorID]
      sortColsDisk: [sensorID]
      columns:
        - name: sensorID
          description: Sensor Identifier
          type: int
          attrMem: grouped
          attrDisk: parted
          attrOrd: parted
        - name: readTS
          description: Reading timestamp
          type: timestamp
        - name: captureTS
          description: Capture timestamp
          type: timestamp
        - name: valFloat
          description: Sensor value
          type: float
        - name: qual
          description: Reading quality
          type: byte
        - name: alarm
          description: Enumerated alarm flag
          type: byte
        - name: updateTS
          description: Ingestion timestamp
          type: timestamp

Top level schema keys

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| description | string | false | Description of table | "KXI Assembly Schema" | Any string |
| type | string | true | Type of table | | splayed or partitioned |
| columns | list | true | Columns of table | | |
| prtnCol | string | true | A timestamp column name used for data partitioning | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |
| updTsCol | string | false | A timestamp column name used for data latency monitoring | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |
| blockSize | integer | false | Number of rows to keep in-memory before write to disk | | |
| primaryKeys | []string | false | Names of primary key columns | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |
| sortColsMem | []string | false | Names of columns to sort on when stored in memory | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |
| sortColsDisk | []string | false | Names of columns to sort on when stored on disk | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |
| sortColsOrd | []string | false | Names of columns to sort on when stored on disk with an ordinal partition scheme | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |

spec.tables.columns

Under the columns key, columns within the schema are listed, detailing names and types.

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| description | string | false | Description of column | "KXI Assembly Schema Column" | Any string |
| name | string | true | Name of column | | `^[a-zA-Z0-9]+[a-zA-Z0-9_-]*[a-zA-Z0-9]+$` |
| type | string | true | Type of column | | See below |
| attrMem | string | false | Column attribute when stored in memory | | sorted, grouped, parted or unique |
| attrDisk | string | false | Column attribute when stored on disk | | sorted, grouped, parted or unique |
| attrOrd | string | false | Column attribute when stored on disk with an ordinal partition scheme | | sorted, grouped, parted or unique |
| anymap | boolean | false | Allow mapped lists to nest within other mapped lists | | |

The list of supported column type values is:

boolean guid byte short int long real float char symbol timestamp month date datetime timespan minute second time

booleans guids bytes shorts ints longs reals floats string symbols timestamps months dates datetimes timespans minutes seconds times

or leave blank for a mixed type.
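
To show how the schema keys and column types fit together, here is a minimal sketch of a second table. The table name, column names and descriptions are hypothetical; only the keys and type values come from the tables above.

```yaml
spec:
  tables:
    quote:                        # hypothetical schema name
      description: Hypothetical quote data
      type: partitioned
      prtnCol: time               # timestamp column used for partitioning
      updTsCol: time              # timestamp column used for latency monitoring
      sortColsMem: [sym]
      sortColsDisk: [sym]
      columns:
        - name: time
          type: timestamp
        - name: sym
          type: symbol
          attrMem: grouped
          attrDisk: parted
        - name: sizes
          description: One list of longs per row
          type: longs             # plural form taken from the second type list
        - name: note
          description: Free-text comment
          type: string
```

Every name here is at least two characters long and starts and ends with an alphanumeric character, as the validation pattern for names requires.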

spec.labels

The labels key is an optional map field. Labels provide a generic machine-readable representation of the purview of a given Assembly.

spec:
...
  labels:
    type: basic
    assetClass: tick

| Key | Type | Required | Description |
|---|---|---|---|
| labels | map | false | Key value pairs for Assembly file |

This field is used to populate the labels field of the assembly file read by each of the components.

spec.mounts

The mounts key is an optional map field, allowing you to define mounts to be applied to the Assembly.

If a mount's baseURI is a valid file path (prefixed with file:///), a PVC will be created.

Under the mounts key, each mount is defined under its own key, which represents the mount name.

spec:
  mounts:
    rdb:
      type: stream
      baseURI: none
      partition: none
    idb:
      type: local
      baseURI: file:///data/db/idb
      partition: ordinal
      volume:
        storageClass: "rook-cephfs"
        size: "20Gi"
        accessModes:
          - ReadWriteMany
    hdb:
      type: local
      baseURI: file:///data/db/hdb
      partition: date
      dependency:
      - idb
      volume:
        storageClass: "rook-cephfs"
        size: "20Gi"
        accessModes:
          - ReadWriteMany

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| type | string | true | Mount type | | stream, local or object |
| description | string | false | A string describing the purpose of this mount | "KXI Assembly Mount" | Any string |
| baseURI | string | true | URI representing where that data can be mounted by other services | | Value of none or Filepath with prefix of either file:///, gs:// or ms:// |
| partition | string | true | Partitioning scheme for this mount | | none, ordinal, date, month or year |
| dependency | []string | false | If mounted to container, any additional mounts required would be taken from here | | |
| volume | object | false | Specifications for K8s PVC. See Mount Volumes | | |

dependency

Where a mount contains historical data, a dependency on the related intra-day data mount should be specified. This is required because the intra-day mount is the source of the sym file.

spec.mounts.volume

Under each declared mount you can configure the necessary Persistent Volume.

A pre-existing PVC may be mounted to the Assembly by using the claimName field. Alternatively, populating the additional fields will create a new volume for your Assembly.

spec:
  mounts:
...
    data:
      type: local
      baseURI: file:///data/db/data
      partition: date
      volume:
        claimName: data-db-rook-cephfs
    hdb:
      type: local
      baseURI: file:///data/db/hdb
      partition: date
      dependency:
      - idb
      volume:
        storageClass: "rook-cephfs"
        size: "20Gi"
        accessModes:
          - ReadWriteMany

| Key | Type | Required | Description | Default |
|---|---|---|---|---|
| claimName | string | false | Pre-existing PVC claim name | |
| storageClass | string | false | K8s Storage Class | "" |
| size | string | false | K8s Storage size request | "20Gi" |
| accessModes | []string | false | Requested k8s access modes for PVC | ReadWriteOnce |

spec.bus

The bus key is an optional map field, allowing you to define buses to be applied to the Assembly.

The operator will populate bus configuration from a subset of the configuration set under the spec.elements.sequencer field.

The bus field provides information about whatever EMS-like system (or systems) is available to elements within this Assembly for communication.

spec:
...
  bus:
    north:
      protocol: rt
      description: north Sequencer
      topic: basic-assembly-north
      topicPrefix: rt-
      nodes:
      - rt-basic-assembly-north-0:5001

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| protocol | string | true | String indicating the protocol of the messaging system | | custom, kraftmq, aeron or rt |
| description | string | false | A string describing the purpose of this Bus | "Assembly communication bus" | Any string |
| topic | string | false | A string indicating the subset of messages in this stream that consumers are interested in | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| topicPrefix | string | false | Prefix to apply to Service name for Sequencer. Prefix followed by a required "-" | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+-+$` |
| nodes | []string | false | A list of one or more connection strings to machines/services which can be subscribed to | | |
| credentials | string | false | An environment variable indicating where elements should look for credentials related to this bus, if any are needed | | Any string |
| uri | string | false | A URI describing an endpoint/connection string with which to make requests to a bus | | Any string |

spec.elements

The elements key is a map field, allowing you to define each of the components to be deployed as part of the Assembly.

spec:
...
  elements:
    ....

| Key | Type | Required | Description |
|---|---|---|---|
| elements | map | true | Map of component specifications within Assembly |

spec.elements.dap

The dap field under elements allows you to optionally define DA instances within the Assembly.

The operator may be supplied image and port configuration details during installation, but these can be overridden within your Assembly CR.

spec:
...
  elements:
    dap:
      instances:
        idb:
          size: 3
          mountName: idb
        hdb:
          size: 3
          mountName: hdb
        rdb:
          size: 3
          tableLoad: empty
          mountName: rdb
          source: south

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| instances | map | false | Map containing specifications on individual DA instances. | | |

spec.elements.dap.instances

Under the instances key, each DA instance is defined under its own key, which represents the instance name.

The operator sets defaults for dap at install time; these cover target ports and image details.

spec:
...
  elements:
    dap:
      instances:
        idb:
          size: 3
          mountName: idb

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| size | integer | false | Size of the StatefulSet to be deployed | 3 | Minimum 1 |
| image | object | false | Image details for container. See Container Image | | |
| env | list | false | List of Environment Variables | | |
| args | []string | false | Command line args to be passed to container | | |
| mountName | string | false | Name of mount as defined in spec.mounts | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| source | string | false | Sequencer Bus to subscribe to as defined in spec.bus | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| config | map | false | Key-Value pair string map, allowing additional entries to the instance's assembly file | | |
| rtLogVolume | object | false | RT Logs volume. See RT Logs Volume | | |
| volumeMounts | list | false | List of standard Kubernetes Volume Mount definitions. Volume must be present in spec.volumes | | |
| customFile | string | false | Path to additional file for DAP to load. May be full path or use mount name, e.g. /full/path/file.q or $volumemount/file.q | | `^(\/\|\$[a-z0-9]([-a-z0-9]*[a-z0-9[])?)[a-zA-z0-9-_.\/]*.q[_]?$` |
| k8sPolicy | object | false | Kubernetes Pod configurations. See k8sPolicy | | |
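
The optional instance keys can be combined roughly as follows. This is a sketch under assumptions: the volume name custom-code, the ConfigMap name, the mount path and the file name init.q are all hypothetical, and it assumes that the $volumemount form of customFile refers to a volume mount by its name.

```yaml
spec:
  volumes:
    # Hypothetical ConfigMap containing init.q
    - name: custom-code
      configMap:
        name: dap-custom-code
  elements:
    dap:
      instances:
        rdb:
          size: 1
          mountName: rdb
          source: south
          env:
            - name: CUSTOM_ENV_VAR
              value: "customValue"
          # Standard Kubernetes volume mount; the volume must exist in spec.volumes
          volumeMounts:
            - name: custom-code
              mountPath: /opt/custom
          # Assumed to resolve against the volume mount named custom-code
          customFile: "$custom-code/init.q"
```

The full-path form, here /opt/custom/init.q, would be the alternative if the mount-name form does not suit your layout.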

spec.elements.sm

The sm field under elements allows you to optionally define a Storage Manager instance within the Assembly.

The operator sets defaults for sm at install time; these cover target ports and image details.

sm:
  size: 1
  source: south
  tiers:
    - name: streaming
      mount: rdb
    - name: interval
      mount: idb
      schedule:
        freq: 00:10:00
        snap: 00:00:00
    - name: recent
      mount: hdb
      schedule:
        freq: 1D00:00:00
        snap:   01:35:00
      retain:
        time: 3 Months

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| image | object | false | Image details for container. See Container Image | | |
| env | list | false | List of Environment Variables | | |
| args | []string | false | Command line args to be passed to SM container | | |
| eoi | object | false | SM EOI container details, see SM Sub Containers for details | | |
| eod | object | false | SM EOD container details, see SM Sub Containers for details | | |
| dbm | object | false | SM DBM container details, see SM Sub Containers for details | | |
| source | string | true | Either a URI pointed at a static data source or the name of an entry in bus from which to obtain streaming data | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| enforceSchema | boolean | false | Enforce table schemas when persisting (with performance penalty; intended for debugging) | | |
| chunkSize | integer | false | Chunk size employed by SM when writing tables | | Minimum 0 |
| sortLimitGB | integer | false | Memory limit when sorting splayed tables or partitions on disk, in GB | | Minimum 0 |
| waitTm | integer | false | Time between connection attempts in milliseconds | | Minimum 0 |
| eodPeachLevel | string | false | Level at which EOD peaches to parallelise HDB table processing | | part or table |
| disableREST | boolean | false | Disables REST interface - only qIPC will be supported | | |
| config | map | false | Key-Value pair string map, allowing additional entries to the instance's assembly file | | |
| rtLogVolume | object | false | RT Logs volume. See RT Logs Volume | | |
| tiers | list | true | A list of storage tiers. See SM Tiers | | |
| volumeMounts | list | false | List of standard Kubernetes Volume Mount definitions. Volume must be present in spec.volumes | | |
| k8sPolicy | object | false | Kubernetes Pod configurations. See k8sPolicy | | |

spec.elements.sm.container

The Storage Manager element is a multi-container StatefulSet. sm is the main container and is configured at the top level of the SM element.

The additional sub containers eoi, eod and dbm may each be configured within their own sub container object.

sm:
...
  eoi:
    args:
  eod:
    args:
...

| Key | Type | Required | Description |
|---|---|---|---|
| image | object | false | Image details for container. See Container Image |
| port | integer | false | Integer representing port of container |
| args | []string | false | Command line args to be passed to the sub container |
| env | list | false | List of Environment Variables |
| volumeMounts | list | false | List of standard Kubernetes Volume Mount definitions. Volume must be present in spec.volumes |
| resources | object | false | Set container resource limits and requests. See k8sPolicy.resources for details on the resources key. |
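
A sub container might be tuned along these lines. This is a sketch only: it assumes the resources key follows the standard Kubernetes requests/limits structure, and the environment variable name and all quantities are hypothetical.

```yaml
sm:
  # source, tiers and the other top-level sm settings are omitted here
  eod:
    env:
      - name: EOD_EXAMPLE_VAR      # hypothetical variable name
        value: "1"
    resources:
      requests:
        cpu: "500m"
        memory: "2Gi"
      limits:
        memory: "4Gi"
  dbm:
    resources:
      limits:
        memory: "1Gi"
```

Settings placed under eoi, eod or dbm apply only to that sub container, while the same keys at the top level of sm configure the main container.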

spec.elements.sm.tiers

The Storage Manager element requires that tiers are configured. Tiers describe the locality, segmentation format, and rollover configuration of each storage tier.

The tiers object is a list that holds one entry per storage tier.

sm:
...
  tiers:
    - name: streaming
      mount: rdb
    - name: interval
      mount: idb
      schedule:
        freq: 00:10:00
        snap: 00:00:00
    - name: recent
      mount: hdb
      schedule:
        freq: 1D00:00:00
        snap:   01:35:00
      retain:
        time: 3 Months

| Key | Type | Required | Description | Validation |
|---|---|---|---|---|
| name | string | true | A string used to refer to a particular tier | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| description | string | false | A string describing the purpose of this tier | Any string |
| store | string | false | A URI describing where this tier will physically store data | Value of realtime or Filepath with prefix of either file:///, gs:// or ms:// |
| mount | string | true | The name of a corresponding mounts entry | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| schedule | object | false | Policy for when rollovers should be considered. See Tier Schedule | |
| retain | object | false | Policy for how much data should be stored in this tier before it is rolled over into the next tier. See Tier Retain | |
| compression | object | false | Policy for compression of data, if any. See Tier Compression | |

spec.elements.sm.tiers.schedule

A tier's schedule object defines that tier's rollover policy.

Rollover can take place at a set frequency and/or at snapshot intervals.

sm:
...
  tiers:
    - name: streaming
      store: realtime
    - name: interval
      mount: idb
      schedule:
        freq: 00:10:00
        snap: 00:00:00

| Key | Type | Required | Description | Validation |
|---|---|---|---|---|
| freq | string | false | A timespan, in Q notation. How often should this tier roll data over into the next tier - e.g. 00:00:00 or 0D00:00:00 format | `^([0-9]*D)?([0-1]?[0-9]\|2[0-3]):[0-5][0-9]:[0-5][0-9]$` |
| snap | string | false | A time, in Q notation. At what whole multiples of time should rollovers be scheduled - e.g. 00:00:00 or 0D00:00:00 format | `^([0-9]*D)?([0-1]?[0-9]\|2[0-3]):[0-5][0-9]:[0-5][0-9]$` |

spec.elements.sm.tiers.retain

A tier's retain object defines that tier's data retention policy, which determines how much data is kept before it is rolled over into another tier.

sm:
...
  tiers:
    - name: recent
      mount: hdb
      retain:
        time: 3 Months

| Key | Type | Required | Description | Validation |
|---|---|---|---|---|
| time | string | false | A timespan consisting of a number followed by a unit, e.g. 2 Years. Rollover occurs for data which has been stored for this length of time | `^[1-9]+[0-9]* ?(Years\|Months\|Weeks\|Days\|Hours\|Minutes)$` |
| size | string | false | A size in byte units consisting of a number followed by a unit {EB,TB,GB,MB,KB}, e.g. 2 TB. Rollover occurs for data which would exceed this permitted capacity | `^[1-9]+[0-9]* ?(EB\|TB\|GB\|MB\|KB)$` |
| sizePct | integer | false | A size as percentage of total storage of corresponding mount, specified as a number from 1 to 100 | 1 - 100 |
| rows | integer | false | Rollover occurs for data beyond this number of rows | Minimum 1 |
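
Retention does not have to be time based. The fragment below is a sketch of a capacity-based policy, with the other retain keys shown commented out as alternatives; the specific numbers are illustrative only.

```yaml
sm:
  tiers:
    - name: recent
      mount: hdb
      retain:
        # Roll over data that would take the tier beyond 2 TB
        size: 2 TB
        # Alternatives from the table above:
        # sizePct: 80        # percentage of the hdb mount's total storage
        # rows: 100000000    # maximum number of rows kept in the tier
```

The value 2 TB matches the size validation pattern, which expects a positive integer, an optional space, and one of the listed units.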

spec.elements.sm.tiers.compression

A tier's compression object defines that tier's data compression policy. The compression policy currently applies only to tiers associated with a mount of type: local and partition: date.

sm:
...
  tiers:
    - name: recent
      mount: hdb
      compression:
        algorithm: gzip
        level: 9
        block: 1000

| Key | Type | Required | Description | Validation |
|---|---|---|---|---|
| algorithm | string | false | A compression algorithm | none, qipc, gzip, snappy or lz4hc |
| block | integer | false | Block size | |
| level | integer | false | Compression level | |

spec.elements.sequencer

The sequencer field under elements allows you to optionally define multiple Sequencer instances within the Assembly.

The operator sets defaults for sequencer at install time; these cover target ports and image details.

Under the sequencer key, each Sequencer instance is defined under its own key, which represents the instance name.

spec:
...
  elements:
...
    sequencer:
      north:
        size: 3
        external: true
        externalNodePort: true
        protocol: "rt"
        topicConfig:
          subTopic: "data"

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| size | integer | false | Size of the StatefulSet to be deployed | 3 | Limited to 3 |
| external | boolean | true | External facing Sequencer, setting true enables External IP | "false" | |
| externalNodePort | boolean | true | Use Node Port Type for externally facing Sequencer service | "false" | |
| image | object | false | Image details for container. See Container Image | | |
| env | list | false | List of Environment Variables | | |
| args | []string | false | Command line args to be passed to container | | |
| topicConfig | object | false | Sequencer Topic Configurations. See Sequencer Topics Config | | |
| volume | object | false | RT Sequencer directory paths. See RT Volume | | |
| topicConfigDir | string | false | Location of RT 'pull' directory | "/config/topics/" | `^[\/]+[a-zA-Z0-9\/-_]*$` |
| volumeMounts | list | false | List of standard Kubernetes Volume Mount definitions. Volume must be present in spec.volumes | | |
| k8sPolicy | object | false | Kubernetes Pod configurations. See k8sPolicy | | |
| archiver | object | false | Sequencer Archiver. See Sequencer Archiver | | |

spec.elements.sequencer.topicConfig

The topicConfig object allows the topic configuration to be set for a Sequencer. It covers the topic the Sequencer subscribes to and the topic it publishes.

spec:
...
  elements:
...
    sequencer:
      south:
        external: false
        topicConfig:
          subTopic: "north"
      north:
        external: true
        topicConfig:
          subTopic: "data"
          topic: "north"

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| topicPrefix | string | false | Prefix to apply to Service name for Sequencer. Prefix followed by a required "-" | "rt-" | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+-+$` |
| subTopic | string | false | Topic for Sequencer to subscribe to | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| topic | string | false | Topic generated by Sequencer | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| externalTopicPrefix | string | false | Prefix to apply to Service name for Sequencer. Prefix followed by a required "-" | | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+-+$` |

spec.elements.sequencer.volume

The volume object allows you to configure the Sequencer's RT log volume. This is the volume containing the Sequencer logs for state, subscribing and publishing topics.

spec:
...
  elements:
...
    sequencer:
      south:
        volume:
          mountPath: "/s/"
          subPaths:
            in: "in"
            out: "out"
            cp: "state"
          size: "20Gi"

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| mountPath | string | false | Mount location of volume | "/s/" | `^[\/]+[a-zA-Z0-9\/-_]*$` |
| accessModes | []string | false | Requested k8s access modes for PVC | | |
| storageClass | string | false | K8s Storage Class | | |
| size | string | false | K8s Storage size request | "20Gi" | |
| subPaths | object | false | Sub directories under Mount location | | |
| subPaths.in | string | false | Location of RT 'in' sub directory | "in" | `^[a-zA-Z0-9-_]+$` |
| subPaths.out | string | false | Location of RT 'out' sub directory | "out" | `^[a-zA-Z0-9-_]+$` |
| subPaths.cp | string | false | Location of RT 'cp' sub directory | "state" | `^[a-zA-Z0-9-_]+$` |

spec.elements.sequencer.archiver

Each Sequencer has the option to enable an Archiver deployment. The Archiver deployment clears the Sequencer's log files based on log size or age.

spec:
...
  elements:
...
    sequencer:
      south:
        archiver:
          retentionDuration: 10080
          maxLogSize: 5

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| retentionDuration | integer | false | Log retention in minutes | | |
| maxLogSize | string | false | Maximum log size | | `^([+-]?[0-9.]+)([eEinukmgtpKMGTP]*[-+]?[0-9]*)$` |
| maxDiskUsagePercent | integer | false | Max disk utilization | | |

spec.elements.sp

The sp field under elements allows you to optionally define Stream Processor pipelines within the Assembly.

The operator sets defaults for sp at install time; these cover target ports and image details.

spec:
...
  elements:
...
    sp:
      description: Processor of streams
      pipelines:
        sdtransform:
          type: spec
          protectedExecution: false
          source: north
          destination: south
          minWorkers: 1
          maxWorkers: 1
          workerThreads: 4
          spec: |-
              sensor: ([]sensorID:`g#"i"$();extSensorID:`$();name:`$();typ:"x"$();createTS:"p"$();updateTS:"p"$());
              trace: ([]sensorID:`g#"i"$();readTS:"p"$();captureTS:"p"$();valFloat:"f"$();qual:"h"$();alarm:"x"$();updateTS:"p"$());

              .enum.alarm:``NORMAL`HIGH!(::;0x01;0x02)
              .enum.qual:``GOOD!(::;0x01)

              // Incoming event format
              // list(symbol;dict)

              // Transformations:
              // - format into table
              // - scale values
              // - translate timestamps
              // - set alarm based off values
              // - sort by sensorID and readTS
              traceMapFn:{
                  rawData:x 1;
                  svals:rawData[`val]*rawData`scaling;
                  rts:rawData[`ts]+rawData`timeOffset;
                  (`trace;
                    `sensorID`readTS xasc flip cols[trace]!(rawData`id;rts;rawData`ts;svals;.enum.qual`GOOD;
                      ?[5000f<svals;.enum.alarm`HIGH;.enum.alarm`NORMAL];.z.p)
                  )
                  }

              logLatency:{
                  if[count l:.qsp.get[`latencyCacher; ::];log.info("Approximate ingress ems latency, %N";l)];
                  }

              .tm.add[`latency;(`logLatency;::);10000;0];

              .qsp.run
                  .qsp.read.fromRT[]
                  .qsp.map[{[op;md;data] .qsp.set[op;md;.z.p-data 2];data}; .qsp.use`name`state!(`latencyCacher; ())]
                  .qsp.filter[{`trace=x 0}]
                  .qsp.map[traceMapFn]
                  .qsp.write.toRT[]

| Key | Type | Required | Description | Default | Validation |
|---|---|---|---|---|---|
| description | string | false | SP Workers description | "SP Pipelines" | Any string |
| pipelines | map | false | Map of SP Pipelines | | |

spec.elements.sp.pipelines

Under the pipelines key, each Pipeline is defined under its own key, which represents the pipeline name.

| Key | Type | Required | Description | Validation |
|---|---|---|---|---|
| type | string | false | Set Pipeline type | graph or spec |
| group | string | false | Groups a pipeline into a set of replicas that have a matching group id | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| env | list | false | List of Environment Variables | |
| protectedExecution | boolean | false | Enable Protected Execution | |
| secrets | []string | false | Pre-configured Kubernetes secrets to inject into pipeline | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| configMaps | []string | false | Pre-configured Kubernetes config maps to inject into pipeline | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| imagePullSecrets | []string | false | Pre-configured Kubernetes imagePullSecrets to inject into pipeline | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| base | string | false | Sets the pipeline worker image to one of the prebuilt workers depending on language and functionality. Supported platforms are q, machine learning + q or Python. | q, q-ml, py or py-ml |
| controllerImage | object | false | Override the default Controller Image. See Container Image | |
| workerImage | object | false | Override the default Worker Image. See Container Image | |
| minWorkers | integer | false | Minimum worker instances | Minimum 1 |
| maxWorkers | integer | false | Maximum worker instances | Minimum 1 |
| workerThreads | integer | false | Worker thread count | Minimum 1 |
| spec | string | true | Pipeline spec | |
| source | string | false | Sequencer Bus to subscribe to | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| destination | string | false | Sequencer Bus to publish to | `^[a-z0-9]+[a-z0-9-]*[a-z0-9]+$` |
| persistence | object | false | Persistence configuration for controller and worker containers. See SP Persistence | |
| config | map | false | Key-Value pair string map, allowing additional entries to the instance's assembly file | |
| controllerK8sPolicy | object | false | Kubernetes Pod configurations. See k8sPolicy | |
| workerK8sPolicy | object | false | Kubernetes Pod configurations. See k8sPolicy | |
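
For comparison with the larger example earlier in this section, a much smaller pipeline might look like the sketch below. The pipeline name, Secret name and ConfigMap name are hypothetical and would need to exist in the cluster beforehand. The spec body only reuses the reader and writer calls from the earlier example, so it passes messages through unchanged.

```yaml
spec:
  elements:
    sp:
      pipelines:
        passthrough:                   # hypothetical pipeline name
          type: spec
          base: q
          source: north
          destination: south
          minWorkers: 1
          maxWorkers: 1
          secrets:
            - pipeline-credentials     # hypothetical pre-existing Secret
          configMaps:
            - pipeline-settings        # hypothetical pre-existing ConfigMap
          spec: |-
              .qsp.run
                  .qsp.read.fromRT[]
                  .qsp.write.toRT[]
```

The only required key in the table is spec; every other key shown here is optional and falls back to the operator's defaults if omitted.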

spec.elements.sp.pipelines.persistence

The SP persistence object allows you to define checkpoints for both the controller and worker within a pipeline. A PVC may be defined as well as the frequency of each checkpoint.

spec:
...
  elements:
...
    sp:
      description: Processor of streams
      pipelines:
        sdtransform:
          persistence:
            controller:
              disabled: false
              checkpointFreq: 1000
              size: "100Gi"
              class: ""
            worker:
              disabled: true

| Key | Type | Required | Description | Validation |
|---|---|---|---|---|
| controller | object | false | Persistence configuration for controller container | |
| controller.disabled | boolean | false | Disable persistence for controller | |
| controller.checkpointFreq | integer | false | Checkpoint frequency in milliseconds | |
| controller.size | string | false | Volume size for checkpoint | `^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$` |
| controller.class | string | false | Requested volume Storage class | |
| worker | object | false | Persistence configuration for worker container | |
| worker.disabled | boolean | false | Disable persistence for worker | |
| worker.checkpointFreq | integer | false | Checkpoint frequency in milliseconds | |
| worker.size | string | false | Volume size for checkpoint | `^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$` |
| worker.class | string | false | Requested volume Storage class | |

Advanced Configurations

Defaults supplied by the operator during installation may be overridden within your Assembly CR definition.

Element Container

Each element has at least one image field. Some elements can define multiple images, depending on how they are deployed.

Each of those image fields shares the same structure.

spec:
...
    image:
      repo: "image.repo.custom.com/"
      component: "component"
      tag: "1.2.3"

| Key | Type | Required | Description |
|---|---|---|---|
| repo | string | false | Image repository |
| component | string | false | Image component name |
| tag | string | false | Image tag |

RT Logs Volume

The spec.elements.sm and spec.elements.dap services allow you to define the location and size of the RT logs PVC used for storing Sequencer logs. This is key to ensuring your services have enough disk space for your ingestion profile.

spec:
...
    rtLogVolume:
      mountPath: "/logs/rt/"
      persistLogs: true
      size: "20Gi"

| Key | Type | Required | Description | Default | Validation |
| --- | --- | --- | --- | --- | --- |
| persistLogs | boolean | false | Create PVC for RT Logs | true | |
| mountPath | string | false | Mount location of volume | "/logs/rt/" | ^[\/]+[a-zA-Z0-9\/-_]*$ |
| accessModes | []string | false | Requested k8s access modes for PVC | ReadWriteOnce | |
| storageClass | string | false | K8s Storage Class | | |
| size | string | false | K8s Storage size request | "20Gi" | |
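
As an illustration, the fragment below sets every key from the table on the Storage Manager element. It is a sketch only: the placement under spec.elements.sm follows the description above, the storage class name fast-ssd is hypothetical and must match a class that exists in your cluster, and 50Gi is an arbitrary size rather than a recommendation.

```yaml
spec:
  elements:
    sm:
      rtLogVolume:
        persistLogs: true          # create a PVC for the RT logs
        mountPath: "/logs/rt/"     # must match the mountPath validation pattern
        accessModes:
          - ReadWriteOnce
        storageClass: "fast-ssd"   # hypothetical storage class name
        size: "50Gi"               # illustrative size for a heavier ingestion profile
```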

spec.sandbox

The operator may deploy Assemblies as sandboxes.

The sandbox key is an optional boolean field. When set to true, it tells the Operator to deploy the Assembly as a sandbox, which sets mounts to read-only and prevents PVCs from being created.

spec:
  sandbox: true
| Key | Type | Required | Description |
| --- | --- | --- | --- |
| sandbox | boolean | false | Deploy Assembly as a sandbox |

spec.imagePullSecrets

The operator may be supplied imagePullSecrets during installation.

The optional list field imagePullSecrets allows additional secrets to be supplied for this Assembly.

spec:
  imagePullSecrets:
    - name: image-secret-cred
| Key | Type | Required | Description |
| --- | --- | --- | --- |
| imagePullSecrets | list | false | List of image pull secrets |

spec.imagePullPolicy

The optional string field imagePullPolicy allows the image pull policy to be set for all Assembly components. When no policy is set, Kubernetes applies IfNotPresent, unless the image tag is latest or is omitted, in which case it applies Always.

spec:
  imagePullPolicy: "Always"
| Key | Type | Required | Description |
| --- | --- | --- | --- |
| imagePullPolicy | string | false | Image pull policy to apply to all components |

spec.podSecurityContext

Under the spec key of the CR, you define the podSecurityContext for your Assembly.

spec:
  podSecurityContext:
    fsGroup: 1000
    runAsUser: 1000

The podSecurityContext key is optional. When provided, it allows the user to set the fsGroup and runAsUser keys for their Assembly.

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| podSecurityContext | object | false | Pod Security |
| podSecurityContext.fsGroup | integer | false | Files in volumes mounted by the Assembly will be owned by this group ID |
| podSecurityContext.runAsUser | integer | false | All processes in the Assembly's containers run with this user ID |

Defaults or set values will be applied to all resources created from the Assembly.

spec.license

The operator may be supplied license details for the KX On Demand License during installation.

The optional license key allows custom license details to be applied to the Assembly. These will be used for each of the components created from the Assembly.

spec:
  license:
    lic_user: "User Name"
    lic_email: "u.name@custom.com"
    lic_secret: "my-kx-secret"
    lic_type: "onDemand"
    kxAcct: "insights.kx-acc-svc:5000"
| Key | Type | Required | Description |
| --- | --- | --- | --- |
| license | object | false | KX License details |
| license.lic_user | string | false | License owner name |
| license.lic_email | string | false | License owner e-mail address |
| license.lic_secret | string | false | Name of pre-existing secret containing KX License |
| license.lic_type | string | false | KX License type, for example "onDemand" |
| license.kxAcct | string | false | KX Account aggregator service endpoint |

spec.qlog

The operator may be supplied qlog configuration during installation.

Custom configuration for qlog may be set within the Assembly.

Qlog configuration is applied to all containers created by the Assembly.

spec:
...
  qlog:
    directory: "/opt/kx/config"
    formatMode: "text"
    endpoints:
      - "fd://stdout"
    routings:
      ALL: "INFO"

Each default can be overridden by setting the relevant key under qlog.

| Key | Type | Required | Description | Default | Validation |
| --- | --- | --- | --- | --- | --- |
| directory | string | false | Directory path to mount qlog config | "/opt/kx/config" | ^[\/]+[a-zA-Z0-9\/-_]*$ |
| endpoints | []string | false | List of endpoints (can provide multiple) | {"fd://stdout"} | |
| formatMode | string | false | QLog logging format | "json" | text or json |
| routings | map | false | ALL is the default routing, can also use ALL as a wildcard for all levels | {"":"INFO"} | |
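
Because each default can be overridden independently, a minimal override only needs the keys that change. The sketch below raises the verbosity of the ALL routing and leaves every other qlog key at the operator default. It assumes that DEBUG is an accepted level name; only INFO appears in this document, so check the level names supported by your qlog version.

```yaml
spec:
  qlog:
    routings:
      ALL: "DEBUG"   # assumed level name; INFO is the only level shown above
```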

spec.sideCar

The operator may be supplied sideCar configuration during installation.

Custom configuration for sideCar may be set within the Assembly.

Sidecar configuration is applied to all containers created from the Assembly.

spec:
...
  sideCar:
    image:
      repo: "image.repo.custom.com/"
      component: "kxi-sidecar"
      tag: "0.9.0"
    port: 8080
    frequencySecs: 5
    configDir: "/opt/app/config/"
| Key | Type | Required | Description | Default | Validation |
| --- | --- | --- | --- | --- | --- |
| image | object | false | Image details for Side Car container | | |
| image.repo | string | false | Image repository | | |
| image.component | string | false | Image component name | | |
| image.tag | string | false | Side Car image tag | | |
| port | integer | false | Side Car container port | "8080" | |
| frequencySecs | integer | false | Frequency of connection attempt from Side Car to main container | 5 | |
| configDir | string | false | Mount location of Side Car configuration map | "/opt/kx/sideCarConfig/" | |

spec.discovery

The operator may be supplied discovery configuration during installation.

Custom configuration for discovery may be set within the Assembly.

Discovery configuration is applied to all containers created from the Assembly.

spec:
...
  discovery:
    enabled: true
    registry: "kxi-discovery-service:8761"
    heartbeatSecs: 30
    leaseExpirySecs: 90
    callTimeoutSecs: 10
    maxPeriodRetrySecs: 30
    refreshServicesSecs: 60
| Key | Type | Required | Description | Default | Validation |
| --- | --- | --- | --- | --- | --- |
| enabled | boolean | false | Enable Discovery on each of the Assembly components | | |
| callTimeoutSecs | integer | false | Time in seconds until a REST request to the Discovery Service is considered timed out | 10 | |
| heartbeatSecs | integer | false | Interval in seconds between heartbeat requests to the Discovery Service | 30 | |
| leaseExpirySecs | integer | false | Time in seconds the Discovery Service waits after a failed heartbeat before evicting the application | 90 | |
| maxPeriodRetrySecs | integer | false | Maximum period of time in seconds between REST request retries to the Discovery Service | 30 | |
| refreshServicesSecs | integer | false | Time in seconds between Services refresh from the Discovery Service | 60 | |
| registry | string | false | Discovery Service URL | | |
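
The defaults keep leaseExpirySecs at three times heartbeatSecs (90 against 30), so an application is evicted only after several missed heartbeats. The sketch below shortens both values while keeping the same ratio. The numbers are illustrative rather than recommended, and the registry value is the one used in the example above.

```yaml
spec:
  discovery:
    enabled: true
    registry: "kxi-discovery-service:8761"
    heartbeatSecs: 10      # heartbeat every 10 seconds instead of 30
    leaseExpirySecs: 30    # evict after roughly three missed heartbeats
```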

spec.metrics

The operator may be supplied metrics configuration during installation.

Custom configuration for metrics may be set within the Assembly.

Metrics configuration is applied to all containers created from the Assembly.

spec:
...
  metrics:
    enabled: true
    frequency: 5
    handler:
      po: true
      pc: true
      wo: true
      wc: true
      pg: true
      ps: true
      ws: true
      ph: true
      pp: true
      ts: true
    useAnnotations: false
    serviceMonitor:
      enabled: true
      interval: "10s"
      path: "/metrics"
      release: kx-prom
| Key | Type | Required | Description | Default | Validation |
| --- | --- | --- | --- | --- | --- |
| enabled | boolean | false | Enable metrics on each of the Assembly components | | |
| frequency | integer | false | Frequency in seconds at which the Side Car scrapes metrics from the main container | 5 | |
| handler | map | false | Enable or disable capture of .z. handler metrics | | |
| handler.po | boolean | false | Enable metrics collection of the .z.po handler | true | |
| handler.pc | boolean | false | Enable metrics collection of the .z.pc handler | true | |
| handler.wo | boolean | false | Enable metrics collection of the .z.wo handler | true | |
| handler.wc | boolean | false | Enable metrics collection of the .z.wc handler | true | |
| handler.pg | boolean | false | Enable metrics collection of the .z.pg handler | true | |
| handler.ps | boolean | false | Enable metrics collection of the .z.ps handler | true | |
| handler.ws | boolean | false | Enable metrics collection of the .z.ws handler | true | |
| handler.ph | boolean | false | Enable metrics collection of the .z.ph handler | true | |
| handler.pp | boolean | false | Enable metrics collection of the .z.pp handler | true | |
| handler.ts | boolean | false | Enable metrics collection of the .z.ts handler | true | |
| useAnnotations | boolean | false | Where metrics are enabled and the ServiceMonitor is disabled, annotations may be applied to the Pod to allow metrics scraping | | |
| serviceMonitor | map | false | Service monitor details for Assembly | | |
| serviceMonitor.enabled | boolean | false | Enable the Service Monitor resource for the Assembly components | | |
| serviceMonitor.interval | string | false | Service monitor scrape interval | "10s" | ^((([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?\|0)$ |
| serviceMonitor.path | string | false | Service monitor target metrics endpoint | "/metrics" | ^[\/]+[a-zA-Z0-9-_]*[a-zA-Z0-9]+$ |
| serviceMonitor.release | string | false | Existing prometheus release name | | ^[a-zA-Z0-9]+[a-zA-Z0-9-]*[a-zA-Z0-9]+$ |
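
The useAnnotations description implies a second way to expose metrics: annotations on the Pod instead of a ServiceMonitor resource. A minimal sketch of that combination follows. Turning off the .z.ts handler is included only to show that handlers can be switched off individually; whether that is useful depends on your workload.

```yaml
spec:
  metrics:
    enabled: true
    frequency: 5
    handler:
      ts: false            # illustrative: skip .z.ts metrics, other handlers keep their defaults
    useAnnotations: true   # rely on Pod annotations for scraping
    serviceMonitor:
      enabled: false       # no ServiceMonitor resource is created
```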

spec.resourceAnnotation

The optional object field resourceAnnotation allows annotations to be defined for each of the Assembly resources. Any resource generated by this Assembly will have these annotations defined.

spec:
  resourceAnnotation:
    simple.annotation: value

k8sPolicy

Each element defined within the Assembly has a k8sPolicy object. This object allows additional Kubernetes configuration to be applied to the StatefulSets created by the Assembly CR.

spec:
...
  elements:
    dap:
      instances:
        idb:
          k8sPolicy: {}
        hdb:
          k8sPolicy: {}
    sm:
      k8sPolicy: {}
    sequencer:
      north:
        k8sPolicy: {}
      south:
        k8sPolicy: {}

Within the k8sPolicy, the instance's ServiceAccount and PodSecurityContext can be defined.

Additional advanced configuration can be set.

k8sPolicy.serviceAccount

The serviceAccount field allows the user to define the name of the service account to be used.

k8sPolicy:
  serviceAccount: "my-svc-acc"

This may be a new service account name, or the name of a pre-existing service account. If the account already exists, the create field of serviceAccountConfigure should be set to false.

k8sPolicy.serviceAccountConfigure

The serviceAccountConfigure field allows the user to override defaults for a component's Service Account.

k8sPolicy:
  serviceAccountConfigure:
    create: false
    automountServiceAccountToken: false

By default the KXI Operator will create a service account for each component.

This configuration allows creation of the service account to be disabled, and the automatic mounting of a service account token to be disabled as well. See the Kubernetes documentation on configuring service accounts for Pods for additional information.

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| create | boolean | false | Create a service account for component |
| automountServiceAccountToken | boolean | false | Mount the service account token to the container |
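
The two fields are typically used together when a component should run under a service account that already exists. The sketch below shows that pairing for the Storage Manager element. The account name my-svc-acc is the illustrative name used above and is assumed to already exist in the Assembly's namespace.

```yaml
spec:
  elements:
    sm:
      k8sPolicy:
        serviceAccount: "my-svc-acc"   # pre-existing account (assumed to exist)
        serviceAccountConfigure:
          create: false                # stop the operator creating a new account
```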

k8sPolicy.resources

The resources field allows the user to define container resources.

k8sPolicy:
  resources:
    requests:
      memory: "64Mi"
      cpu: "250m"
    limits:
      memory: "128Mi"
      cpu: "500m"

Resource configuration allows limits and requests to be set for containers. The Kubernetes scheduler will attempt to place a pod on a node with sufficient resources to meet its requests, and limits are enforced on the pod's containers at runtime.

Resource Limits

When a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation with an out of memory (OOM) error.

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| resources | object | false | Parent object to define requests and limits |
| resources.requests | object | false | Requested resources for Pod container |
| resources.requests.memory | string | false | Requested container memory in bytes. You can express memory as a plain integer or as a fixed-point number with a suffix such as Mi or Gi. See the Kubernetes documentation on resource management for Pods and containers for more details |
| resources.requests.cpu | string | false | Requested container CPU in units of Kubernetes CPUs |
| resources.limits | object | false | Enforce resource limits on a Pod's container |
| resources.limits.memory | string | false | Enforced maximum memory in bytes. You can express memory as a plain integer or as a fixed-point number with a suffix such as Mi or Gi. See the Kubernetes documentation on resource management for Pods and containers for more details |
| resources.limits.cpu | string | false | Enforced CPU usage limit in units of Kubernetes CPUs |
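
Because k8sPolicy is set per element, each element can be sized independently. The sketch below gives two DAP instances different requests and limits, using the element layout shown in the k8sPolicy section. All quantities are arbitrary placeholders rather than sizing guidance.

```yaml
spec:
  elements:
    dap:
      instances:
        idb:
          k8sPolicy:
            resources:
              requests:
                memory: "1Gi"
                cpu: "500m"
              limits:
                memory: "2Gi"
                cpu: "1"
        hdb:
          k8sPolicy:
            resources:
              requests:
                memory: "4Gi"
                cpu: "1"
              limits:
                memory: "8Gi"
                cpu: "2"
```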

k8sPolicy.nodeSelector

The nodeSelector field allows the user to define nodeSelector configuration for Pods.

k8sPolicy:
  nodeSelector:
    disktype: ssd

It specifies a map of key-value pairs. For the pod to be eligible to run on a node, the node must have each of the indicated key-value pairs as labels (it can have additional labels as well). The most common usage is one key-value pair.

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| nodeSelector | object | false | Map of key-value pairs |
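
When more than one pair is given, a node must carry all of them to be eligible. The sketch below combines the custom disktype label from the example above, which is assumed to have been applied to the target nodes by a cluster administrator, with kubernetes.io/arch, a well-known label that Kubernetes sets on every node.

```yaml
k8sPolicy:
  nodeSelector:
    disktype: ssd               # custom label, must already be present on the nodes
    kubernetes.io/arch: amd64   # well-known label populated by Kubernetes
```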

k8sPolicy.affinity

The affinity field allows the user to define Affinity and anti-affinity configuration for Pods.

k8sPolicy:
  affinity:
    nodeAffinity:
      ...
    podAffinity:
      ...
    podAntiAffinity:
      ...

Affinity allows for more control over pod scheduling. Where nodeSelector may be used for simple scheduling, affinity rules allow for a greater range of constraints.

There are currently two types of node affinity, called requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution.

requiredDuringSchedulingIgnoredDuringExecution will only schedule pods on nodes matching the criteria. preferredDuringSchedulingIgnoredDuringExecution will attempt to schedule pods on matching nodes but, if none is available, will fall back to non-matching nodes.

Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled on, based on labels on pods that are already running on the node rather than on labels on the nodes themselves.

The rules are of the form "this pod should (or, in the case of anti-affinity, should not) run in an X if that X is already running one or more pods that meet rule Y", where X is a topology domain such as a node or zone. Y is expressed as a LabelSelector with an optional associated list of namespaces.

As with nodeAffinity, there are two types of pod affinity and anti-affinity, called requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution.

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| affinity.nodeAffinity | object | false | Node Affinity is a set of conditions a node must meet for the Pod to be scheduled on it |
| affinity.podAffinity | object | false | Pod Affinity is a set of conditions on other Pods that, when met, attract the Pod to a node |
| affinity.podAntiAffinity | object | false | Pod Anti-affinity is a set of conditions on other Pods that, when met, keep the Pod away from a node |
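
A fuller sketch of both rule types follows, on the assumption that k8sPolicy.affinity accepts the standard Kubernetes Affinity schema. The node rule is a hard requirement for nodes labelled disktype=ssd. The anti-affinity rule is a soft preference that spreads pods carrying the shared Assembly label across hosts. The disktype label and the market-data Assembly name are illustrative.

```yaml
k8sPolicy:
  affinity:
    nodeAffinity:
      # Hard rule: only nodes labelled disktype=ssd are eligible
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values:
                  - ssd
    podAntiAffinity:
      # Soft rule: prefer hosts not already running pods from this Assembly
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                insights.kx.com/app: market-data
            topologyKey: kubernetes.io/hostname
```

Here topologyKey plays the role of X in the rule form above: with kubernetes.io/hostname, each node is its own topology domain.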

k8sPolicy.tolerations

The tolerations field allows the user to define Tolerations configuration for Pods.

k8sPolicy:
  tolerations:
  - key: "example-key"
    operator: "Exists"
    effect: "NoSchedule"

Nodes may be tainted as a means to prevent certain pods from being scheduled on them. A taint is a key-value pair with an associated effect. Several taints are pre-existing and set by Kubernetes.

Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints.

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| tolerations | object | false | Pod node taint tolerations |
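
The sketch below shows two further toleration forms, assuming the field accepts standard Kubernetes toleration entries. The first matches a taint by key and value; the dedicated=kx taint is hypothetical and would need to be applied to the nodes by an administrator. The second tolerates the built-in node.kubernetes.io/unreachable taint for a bounded time before the pod is evicted.

```yaml
k8sPolicy:
  tolerations:
    # Match a specific key/value taint (hypothetical taint dedicated=kx:NoSchedule)
    - key: "dedicated"
      operator: "Equal"
      value: "kx"
      effect: "NoSchedule"
    # Stay bound to an unreachable node for up to 120 seconds
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 120
```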

k8sPolicy.terminationGracePeriodSeconds

The terminationGracePeriodSeconds field allows the user to define terminationGracePeriodSeconds configuration for Pods.

k8sPolicy:
  terminationGracePeriodSeconds: 60

This sets the length of time in seconds the container will be allowed to shut down before the pod is killed.

| Key | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| terminationGracePeriodSeconds | integer | false | Time in seconds allowed for graceful shutdown before the Pod is killed | 30 |

k8sPolicy.podSecurityContext

The podSecurityContext field allows for the Pod Level Security Context to be configured for the element pods.

k8sPolicy:
  podSecurityContext:
    runAsUser: 1000
    runAsNonRoot: true
    fsGroup: 1000

It holds pod-level security attributes and common container settings. Some fields are also present in container.securityContext. Field values of securityContext take precedence over field values of PodSecurityContext.

k8sPolicy.securityContext

The securityContext allows for Container Level Security Context to be configured.

k8sPolicy:
  securityContext:
    runAsUser: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false

It holds security configuration that will be applied to a container. Some fields are present in both securityContext and podSecurityContext. When both are set, the values in securityContext take precedence.
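
To make the precedence rule concrete, the sketch below sets runAsUser at both levels. Under standard Kubernetes semantics the container would run as user 2000, while fsGroup, which exists only at pod level, still applies to mounted volumes. The user and group IDs are arbitrary, and this document does not state whether securityContext applies to every container in the pod or only to the main one.

```yaml
k8sPolicy:
  podSecurityContext:
    runAsUser: 1000       # pod-level default user
    runAsNonRoot: true
    fsGroup: 1000         # group ownership for mounted volumes
  securityContext:
    runAsUser: 2000       # container-level value takes precedence over 1000
    allowPrivilegeEscalation: false
```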