Query configuration

The kdb Insights Database uses Data Access Processes (DAPs) to serve data for queries. DAPs are split into tiers based on the age of the data. Tiers are typically split into a real-time database (RDB), an intra-day database (IDB), and a historical database (HDB). DAPs are configured under a dap key of the elements field within an assembly file. The data of a given tier is maintained by the mount or mounts for that tier. Different scaling configurations are possible depending on how many tiers are provided to a DAP.

Deployment

For an example that deploys data access processes alongside the other components of a database, see the Docker deployment example.

User interface configuration

This guide discusses configuration using YAML files. If you are using kdb Insights Enterprise, you can instead configure your system using the Insights user interface.

Configuration

Configuration for the Data Access Process is nested under an elements.dap key within an assembly file.

In a microservice deployment, the elements key is top level within the assembly file.

elements:
  dap:
    instances:
      da:
        mountList: [rdb, idb, hdb]

In an Enterprise deployment, the elements key is nested under a spec key within the assembly file.

spec:

  # Other fields ..

  elements:
    dap:
      instances:
        da:
          mountList: [rdb, idb, hdb]

| name | type | required | description |
| --- | --- | --- | --- |
| mountName | string | No | References the name of the mount this DAP will use to surface data. Either mountName or mountList (but not both) must be provided, depending on the desired scaling mode. Providing mountName selects single mount mode. |
| mountList | string[] | No | References a set of mounts this DAP will use to surface data. Either mountList or mountName (but not both) must be provided, depending on the desired scaling mode. Providing mountList selects multiple mount mode. |
| mapPartitions | boolean | No | (Microservices only) Maps partitions from on-disk tables into memory. This consumes more memory but provides faster query results. Defaults to disabled. See .Q.MAP for additional details about mapping partitions. |
| pctMemThreshold | float | No | Limits the amount of memory that is used before the DAP triggers a cache flush, expressed as a decimal value between 0 and 1. When this value is exceeded, the DAP enters low memory mode until the next writedown interval completes. |
| allowPartialResults | boolean | No | (Microservices only) Indicates whether partial query results should be sent to the aggregator when this DAP is in low memory mode. Defaults to enabled. |
| enforceSchema | boolean | No | (Microservices only) Checks incoming data against the target schema. This adds overhead on ingest but ensures data is valid before it is appended to the database. Defaults to disabled. |
| rcEndpoints | string[] | No | (Microservices only) The URL of the resource coordinator to use for this DAP. |
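
As a minimal sketch of how these options might be combined in a microservice deployment, the fragment below tunes a multi-mount DAP. It assumes the optional fields sit at the instance level alongside mountList, and the values are illustrative rather than recommended settings.

```yaml
elements:
  dap:
    instances:
      da:
        mountList: [rdb, idb, hdb]
        # Assumption: the options below are set per instance, next to mountList.
        mapPartitions: true          # use more memory in exchange for faster queries
        pctMemThreshold: 0.8         # illustrative value; must be between 0 and 1
        allowPartialResults: false   # withhold partial results while in low memory mode
        enforceSchema: true          # validate incoming data against the target schema
```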

Enterprise configuration

When running DAPs in an Enterprise deployment, some additional configuration fields can be provided for scaling and Kubernetes integration. The table below outlines the additional properties.

| name | type | required | description |
| --- | --- | --- | --- |
| source | string | No | Required for any DAP that has a stream mount. This field must point to an existing configured stream name. |
| size | string | No | The total number of instance replicas to deploy for a given DAP. Increasing this value provides higher query availability and can increase the number of concurrent queries the database can service. Each instance requires its own memory and CPU resources. The default number of instances is 3. |
| env | object[] | No | Additional environment variables to tune and customize the configuration for this DAP instance. See DAP environment variables below for details. |
| args | string[] | No | Additional command line arguments passed to the q process to customize behavior. See command line arguments for details. |
| image | object | No | A custom image that overrides the version included with this install. This object must include repo, container and tag arguments pointing to the desired DAP image. See the image object configuration below for an example. |
| k8sPolicy | object | No | Additional Kubernetes configuration for DAP instances. This configuration can be used to modify process availability and resource limits. |
| rtLogVolume | object | No | Configuration of the Reliable Transport log volume, which holds the DAP's local copy of real-time event data. The log volume must be large enough to hold stream data for the RT archiver. |
| volumeMounts | string[] | No | Extra volumes to mount on this DAP. These volumes must be listed in spec.volumes of the assembly file. This can be used to mount additional data to the DAP or to mount custom code. |
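
The following sketch shows how a few of these Enterprise fields could appear together on one instance. The stream name, registry, image name and tag are hypothetical placeholders, and the layout assumes each field is set directly on the instance.

```yaml
spec:
  elements:
    dap:
      instances:
        da:
          mountList: [rdb, idb, hdb]
          source: my-stream        # hypothetical; must name an existing configured stream
          size: 2                  # two replicas instead of the default of 3
          image:                   # hypothetical values for the three required keys
            repo: registry.example.com
            container: my-dap-image
            tag: "1.0.0"
```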

Scaling configurations

DAPs can be configured in two different modes depending on the anticipated query requirements. DAPs can either be configured to scale independently per data tier, or uniformly across all data tiers.

Scaling independently

Scaling independently means that you can have more RDBs than IDBs or HDBs, or vice versa. This allows you to tailor your setup to the anticipated query distribution across the data tiers and so maximize query throughput. Each tier consumes its own set of resources and runs in its own container. To use this mode, configure your DAP with the mountName configuration option.

elements:
  dap:
    instances:
      rdb:
        mountName: rdb
      idb:
        mountName: idb
      hdb:
        mountName: hdb
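
In an Enterprise deployment, per-tier replica counts could be expressed by combining mountName with the size field from the Enterprise configuration table. This is a sketch with illustrative counts, assuming most queries target recent data.

```yaml
spec:
  elements:
    dap:
      instances:
        rdb:
          mountName: rdb
          size: 3   # illustrative: the busiest tier gets the most replicas
        idb:
          mountName: idb
          size: 1
        hdb:
          mountName: hdb
          size: 2
```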

Scaling uniformly

To share container resources, you can scale your DAPs uniformly in a single container. This is typically referred to as single DAP mode. The RDB, IDB and HDB all run within that one container, so adding another instance adds another copy of every configured tier. To use this mode, configure your DAP with the mountList configuration option.

elements:
  dap:
    instances:
      db:
        mountList: [rdb, idb, hdb]
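
For comparison, a uniformly scaled Enterprise deployment might set size on the single instance. This is a sketch with an illustrative replica count; each replica would serve all three mounts.

```yaml
spec:
  elements:
    dap:
      instances:
        db:
          mountList: [rdb, idb, hdb]
          size: 2   # illustrative: two containers, each serving the rdb, idb and hdb mounts
```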

Environment variables

Advanced configuration can be supplied to a DAP using environment variables. Environment variables are configured differently depending on the method of deployment. In all cases, the values are strings.

In Docker, environment variables are supplied under an environment key for the target service as a list of key-value pairs.

services:
  db:
    image: ${kxi_da}
    environment:
      - KXI_NAME=db

In kdb Insights Enterprise, environment variables have to be set for each DAP instance within an assembly. They are supplied under the env key as a list of objects, each a pair of name and value.

spec:
  elements:
    dap:
      instances:
        db:
          env:
            - name: KXI_NAME
              value: "db"

In kdb Insights Enterprise, variables can be supplied in the user interface under the advanced query settings option.

| name | description |
| --- | --- |
| KXI_NAME | Process name. |
| KXI_SC | Service class for data access (e.g. RDB, IDB, HDB). Must match value in elements.dap.instances of assembly. |
| KXI_ASSEMBLY_FILE | Assembly YAML file. |
| KXI_PORT | Port. |
| KXI_CUSTOM_FILE | File containing custom code to load in DA processes. |
| KXI_DAP_SANDBOX | Whether this DAP is a sandbox (default: false). |
| KXI_ALLOWED_SBX_APIS | Comma-delimited list of sandbox APIs to allow in non-sandbox DAPs (e.g. .kxi.sql,.kxi.qsql). |
| KXI_DA_RELOAD_STAGGER | Time in seconds between DAPs of the same class reloading after an EOX (default: 30). |
| KXI_DA_USE_REAPER | Whether to use KX Reaper and object storage cache (default: false). |
| KXI_SAPI_HB_FREQ | Time in milliseconds to run the heartbeat to connected processes (default: 30000). |
| KXI_SAPI_HB_TOL | Number of heartbeat intervals a process can miss before being disconnected (default: 2). |
| KXI_GC_FREQ | Frequency in milliseconds to run garbage collection on a timer (default: 600000, set to 0 to disable). |
| KXI_ENABLE_FLUSH | Set to "true" to enable async flush on messages from DA to Agg (default: false). |
| KXI_RT_EVENT_FATAL | If "true", RT badtail and badmsg events are treated as fatal; SM crashes and ingestion stops. If "false" or unspecified, events are logged but ingestion continues. Note that reset events are never treated as fatal. |
| KXI_SG_RC_ADDR | A URL address for an explicit Resource Coordinator for this specific DAP instance to connect to. This must be a fully qualified host name and port. If not specified, the DAP will fall back to Kubernetes label discovery or kdb Insights Service Discovery. |
| SBX_MAX_ROWS | Maximum number of rows, per partitioned table, to store in memory. |
| KX_OBJSTR_INVENTORY_FILE | Set to path relative to the root of the bucket to use an inventory file. |
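
As a sketch of how several of these variables might be set on one Enterprise DAP instance, the fragment below overrides three defaults. The chosen values are illustrative only, and every value is quoted because environment variables are always strings.

```yaml
spec:
  elements:
    dap:
      instances:
        db:
          env:
            - name: KXI_DA_RELOAD_STAGGER
              value: "60"   # seconds between reloads after an EOX (default: 30)
            - name: KXI_SAPI_HB_TOL
              value: "3"    # missed heartbeat intervals tolerated (default: 2)
            - name: KXI_GC_FREQ
              value: "0"    # 0 disables the garbage collection timer
```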

In addition, the following environment variables apply to both the sidecar and DAP images.

| name | container | description |
| --- | --- | --- |
| KXI_CONFIG_FILE | sidecar | Discovery configuration file. |
| KXI_LOG_FORMAT | ALL | Log message format. |
| KXI_LOG_DEST | ALL | Log endpoints. |
| KXI_LOG_LEVELS | ALL | Component routing. |
| KXI_LOG_CONFIG | ALL | Alternative logging configuration: replaces KXI_LOG_FORMAT, KXI_LOG_DEST, and KXI_LOG_LEVELS. |