
Storage Manager configuration

The Storage Manager (SM) takes its configuration from the Assembly Configuration file specified by the KXI_ASSEMBLY_FILE environment variable.

SM expects the following sections to be specified in the assembly.

  • name: short name for this assembly
  • tables: schemas for the tables operated upon within the assembly (dictionary)
  • mounts: mount points for stored data (dictionary)
  • bus: configuration of the message bus used for coordination between elements (dictionary)
  • elements.sm: SM configuration (dictionary); includes source and tiers
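As a rough orientation, the sections listed above could be laid out as in the following skeleton. This is only a sketch: the assembly name and the bus entry name `stream` are hypothetical, and the empty dictionaries stand in for content that a real assembly must supply.

```yaml
name: example-assembly   # hypothetical short name for the assembly
tables: {}               # table schemas go here
mounts: {}               # mount points for stored data go here
bus: {}                  # message bus entries go here
elements:
  sm:
    source: stream       # hypothetical name of an entry under bus
    tiers: []            # storage tiers go here
```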

URI schemes

mounts[X].baseURI and elements.sm.tiers[N].store accept URIs; these may currently use the file:// or s3:// URI schemes. Other schemes may be supported in the future.
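The fragment below sketches where the two URI-valued settings sit, one with each scheme. The local path, bucket name, and tier name are hypothetical, and the mount entry is abbreviated to the one key discussed here.

```yaml
mounts:
  hdb:
    baseURI: file:///data/db/hdb        # hypothetical local path
elements:
  sm:
    tiers:
      - name: objstore                  # hypothetical tier name
        mount: hdb
        store: s3://example-bucket/db   # hypothetical bucket
```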

SM configuration

Configuration options for SM go in the sm entry of elements:

| key | required | purpose | value & default |
|---|---|---|---|
| source | yes | name of bus entry | |
| tiers | yes | storage tiers | list |
| enforceSchema | | whether to enforce table schemas when persisting (with performance penalty; for debugging) | boolean; default false |
| disableREST | | whether to disable the REST interface, leaving only q IPC support | boolean; default false |
| disableDiscovery | | whether to disable registration with discovery | boolean; default false |
| chunkSize | | chunk size used for writing tables | integer; default 500000 |
| sortLimitGB | | memory limit when sorting splayed tables or partitions on disk, in GB | integer; default 10 |
| waitTm | | time to wait between connection attempts, in milliseconds | integer; default 250 |
| eodPeachLevel | | level at which EOD peaches to parallelize HDB table processing | list: part, table, in any combination |
| reloadTimeout | | maximum time SM waits for client to reload | timespan; default 1 hour |

See the deployment example for an example configuration.
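For illustration, an sm entry that spells out the optional keys at their documented defaults might look like the sketch below. The bus entry name `stream` is hypothetical, and writing eodPeachLevel as an inline YAML list is an assumption about the accepted syntax. reloadTimeout is left out because the source does not show how a timespan is written in the file.

```yaml
elements:
  sm:
    source: stream                 # required; hypothetical bus entry name
    tiers: []                      # required; tier entries omitted in this sketch
    enforceSchema: false           # default
    disableREST: false             # default
    disableDiscovery: false        # default
    chunkSize: 500000              # default
    sortLimitGB: 10                # default, in GB
    waitTm: 250                    # default, in milliseconds
    eodPeachLevel: [part, table]   # assumed list syntax; any combination of part and table
```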

Tiers

Tiers describe the locality, segmentation format, and rollover configuration of each storage tier.

A storage tier has the following structure:

| key | required | purpose | value & default |
|---|---|---|---|
| name | yes | | |
| mount | yes | corresponding mounts entry, which determines locality and segmentation format, and also the location at which data in the tier may be accessed | |
| store | | where the tier will physically store data | see below |
| inventory | | object storage inventory file location | see below |
| schedule | | policy for when rollovers should be considered | see below |
| retain | | policy for how much data should be stored in this tier before it is rolled over into the next tier | see below |
| compression | | policy for compression of data | see below |

store

URI describing where this tier will physically store data. If not specified, it defaults to <baseURI>/data of the corresponding mount (this default is enforced, even if store is specified, for mounts of type local with partition:ordinal). Among multiple tiers within the same mount, at most one may omit an explicit store. If specified explicitly, store must be outside the mount's baseURI.
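The store rules can be pictured with two tiers sharing one mount, as in this sketch. All paths and the tier name `archive` are hypothetical; the first tier relies on the default and the second sets store explicitly, outside the mount's baseURI.

```yaml
mounts:
  hdb:
    type: local
    partition: date
    baseURI: file:///data/db/hdb          # hypothetical path
elements:
  sm:
    tiers:
      - name: recent                      # no store: defaults to file:///data/db/hdb/data
        mount: hdb
      - name: archive                     # hypothetical second tier in the same mount
        mount: hdb
        store: file:///data/archive/db    # explicit, and outside the mount's baseURI
```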

schedule

If present, this dictionary contains the following keys.

  • freq: HH:MM:SS Used by the ordinal partition mount (IDB) to specify the length of the interval in each ordinal partition. Default 00:10:00.
  • snap: HH:MM:SS Used by the date partition mount (HDB) to specify when to move data from the ordinal to the date partition mount. Default 00:00:00.

snap

A snap value of 00:01:00 allows late data that arrives between 00:00 and 00:01 and belongs to the previous date partition to be saved to that partition. Late data for the previous date partition that arrives after 00:01:00 is written at the next snap. Data received between 00:00 and 00:01 that belongs to the current date partition is also saved at this time.
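A schedule for each kind of partitioned mount might be written as in the sketch below. The tier and mount names are assumptions; the freq value is the documented default, and the snap value is the one-minute example discussed above.

```yaml
tiers:
  - name: interval
    mount: idb          # assumed to be an ordinal-partitioned mount
    schedule:
      freq: 00:10:00    # each ordinal partition covers 10 minutes (the default)
  - name: recent
    mount: hdb          # assumed to be a date-partitioned mount
    schedule:
      snap: 00:01:00    # move data to the date partition at 00:01, leaving a minute for late data
```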

retain

This dictionary may have one or more of the following keys.

  • time: A timespan consisting of a number followed by a unit: {Years,Months,Weeks,Days,Hours,Minutes}, e.g. 2 Years. Data which has been stored for this length of time is rolled over.
  • sizePct: A size as percentage of total storage of corresponding mount, specified as a number from 1 to 100.

If multiple keys are set, they are interpreted in an inclusive-OR fashion: data is rolled over as soon as any one of the conditions is met.

A mount partitioned as ordinal, or of type stream, cannot be used with a storage tier that has a retain policy.
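A retain policy combining both keys could look like the following sketch. The tier and mount names and the threshold values are illustrative only; the mount is assumed to be date-partitioned, since ordinal and stream mounts cannot carry a retain policy.

```yaml
tiers:
  - name: recent
    mount: hdb          # assumed to be type local with partition date
    retain:
      time: 7 Days      # roll over data that has been stored for 7 days
      sizePct: 80       # or when the size threshold of 80% of the mount's storage is met
```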

compression

If present, this dictionary contains the following keys.

  • algorithm: Compression algorithm: {none, qipc, gzip, snappy, lz4hc}
  • block: Block size
  • level: Compression level

The compression policy currently applies only to tiers associated with a mount of type:local and partition:date.
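A compression policy on a local, date-partitioned tier might be sketched as below. The source does not state the units or valid ranges for block and level, so the two numbers are placeholders chosen purely for illustration and should be checked against the product documentation before use.

```yaml
tiers:
  - name: recent
    mount: hdb            # assumed to be type: local, partition: date
    compression:
      algorithm: gzip     # one of: none, qipc, gzip, snappy, lz4hc
      block: 17           # placeholder value; units and range are an assumption
      level: 6            # placeholder value; valid range is an assumption
```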

inventory

If present, this dictionary contains the following keys.

  • enabled: true or false; whether to write inventory files (default false). If true, location must be provided.
  • location: Location relative to the root of the bucket/storage that the inventory will be written to.

Inventory only applies when using a store that is an object storage URI.

An example configuration, which produces s3://kxi-example-data/inventory/test-db-inventory.tgz, is:

    name: hdb-s3
    mount: hdb
    store: s3://kxi-example-data/db
    inventory:
      enabled: true
      location: inventory/test-db-inventory.tgz

Tiers can be categorized according to their locality and segmentation format, which imply the characteristics and governing rules:

Stream based tier

A stream-based tier represents the in-memory data that is received between write-down events. It is implicit and need not be specified.

Local-ordinal based tier

There must always be one tier that corresponds to a mount of type local with partition ordinal. Its configuration can be omitted, however, in which case the frequency defaults to 10 minutes.

Local-date based tier

There can be one or more tiers that correspond to a mount of type local with partition date. When only one such tier is used, its configuration can be omitted, in which case snap time defaults to midnight, frequency to 1 day, and retain to infinite.
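The three categories map onto mounts roughly as in the sketch below. Mount names and paths are hypothetical, and the mount entries show only the keys named in this section; a real assembly may need further fields. The tiers rely on the defaults described above.

```yaml
mounts:
  rdb:
    type: stream                    # stream-based; its tier is implicit
  idb:
    type: local
    partition: ordinal
    baseURI: file:///data/db/idb    # hypothetical path
  hdb:
    type: local
    partition: date
    baseURI: file:///data/db/hdb    # hypothetical path
elements:
  sm:
    tiers:
      - name: interval
        mount: idb                  # no schedule: freq defaults to 10 minutes
      - name: recent
        mount: hdb                  # single date tier: snap defaults to midnight
```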

Using Reliable Transport

Register Storage Manager (SM) with a Reliable Transport (RT) compatible message bus to receive the table updates and publish the _prtnEnd and _reload signals.

See the deployment example for the configuration, schema, and code to use a tickerplant.

Object Storage Inventory files

The Storage Manager can write inventory files at end of day, or produce them on startup if none exist. The inventory files will be used to speed up subsequent reload times for the Storage Manager and Data Access processes.

To configure the SM to produce these files, set inventory along with store under the tier configuration. See the tiers section above for layout information.

The DA may be configured to set KX_OBJSTR_INVENTORY_FILE to the inventory path, relative to the root of the bucket.

A full configuration of the DA and the SM would look like:

    sm:
      tiers:
        - name: streaming
          mount: rdb
        - name: interval
          mount: idb
          schedule:
            freq: 01:00:00
            snap: 00:00:00
        - name: recent
          mount: hdb
          schedule:
            freq: 1D00:00:00
            snap:   00:00:00
          retain:
            time: 7 Days
        - name: s3
          mount: hdb
          store: s3://kxi-sm-example/db
          inventory:
            enabled: true
            location: inventory/inventory.tgz
    dap:
      instances:
        da:
          env:
            - name: KX_OBJSTR_INVENTORY_FILE
              value: "inventory/inventory.tgz"