Querying Object Storage

The KX Insights Platform can query historical data that has been migrated to cloud storage.

To query object storage, apply an assembly whose HDB mount has its type set to object.

For an example of persisting new data to object storage, see writing to object storage.

For an example of migrating data to object storage, see the quick start on object storage.

Prerequisites

To query object storage, you need the following:

Existing database that has been uploaded to cloud storage
sym file for the database

You will need to configure environment variables for object storage.

You will also need to provide credentials or use service accounts to access your private buckets.

Authentication

For more information on service accounts, see automatic registration and environment variables.

To configure environment variables in an assembly, list them under the dap.instances.*.env key of the relevant instance.

For example, to enable trace logging for objstor and set AWS_REGION:

    dap:
      instances:
        idb:
          mountName: idb
        hdb:
          env:
            - name: AWS_REGION
              value: us-east-2
            - name: KX_TRACE_S3
              value: "1"
            ...
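
If you supply credentials rather than relying on a service account, they can be passed through the same env list. The following is a minimal sketch, assuming AWS S3 and the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the values are placeholders, and other vendors use different variables.

```yaml
dap:
  instances:
    hdb:
      env:
        - name: AWS_REGION
          value: us-east-2
        # Placeholder credentials (assumption: AWS S3). Substitute your own,
        # or omit these entries when a service account grants bucket access.
        - name: AWS_ACCESS_KEY_ID
          value: "<your-access-key-id>"
        - name: AWS_SECRET_ACCESS_KEY
          value: "<your-secret-access-key>"
```

Keeping secrets out of the assembly file is one reason to prefer service accounts where they are available.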

Deployment

To query data in cloud storage, deploy an assembly that configures a database with an object type mount.

kubectl apply -f mount-storage.yml

When you specify an object mount, the baseURI indicates where the object storage cache and local files can reside.

Under the hdb instance section, you may specify URIs for the database's sym file and par.txt.

key	description	file URI	object URI
sym	Path to a sym file	Y	Y
par	Path to par.txt	Y	Y
storageURI	Path to database in cloud storage	N	Y

Valid combinations are:

sym + par
storageURI + sym

Specify these keys under spec.elements.dap.instances.hdb:

spec:
  elements:
    dap:
      instances:
        hdb:
          storageURI: s3://example/db
          sym: file:///opt/kx/cfg/sym
  mounts:
    hdb:
      type: object
      partition: none
      baseURI: file://data/hdb

Examples

These examples demonstrate how the mounts sections work. They omit the section of the assembly where you set the vendor-specific environment variables.
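
For orientation, the sketch below shows where the omitted env entries would sit relative to the instance keys and the mounts section. It only combines the AWS_REGION entry from the Authentication section with the storageURI and sym layout shown under Deployment. It assumes that env nests under spec.elements.dap.instances.hdb alongside the other instance keys, and the bucket name and region are placeholder values.

```yaml
spec:
  elements:
    dap:
      instances:
        hdb:
          # Vendor-specific environment variables (left out of the examples
          # in this section); assumed to sit beside the other instance keys.
          env:
            - name: AWS_REGION
              value: us-east-2
          # Database location in cloud storage and its sym file
          storageURI: s3://example/db
          sym: s3://example/sym
  mounts:
    hdb:
      type: object
      partition: none
      baseURI: file://data/hdb
```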

Deploying with a storageURI and sym

Declare in your config file whether the sym file is stored in object storage or is available locally to the pod.

Object Storage

spec:
  elements:
    dap:
      instances:
        hdb:
          storageURI: s3://example/db
          sym: s3://example/sym
  mounts:
    hdb:
      type: object
      partition: none
      baseURI: file://data/hdb

Local Storage

spec:
  volumes:
    - name: my-volume
  elements:
    dap:
      instances:
        hdb:
          mountName: hdb
          storageURI: s3://example/db
          sym: file:///opt/kx/objectcfg/sym
          volumeMounts:
            - name: my-volume
              mountPath: /opt/kx/objectcfg
  mounts:
    hdb:
      type: object
      partition: none
      baseURI: file://data/hdb

Deploying with a sym and par.txt

Declare in your config file whether the sym file and par.txt are stored in object storage or are available locally to the pod.

Object Storage

par.txt should be a newline-separated list of cloud storage databases.
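
As an illustration, a par.txt covering two databases might look like the following. The bucket and database names are hypothetical, and the s3:// scheme assumes AWS; use the scheme that matches your vendor.

```text
s3://example/db1
s3://example/db2
```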

spec:
  elements:
    dap:
      instances:
        hdb:
          sym: s3://example/sym
          par: s3://example/par.txt
  mounts:
    hdb:
      type: object
      partition: none
      baseURI: file://data/hdb

Local Storage

spec:
  volumes:
    - name: my-volume
  elements:
    dap:
      instances:
        hdb:
          mountName: hdb
          sym: file:///opt/kx/objectcfg/sym
          par: file:///opt/kx/objectcfg/par.txt
          volumeMounts:
            - name: my-volume
              mountPath: /opt/kx/objectcfg
  mounts:
    hdb:
      type: object
      partition: none
      baseURI: file://data/hdb

Querying the data

Once your assembly has been deployed, object storage data can be queried just like any other historical data within the KX Insights Platform. Your data will be part of a segmented database, and queries that reach the HDB will span both object storage and local disk.

For a list of features, see Querying with API.