Examples

This page provides example code snippets to help you use kdb+ and cloud storage.

As a prerequisite, you must have the relevant cloud vendor's CLI installed to run some of the commands.

Creating data on Cloud Storage

In order for data to be migrated to the cloud, it must first be staged locally on a POSIX filesystem. This is because KX Insights Core does not support writing to cloud storage using the traditional set and other write functions.

First, create a sample database in kdb+ that can then be migrated to a cloud storage account:

q)/ 20 consecutive dates starting 2021.09.01
q)d:2021.09.01+til 20
q)/ for each date, write a splayed trade partition of 10,000 rows, enumerating symbols against test/sym
q){[d;n]sv[`;.Q.par[`:test/db/;d;`trade],`]set .Q.en[`:test/;([]sym:`$'n?.Q.A;time:.z.P+til n;price:n?100f;size:n?50)];}[;10000]each d

This creates the following structure:

test/.
├── db
│   ├── 2021.09.01
│   ├── 2021.09.02
│   ├── 2021.09.03
│   ├── 2021.09.04
│   ├── 2021.09.05
│   ├── 2021.09.06
│   ├── 2021.09.07
│   ├── 2021.09.08
│   ├── 2021.09.09
│   ├── 2021.09.10
│   ├── 2021.09.11
│   ├── 2021.09.12
│   ├── 2021.09.13
│   ├── 2021.09.14
│   ├── 2021.09.15
│   ├── 2021.09.16
│   ├── 2021.09.17
│   ├── 2021.09.18
│   ├── 2021.09.19
│   └── 2021.09.20
└── sym

The commands below can be used to create a storage account or bucket and copy the database to it.

AWS

Refer to the AWS CLI documentation for details. For example:

## create bucket
aws s3 mb s3://mybucket --region us-west-1

## copy database to bucket
aws s3 cp test/ s3://mybucket/ --recursive
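
To confirm the copy succeeded, the bucket contents can be listed; a quick check, assuming the same bucket name as above:

## list the uploaded database; expect the sym file and the db/ date partitions
aws s3 ls s3://mybucket/ --recursive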

Azure

Refer to the Azure CLI documentation for details. For example:

## create resource group and storage account
az group create --name <resource-group> --location <location>
az storage account create \
    --name <storage-account> \
    --resource-group <resource-group> \
    --location <location> \
    --sku Standard_ZRS \
    --encryption-services blob

## grant the signed-in user access to blob data in the storage account
az ad signed-in-user show --query objectId -o tsv | az role assignment create \
    --role "Storage Blob Data Contributor" \
    --assignee @- \
    --scope "/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"

## create container
az storage container create \
    --account-name <storage-account> \
    --name <container> \
    --auth-mode login

## copy database to container
az storage blob upload-batch \
    --account-name <storage-account> \
    --destination <container> \
    --source test \
    --auth-mode login

Google Cloud

Refer to the Google Cloud Storage (gsutil) documentation for details. For example:

## create bucket
gsutil mb -p PROJECT_ID -c STORAGE_CLASS -l BUCKET_LOCATION -b on gs://BUCKET_NAME

## copy database to bucket
gsutil cp -r test/* gs://BUCKET_NAME/

Deleting data from Cloud Storage

Deleting data on cloud storage should be a rare occurrence, but when such a change is needed, follow the steps below:

  • Take offline any HDB reader processes that are currently using the storage account

  • Remove any caches created by the kxreaper application

  • Delete the data from the storage account using the cloud vendor CLI (see the example after this list)

  • Recreate the inventory file (if used)

  • Bring the reader processes back online, making sure they are reloaded to pick up the new inventory file and that any metadata caches are dropped using the drop command
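
As a sketch of the deletion step, assuming the bucket, storage account, and container names used in the examples above, the data could be removed with the vendor CLI:

## AWS: remove the database prefix from the bucket
aws s3 rm s3://mybucket/db/ --recursive

## Azure: delete the matching blobs from the container
az storage blob delete-batch \
    --account-name <storage-account> \
    --source <container> \
    --pattern "db/*" \
    --auth-mode login

## Google Cloud: remove the database prefix from the bucket
gsutil rm -r gs://BUCKET_NAME/db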

Changing data on Cloud Storage

Altering data, e.g. changing types or adding columns, requires the same steps as deleting data. Once the reader processes have been taken offline, the changes can be made safely. Bear in mind that to change data it must first be copied from the storage account to a local filesystem, amended there, and then copied back to the appropriate path using a cloud CLI copy command.
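
A minimal sketch of that workflow, assuming the AWS bucket from the earlier example and a hypothetical local staging directory:

## copy the partition down from the bucket to a local staging area
aws s3 cp s3://mybucket/db/2021.09.01/ staging/db/2021.09.01/ --recursive

## amend the partition locally (for example, in a q session writing with set),
## then copy the amended partition back to the same path
aws s3 cp staging/db/2021.09.01/ s3://mybucket/db/2021.09.01/ --recursive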

Creating inventory JSON

Refer to the inventory file documentation for instructions on how to create and use the inventory file.

Combining Cloud and Local Storage in a single HDB

The addition of the object store library allows clients to extend their tiering strategies to cloud storage. In some instances, it is necessary to query data that has some partitions on a local POSIX filesystem and other partitions on cloud storage. To give a kdb+ process access to both datasets, the par.txt can be set as follows:

AWS:

s3://mybucket/db
/path/to/local/partitions

Note: if multiple storage accounts are added they must be in the same AWS region.

Azure:

ms://mybucket/db
/path/to/local/partitions

Google Cloud:

gs://mybucket/db
/path/to/local/partitions

Note that multiple local filesystems and storage accounts can be added to par.txt.
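
For illustration, assuming the par.txt above is saved in an otherwise empty HDB root directory named hdb, and using the trade table from the sample database created earlier, a kdb+ process sees both tiers as a single database:

q)\l hdb                               / hdb contains only par.txt
q)select count i by date from trade    / partitions on cloud and local storage are queried together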