Docker metrics deployment
This page provides an example of a Docker deployment that uses the kxi-sidecar
to capture Prometheus metrics. It builds on the basic Docker example.
Prerequisites
Pulling the images requires a login with:
docker login registry.dl.kx.com -u <username> -p <password>
Directory structure
Before running this example, the following directory structure should be created:
db/                      # Empty directory where the database will be stored on disk
cfg/                     # Directory for configuration files
    assembly.yaml
    rt_tick_client_lib.q
    da/
        config.json
        rdb-config.json
        idb-config.json
        hdb-config.json
    sm/
        config.json
kdb-tick/                # Clone of kdb-tick for a tickerplant service
    logs/
    tick/{r.q, u.q, sym.q}
    tick.q
.env
docker-compose-da.yaml
docker-compose-sg.yaml
docker-compose-sm.yaml
docker-compose-tp.yaml
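The directory skeleton above can be prepared with a few shell commands. This is a minimal sketch, not part of the original example: it assumes kdb-tick has already been cloned into the working directory, and it uses a world-writable mode as the simplest way to let containers running as "nobody" write to db/ and kdb-tick/logs. A stricter setup would grant access only to the relevant user ID.

```shell
# Configuration and database directories from the layout above.
mkdir -p db cfg/da cfg/sm

# Tickerplant log directory; run this after cloning kdb-tick.
mkdir -p kdb-tick/logs

# Assumption: world-writable is acceptable on this host, so that
# containers running as "nobody" can write to these directories.
chmod 777 db kdb-tick/logs
```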
Write permissions
The db/ and kdb-tick/logs directories on the host must be writable by the SM and TP containers, which run as the user "nobody".
- The sources for agg.q and da.q for this example are provided here. kdb-tick/ is cloned from the KX GitHub.
- The sym.q schema must include the _prtnEnd and _reload tables, as in:
// internal tables
// with `time` and `sym` columns added by RT client for compatibility
(`$"_prtnEnd")set ([] time:"n"$(); sym:`$(); signal:`$(); endTS:"p"$(); opts:());
(`$"_reload")set ([] time:"n"$(); sym:`$(); mount:`$(); params:(); asm:`$())
(`$"_heartbeats")set ([] time:"n"$(); sym:`$(); foo:"j"$())
trade:([] time:"n"$(); sym:`$(); realTime:"p"$(); price:"f"$(); size:"j"$())
quote:([] time:"n"$(); sym:`$(); realTime:"p"$();
  bid:"f"$(); ask:"f"$(); bidSize:"j"$(); askSize:"j"$())
Run
To run, execute the following:
docker-compose \
-f docker-compose-tp.yaml \
-f docker-compose-da.yaml \
-f docker-compose-sm.yaml \
-f docker-compose-sg.yaml \
up
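The Service Gateway Compose file shown later on this page declares the kx network as external, which means Compose expects the network to exist before the stack starts. The helper below is a hedged sketch of one way to create it only when it is missing; ensure_network is a hypothetical name, not part of the example.

```shell
# Create a Docker network only if it does not exist yet.
# "docker network inspect" fails when the network is missing,
# in which case "docker network create" is run.
ensure_network() {
  docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}
```

Calling `ensure_network kx` once before the docker-compose command above should be enough; repeated calls leave an existing network untouched.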
Environment
Each of the Docker Compose files below uses a .env file specifying the images and licenses to be used. The image names are built from the REGISTRY, RELEASE, QCE_RELEASE and SIDECAR_RELEASE environment variables; set RELEASE, QCE_RELEASE and SIDECAR_RELEASE to the latest releases of kdb Insights Microservices, kdb Insights and the kxi-sidecar respectively. Additionally, a license must be provided to run this example.
# Images
kxi_sg_gw=$REGISTRY/kxi-sg-gw:$RELEASE
kxi_sg_rc=$REGISTRY/kxi-sg-rc:$RELEASE
kxi_sg_agg=$REGISTRY/kxi-sg-agg:$RELEASE
kxi_sm_single=$REGISTRY/kxi-sm-single:$RELEASE
kxi_da_single=$REGISTRY/kxi-da-single:$RELEASE
kxi_sidecar=$REGISTRY/kxi-sidecar:$SIDECAR_RELEASE
kxi_q=$REGISTRY/qce:$QCE_RELEASE
# Paths
local_dir="."
mnt_dir="/mnt"
shared_dir="/mnt/shared"
cfg_dir="/mnt/cfg"
db_dir="/mnt/data/db"
logs_dir="/mnt/data/logs"
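The .env file above references variables it does not define, and the Compose files pass KDB_LICENSE_B64 through from the environment. The sketch below shows one way to supply them from the shell, assuming your Compose version resolves these references from the shell environment. The release values are placeholders, encode_license is a hypothetical helper, and the license path in the final comment is an assumption.

```shell
# Registry host taken from the docker login step above.
export REGISTRY=registry.dl.kx.com

# Placeholders: replace with the release tags you intend to run.
export RELEASE='<microservices-release>'
export QCE_RELEASE='<kdb-insights-release>'
export SIDECAR_RELEASE='<sidecar-release>'

# Encode a license file as a single base64 line (hypothetical helper).
encode_license() {
  base64 < "$1" | tr -d '\n'
}

# Assumption: the license file lives at the path below; adjust as needed.
# export KDB_LICENSE_B64="$(encode_license "$HOME/kc.lic")"
```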
Assembly file
The Assembly file is the main business configuration for the database. Table schemas and logical process configuration are defined here. See assembly configuration for more information.
name: fin-example
description: Data access assembly configuration
labels:
  region: New York
  assetClass: stocks
tables:
  _heartbeats:
    type: basic
    columns:
      - name: time
        type: timespan
      - name: sym
        type: symbol
      - name: foo
        type: float
  trade:
    description: Trade data
    type: partitioned
    blockSize: 10000
    prtnCol: realTime
    sortColsOrd: sym
    sortColsDisk: sym
    columns:
      - name: time
        description: Time
        type: timespan
      - name: sym
        description: Symbol name
        type: symbol
        attrMem: grouped
        attrDisk: parted
        attrOrd: parted
      - name: realTime
        description: Real timestamp
        type: timestamp
      - name: price
        description: Trade price
        type: float
      - name: size
        description: Trade size
        type: long
  quote:
    description: Quote data
    type: partitioned
    blockSize: 10000
    prtnCol: realTime
    sortColsOrd: sym
    sortColsDisk: sym
    columns:
      - name: time
        description: Time
        type: timespan
      - name: sym
        description: Symbol name
        type: symbol
        attrMem: grouped
        attrDisk: parted
        attrOrd: parted
      - name: realTime
        description: Real timestamp
        type: timestamp
      - name: bid
        description: Bid price
        type: float
      - name: ask
        description: Ask price
        type: float
      - name: bidSize
        description: Bid size
        type: long
      - name: askSize
        description: Ask size
        type: long
bus:
  stream:
    protocol: custom
    nodes: tp:5010
    topic: dataStream
mounts:
  rdb:
    type: stream
    baseURI: file://stream
    partition: none
  idb:
    type: local
    baseURI: file:///mnt/data/db/idb
    partition: ordinal
  hdb:
    type: local
    baseURI: file:///mnt/data/db/hdb
    partition: date
elements:
  dap:
    gwAssembly: gw-assembly
    smEndpoints: sm:20001
    instances:
      dap:
        mountList: [rdb, idb, hdb]
  sm:
    description: Storage manager
    source: stream
    tiers:
      - name: stream
        mount: rdb
      - name: idb
        mount: idb
        schedule:
          freq: 0D00:10:00 # every 10 minutes
      - name: hdb
        mount: hdb
        schedule:
          freq: 1D00:00:00 # every day
    disableDiscovery: true # Disables registering with discovery
Docker compose
Each service of the database is deployed here as a separate Docker Compose file for clarity. Each of these could be combined into a single Docker Compose file instead.
Service Gateway
The Service Gateway Compose file configures the three containers that make up the gateway: the Resource Coordinator (sgrc), the Aggregator (sgagg) and the Gateway (sggw).
networks:
  kx:
    name: kx
    external: true
services:
  sgrc:
    image: ${kxi_sg_rc}
    environment:
      - KXI_NAME=sg_rc
      - KXI_PORT=5050
      - KXI_LOG_FORMAT=text
      - KXI_LOG_LEVELS=default:info
      - KDB_LICENSE_B64
    networks: [kx]
    volumes:
      - ${local_dir}:${mnt_dir}
  sgagg:
    image: ${kxi_sg_agg}
    environment:
      - KXI_NAME=sg_agg
      - KXI_PORT=5060
      - KXI_SG_RC_ADDR=sgrc:5050
      - KXI_LOG_FORMAT=text
      - KXI_LOG_LEVELS=default:info
      - KDB_LICENSE_B64
    deploy: # Optional: deploy multiple replicas.
      mode: replicated
      replicas: 1
    networks: [kx]
    volumes:
      - ${local_dir}:${mnt_dir}
  sggw:
    image: ${kxi_sg_gw}
    environment:
      - GATEWAY_QIPC_PORT=5040
      - GATEWAY_HTTP_PORT=8080
      - KXI_SG_RC_ADDR=sgrc:5050
      - KXI_LOG_FORMAT=text
      - KXI_LOG_LEVELS=default:info
    networks: [kx]
    deploy: # Optional: deploy multiple replicas.
      mode: replicated
      replicas: 1
    volumes:
      - ${local_dir}:${mnt_dir}
Data Access Processes
A set of Data Access Processes is configured, each of which connects to the Resource Coordinator of the Service Gateway launched above.
networks:
  kx:
    name: kx
services:
  dap:
    image: ${kxi_da_single}
    command: -p 5080
    environment:
      - KXI_NAME=dap
      - KXI_SC=dap
      - KXI_PORT=5080
      - KXI_RT_LIB=${shared_dir}/rt_tick_client_lib.q
      - KXI_ASSEMBLY_FILE=${cfg_dir}/assembly.yaml
      - KXI_SG_RC_ADDR=sgrc:5050
      - KXI_CONFIG_FILE=${cfg_dir}/da/config.json
      - KXI_LOG_CONFIG=${cfg_dir}/qlog.json
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    networks: [kx]
  rdb-sidecar:
    image: ${kxi_sidecar}
    environment:
      - KXI_CONFIG_FILE=${cfg_dir}/da/rdb-config.json
      - KXI_LOG_CONFIG=${cfg_dir}/qlog.json
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    command: -p 8080
    networks: [kx]
  idb-sidecar:
    image: ${kxi_sidecar}
    environment:
      - KXI_CONFIG_FILE=${cfg_dir}/da/idb-config.json
      - KXI_LOG_CONFIG=${cfg_dir}/qlog.json
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    command: -p 8080
    networks: [kx]
  hdb-sidecar:
    image: ${kxi_sidecar}
    environment:
      - KXI_CONFIG_FILE=${cfg_dir}/da/hdb-config.json
      - KXI_LOG_CONFIG=${cfg_dir}/qlog.json
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    command: -p 8080
    networks: [kx]
Storage Manager
The Storage Manager is configured as a single container, allowing connections by Data Access Processes configured above.
networks:
  kx:
    name: kx
services:
  sm:
    image: ${kxi_sm_single}
    command: -p 20001
    environment:
      - KXI_NAME=sm
      - KXI_SC=SM
      - KXI_ASSEMBLY_FILE=${cfg_dir}/assembly.yaml
      - KXI_RT_LIB=${shared_dir}/rt_tick_client_lib.q
      - KXI_CONFIG_FILE=${cfg_dir}/sm/config.json
      - KXI_LOG_CONFIG=${cfg_dir}/qlog.json
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    networks: [kx]
  sm-sidecar:
    image: ${kxi_sidecar}
    environment:
      - KXI_CONFIG_FILE=${cfg_dir}/sm/config.json
      - KXI_LOG_CONFIG=${cfg_dir}/qlog.json
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    command: -p 8080
    networks: [kx]
Tickerplant
A standard tickerplant is put in front of the Storage Manager and Data Access Process to provide durable data ingestion.
Note
Within kdb Insights Enterprise, the transport used is kdb Insights Reliable Transport rather than a tickerplant. This allows for fault-tolerance and durable messaging despite potentially unreliable network connections. When using a standard tickerplant instead, the interface must adhere to the API expected by RT.
networks:
  kx:
    name: kx
services:
  tp:
    image: ${kxi_q}
    command: tick.q sym ${logs_dir} -p 5010
    working_dir: ${shared_dir}/kdb-tick
    environment:
      - KDB_LICENSE_B64
    volumes:
      - ${local_dir}:${mnt_dir}
    networks: [kx]
Metrics configuration
Metrics configuration is stored in a config.json file for each of the respective processes. This configuration tells the sidecar where to scrape metrics from and which metrics to include. See the configurable metrics table for a list of available metrics.
Data Access Processes
```{.json title="cfg/da/config.json"}
{
  "connection": ":dap:5080",
  "frequencySecs": 5,
  "metrics": {
    "enabled": "true",
    "frequency": 5,
    "handler": {
      "pc": true, "pg": true, "ph": true, "po": true, "pp": true,
      "ps": true, "ts": true, "wc": true, "wo": true, "ws": true
    }
  }
}
```
```{.json title="cfg/da/rdb-config.json"}
{
  "connection": ":dap:5081",
  "frequencySecs": 5,
  "metrics": {
    "enabled": "true",
    "frequency": 5,
    "handler": {
      "pc": true, "pg": true, "ph": true, "po": true, "pp": true,
      "ps": true, "ts": true, "wc": true, "wo": true, "ws": true
    }
  }
}
```
```{.json title="cfg/da/idb-config.json"}
{
  "connection": ":dap:5082",
  "frequencySecs": 5,
  "metrics": {
    "enabled": "true",
    "frequency": 5,
    "handler": {
      "pc": true, "pg": true, "ph": true, "po": true, "pp": true,
      "ps": true, "ts": true, "wc": true, "wo": true, "ws": true
    }
  }
}
```
```{.json title="cfg/da/hdb-config.json"}
{
  "connection": ":dap:5083",
  "frequencySecs": 5,
  "metrics": {
    "enabled": "true",
    "frequency": 5,
    "handler": {
      "pc": true, "pg": true, "ph": true, "po": true, "pp": true,
      "ps": true, "ts": true, "wc": true, "wo": true, "ws": true
    }
  }
}
```
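The rdb, idb and hdb sidecar configurations above differ only in the port of the connection string (5081, 5082 and 5083). As an illustration, they could be generated from one template instead of being maintained by hand. This is a sketch under that assumption; the loop and variable names are not part of the original example.

```shell
# Write one sidecar config per DAP mount; only the port differs.
mkdir -p cfg/da
port=5081
for name in rdb idb hdb; do
  cat > "cfg/da/${name}-config.json" <<EOF
{
  "connection": ":dap:${port}",
  "frequencySecs": 5,
  "metrics": {
    "enabled": "true",
    "frequency": 5,
    "handler": {
      "pc": true, "pg": true, "ph": true, "po": true, "pp": true,
      "ps": true, "ts": true, "wc": true, "wo": true, "ws": true
    }
  }
}
EOF
  port=$((port + 1))
done
```

Generating the files this way keeps the handler list identical across mounts, so a metric enabled for one mount cannot be forgotten for another.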
Storage Manager
{
  "connection": ":sm:20001",
  "frequencySecs": 5,
  "metrics": {
    "enabled": "true",
    "frequency": 5,
    "handler": {
      "pc": true,
      "pg": true,
      "ph": true,
      "po": true,
      "pp": true,
      "ps": true,
      "ts": true,
      "wc": true,
      "wo": true,
      "ws": true
    }
  }
}
Prometheus configuration
An additional Prometheus instance can be launched using the provided prometheus.yaml to scrape the metrics from the sidecars.
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.
scrape_configs:
  - job_name: 'monitoring'
    static_configs:
      - targets: ['dap-sidecar:8080','rdb-sidecar:8080','idb-sidecar:8080','hdb-sidecar:8080','sm-sidecar:8080']
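The scrape targets above are container hostnames, so Prometheus has to run on the same Docker network to resolve them. The helper below is a hedged sketch of one way to start it. It assumes the public prom/prometheus image, whose default configuration path is /etc/prometheus/prometheus.yml, and the kx network used by the Compose files; run_prometheus is a hypothetical name.

```shell
# Start Prometheus on the kx network with the given config file mounted
# read-only over the image's default configuration path.
run_prometheus() {
  docker run --rm --network kx -p 9090:9090 \
    -v "$1:/etc/prometheus/prometheus.yml:ro" \
    prom/prometheus
}
```

Calling `run_prometheus "$PWD/prometheus.yaml"` should then expose the Prometheus UI on port 9090 of the host. Because the configuration does not override the metrics path, each target is scraped at the Prometheus default, /metrics.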
RT client library
To connect to the tickerplant, both Data Access Processes and the Storage Manager are configured with a custom script defining connection details and providing the necessary .rt.* APIs expected by the services.
// === internal tables without time/sym columns ===
.rt.NO_TIME_SYM:`$("_prtnEnd";"_reload";"_batchIngest")
.rt.IS_DICT:`$enlist"_batchIngest"
// === rt publish and push functions ===
.rt.push:{'"cannot push unless you have called .rt.pub first"}; // will be overridden
.rt.pub:{[topic]
  if[not 10h=type topic;'"topic must be a string"];
  h:neg hopen hsym`$getenv `KXI_RT_NODES;
  .rt.push:{[nph;payload]
    x:$[98h=type x:last payload; value flip x;99h=type x;enlist each value x;x];
    if[(t:first payload)in .rt.NO_TIME_SYM; x:(count[first x]#'(0Nn;`)),x];
    nph(`.u.upd;t;x);}[h;];
  .rt.push }
// === rt update and subscribe ===
if[`upd in key `.; '"do not define upd: rt+tick will implement this"];
if[`end in key `.u; '"do not define .u.end: rt+tick will implement this"];
if[not type key`.rt.upd; .rt.upd:{[payload;idx] '"need to implement .rt.upd"}];
.rt.sub:{[topic;startIdx;uf]
  if[not 10h=type topic;'"topic must be a string"];
  //connect to the tickerplant
  h:hopen hsym`$getenv `KXI_RT_NODES;
  //initialise our message counter
  .rt.idx:0;
  // === tick.q will call back to these ===
  upd::{[uf;t;x]
    if[not type x; x:flip(cols .rt.schema t)!x]; // for log replay
    if[t in .rt.NO_TIME_SYM; x:`time`sym _x];
    if[t in .rt.IS_DICT; x:first x];
    uf[(t;x);.rt.idx];
    .rt.idx+:1; }[uf];
  .u.end:{.rt.idx:.rt.date2startIdx x+1};
  //replay log file and continue the live subscription
  if[null startIdx;startIdx:0W]; // null means follow only, not start from beginning
  //subscribe
  res:h "(.u.sub[`;`]; .u `i`L; .u.d)";
  .rt.schema:(!/)flip res 0; // used to convert arrays to tables during log replay
  //if start index is less than current index, then recover
  if[startIdx<.rt.idx:(.rt.date2startIdx res 2)+res[1;0];
    .rt.recoverMultiDay[res[1];startIdx]]; }
//100 billion records per day
.rt.MAX_LOG_SZ:"j"$1e11;
.rt.date2startIdx:{("J"$(string x) except ".")*.rt.MAX_LOG_SZ};
.rt.recoverMultiDay:{[iL;startIdx]
  //iL - index and Log (as can be fed into -11!)
  i:first iL; L:last iL;
  //get all files in the same folder as the tp log file
  files:key dir:first pf:` vs last L;
  //get the name of the logfile itself
  fileName:last pf;
  //get all the lognameXXXX.XX.XX files (logname is sym by default - so usually the files are of the form sym2021.01.01, sym2021.01.02, sym2021.01.03, etc)
  files:files where files like (-10_ string fileName),"*";
  //from those files, get those with dates in the range we are interested in
  files:` sv/: dir,/:asc files where ("J"$(-10#/:string files) except\: ".")>=startIdx div .rt.MAX_LOG_SZ;
  //set up upd to skip the first part of the file and revert to regular definition when you hit start index
  upd::{[startIdx;updo;t;x] $[.rt.idx>=startIdx; [upd::updo; upd[t;x]]; .rt.idx+:1]}[startIdx;upd];
  //read all of all the log files except the last, where you read up to 'i'
  files:0W,/:files; files[(count files)-1;0]:i;
  //reset .rt.idx for each new day and replay the log file
  {.rt.idx:.rt.date2startIdx "D"$-10#string x 1; -11!x}each files;
  };
//100 billion updates per day - 1e11
//30210610*1e11
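The message index used by the script combines the date and the position within the day's log: .rt.date2startIdx strips the dots from a date and multiplies the resulting number by .rt.MAX_LOG_SZ, and .rt.recoverMultiDay divides an index by the same constant to get back to a date. The arithmetic can be checked outside q with the sketch below; the shell function names are illustrative only and are not part of the library.

```shell
# Records reserved per day, matching .rt.MAX_LOG_SZ (1e11).
MAX_LOG_SZ=100000000000

# Date (YYYY.MM.DD) -> first message index of that day.
date2startIdx() {
  echo $(( $(printf '%s' "$1" | tr -d '.') * MAX_LOG_SZ ))
}

# Message index -> date digits (YYYYMMDD), as used when selecting log files.
idx2date() {
  echo $(( $1 / MAX_LOG_SZ ))
}

date2startIdx 2021.01.01        # prints 2021010100000000000
idx2date 2021010100000000042    # prints 20210101
```

The trailing comment in the script, 30210610*1e11, appears to check the same scheme for a date far in the future: that product is 3021061000000000000, which still fits in a signed 64-bit long.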