Running RT in docker-compose

Introduction

This section and the accompanying docker-compose.yaml and env files provide a guide to deploying and running RT by bringing up:

  • An RT cluster on a single docker host.
  • A sample q publisher.
  • A sample q subscriber.

It also illustrates the required:

  • Volumes
  • Networking
  • Startup arguments
  • Environment variables

For more information on how RT works see here.

A useful tool for inspecting and navigating the docker compose cluster is Lazydocker, which provides a curses-style GUI similar to k9s.
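Once the stack described below is up, Lazydocker can be started from the directory containing the compose file:

# lazydocker attaches to the local docker daemon; no arguments are needed
lazydocker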

Images

To pull down the relevant images, see here.

Provide a license

A license for kdb+ Cloud Edition is required and is provided through the environment variable KDB_LICENSE_B64. It can be generated from a valid kc.lic file with base64 encoding. On a *nix-based system, the environment variable can be created with the following command:

export KDB_LICENSE_B64=$(base64 -w 0 path-to/kc.lic)

The kc.lic used must be for kdb+ Cloud Edition. A regular kc.lic for On-Demand kdb+ will signal a licensing error during startup.
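A quick sanity check is to decode the variable and compare it against the original file (path-to/kc.lic as above):

# the decoded value should be byte-identical to the license file
echo "$KDB_LICENSE_B64" | base64 -d | cmp - path-to/kc.lic && echo "license variable OK"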

Bring up docker-compose

Download docker-compose.yaml and env then bring up the RT docker-compose example:

docker compose --env-file env -f docker-compose.yaml up

Limitations

Because all three RT containers run on the same docker host, this is not an HA solution: if that host goes down then RT also goes down.

Making this HA would require an application specific orchestration mechanism to be employed with support for:

  • Running each sequencer on a different host (anti-affinity).
  • Mounting a volume for each sequencer (stateful set).
  • Migrating a sequencer (and its volumes) to a different host in the event of a hardware failure.

This section does not address or advise on orchestration, but rather provides a template for deploying and running RT using an orchestration layer of the user's choice.

Steps to bring up docker-compose with support for external SDKs

The RT external SDKs (C and Java) were designed to connect to kdb Insights Enterprise via an Information Service, which provides the RT external endpoints and the associated SSL ca/cert/key for a client that has already been enrolled with Keycloak. Deploying these services is outside the scope of this document, but their role can be mocked here and the process demonstrated.

It is necessary to perform some additional steps when bringing up the docker compose to support these external SDKs:

  1. Run the make_certs.sh script, which generates client and server ca/cert/key files in the certs/ subdirectory. The certs/server directory is mounted into the RT nodes in the docker-compose, where the server ca/cert/key is used to start the external replicators:

    sh make_certs.sh
    
  2. Download docker-compose.yaml and env then bring up the RT docker-compose example:

    docker compose --env-file env -f docker-compose.yaml up
    
  3. In another terminal, run the enrol_json.sh script, which uses docker to look up the port mappings on the docker host for the external replicators, and reads the client ca/cert/key from certs/client:

    sh enrol_json.sh
    

    It uses this information to generate a client.json which conforms to the same structure as would be returned by curl-ing the information service:

    cat client.json | jq .
    {
      "name": "client-name",
      "topics": {
        "insert": "data",
        "query": "requests"
      },
      "ca": "<ca>",
      "cert": "<cert>",
      "key": "<key>",
      "insert": {
        "insert": [
          ":127.0.0.1:5000",
          ":127.0.0.2:5000",
          ":127.0.0.3:5000"
        ],
        "query": []
      },
      "query": []
    }
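
    The endpoints the SDK will use can be read straight out of this file; for example:

    # list the external insert endpoints from the generated enrollment file
    jq -r '.insert.insert[]' client.json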
    

The external SDK can now be started by pointing it to this file rather than the information service endpoint.

Java

# directory containing the RT replicator binaries
RT_REP_DIR=<REPLICATOR_LOCATION>
# directory where RT writes its local stream logs
RT_LOG_PATH=<RT_LOG_PATH>
# the enrollment file generated by enrol_json.sh above
KXI_CONFIG_FILE=./client.json
java -jar ./rtdemo-<VERSION>-all.jar --runCsvLoadDemo=<CSV_FILE>

C

DSN="DRIVER=/usr/local/lib/kodbc/libkodbc.so;CONFIG_FILE=./client.json"
Schema="sensorID:int,captureTS:ts,readTS:ts,valFloat:float,qual:byte,alarm:byte"
Table="trace"
./csvupload -c "$DSN" -t "$Table" -s "$Schema" < sample.csv

Networking

The docker-compose uses a bridged network for simplicity. If the networking layer used by a different orchestration layer requires ports to be explicitly exposed, the following ports are required:

Intra-RT ports

These are used by the RT sequencers to communicate with each other:

port  type  notes
4000  TCP   Sequencer
5009  TCP   Xsync push server replicator
7000  UDP   Raft
7100  TCP   Raft
8000  UDP   Sequencer
9000  UDP   Watcher

Internal RT ports

These are used by internal publishers and subscribers which typically run inside the cluster and do not require SSL:

port  type  notes
5001  TCP   Internal pull server replicator
5002  TCP   Internal push server replicator

External RT ports

This is used by external publishers which typically run outside the cluster and therefore require client enrollment and SSL:

port  type  notes
5000  TCP   External push server replicator with SSL

Admin ports

These are used by the user to interact with the RT services:

port  type  notes
6000  HTTP  RT REST service (for hard reset, diagnostics, etc.)
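
The external replicator port, for example, is published to the docker host by the supplied compose file; which host port it maps to can be checked with docker compose (a sketch; service names follow the docker-compose.yaml used here):

# show the host port mapped to the external replicator on one RT node
docker compose -f docker-compose.yaml port rt-data-0 5000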

RT Sequencer

Each RT sequencer must run as a separate service (they are not replicas) with its own persistent volume. If the orchestration supports moving the RT sequencer to a different node in the event of a failure then its volume must move with it.

In order to support RT hard reset, each RT sequencer must also have access to a shared volume to store the session number (the stream position to resume from after the reset). Alternatively, where kubernetes is used for orchestration, RT can manage the session number with a configmap rather than this shared volume.

Startup arguments

The RT sequencer arguments are used to configure its directories, the RAFT cluster size and archival policy:

  • -size Populates the RT_REPLICAS environment variable, which is the ordinality of RT sequencers, i.e. RAFT cluster size. Currently 1 and 3 are supported.
  • -in_dir The directory in the persistent volume where publishers' input logs are stored (one subdirectory per publisher).
  • -out_dir The directory in the persistent volume where the merged output log is stored.
  • -state_dir The directory in the persistent volume where the RAFT logs are stored.
  • -limit See maxLogSize.
  • -time See retentionDuration.
  • -disk See maxDiskUsagePercent.
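
Put together, a sequencer's command line might look like the following hypothetical sketch (the binary name, directories and values here are illustrative assumptions, not the actual entrypoint; only the flag names are taken from the list above):

# hypothetical invocation; flag names as documented above
rt_sequencer -size 3 \
  -in_dir /s/in -out_dir /s/out -state_dir /s/state \
  -limit 10000 -time 1440 -disk 90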

Environment variables

  • RT_TOPIC_PREFIX The prefix used to locate other nodes in the RT cluster via hostnames or DNS, e.g. rt-. Intra-RT connections are made using ${RT_TOPIC_PREFIX}${RT_SINK}-[012]
  • RT_SINK The RT identifier used to locate other nodes in the RT cluster via hostname or DNS, e.g. data. Intra-RT connections are made using ${RT_TOPIC_PREFIX}${RT_SINK}-[012]
  • RT_SEQ_SESSION_PATH Absolute path to a directory in the shared volume for storing the hard reset session number. If not set, RT will instead store the session in a kubernetes config map requiring a service account with sufficient access to the kubernetes control plane to be configured on the pod.
  • RT_LOGLEVEL_CONSOLE Sets the RT logging level. Can be one of ERROR, INFO, DEBUG, TRACE. Default INFO.
  • RAFT_HEARTBEAT, RAFT_LOG_SIZE, RAFT_CHUNK_SIZE. See RAFT configuration.
  • RT_QURAFT_LOG_LEVEL Sets the QuRaft logging level. Can be one of ERROR, INFO, DEBUG, TRACE, OFF. Default OFF.
  • RT_LOG_LEADER Set this env var to a non-empty string to have QuRaft periodically log which node is the leader. Default "", i.e. off.
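
For the three node cluster in this guide, an illustrative environment (values taken from the examples above; the session path is an assumed mount point) would be:

# illustrative values only
export RT_TOPIC_PREFIX=rt-                    # nodes resolve as rt-data-0, rt-data-1, rt-data-2
export RT_SINK=data
export RT_SEQ_SESSION_PATH=/shared/session    # assumption: mount point of the shared volume
export RT_LOGLEVEL_CONSOLE=INFO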

Publisher

Each publisher must have its own persistent volume.

q publishers are supported using the .rt.pub[streamid] API in the rt.qpk; see here.

For an example publisher, see the sample publisher, which sends a table of sensor data every second.

Subscriber

Subscribers can use either persistent volumes or ephemeral storage, but persistent volumes are recommended for performance reasons.

For an example q subscriber, see the sample subscriber.

q subscribers are supported using the .rt.sub[streamid; pos; callbacks] API in the rt.qpk; see here.

How to connect a q publisher/subscriber to RT from outside docker compose

With port forwarding and a local DNS server it is possible to run a q publisher or subscriber (using the rt.qpk) on your local host which connects to the RT running in docker-compose.

For example:

  1. Port forward the internal push_server and pull_server from each RT sequencer to the docker host, using a different local IP address for each of the RT nodes:

    services:
      rt-data-0:
        ports:
        - 127.0.0.1:5001:5001/tcp
        - 127.0.0.1:5002:5002/tcp
      rt-data-1:
        ports:
        - 127.0.0.2:5001:5001/tcp
        - 127.0.0.2:5002:5002/tcp
      rt-data-2:
        ports:
        - 127.0.0.3:5001:5001/tcp
        - 127.0.0.3:5002:5002/tcp
    

    rt.qpk port settings

    Unlike the external SDKs, where the RT endpoints can use any port presented in the JSON, the rt.qpk requires a publisher to connect to port 5002 on each RT node and a subscriber to connect to port 5001.

  2. Configure your local DNS server (/etc/hosts) to map each of the RT nodes to these IP addresses:

    127.0.0.1 rt-data-0
    127.0.0.2 rt-data-1
    127.0.0.3 rt-data-2
    
  3. Run the publisher/subscriber locally as normal with $RT_TOPIC_PREFIX="rt-" and <streamid>="data", as shown below.
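
For example, assuming the sample publisher above has been saved locally as publisher.q (the filename is illustrative), a quick connectivity check followed by the publisher launch looks like this:

# confirm the forwarded ports resolve and accept connections
nc -vz rt-data-0 5002   # publishers connect to 5002
nc -vz rt-data-0 5001   # subscribers connect to 5001

# run the sample q publisher with the expected environment
export RT_TOPIC_PREFIX=rt-
q publisher.q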

Running a one node RT

It is also possible to run RT in a one node configuration. The instructions below illustrate the differences between deploying a one node RT and a three node RT, and should be read in conjunction with the more detailed instructions above for using RT with docker-compose.

  1. Change the size environment variable in the env file to 1 as follows:

    #!/bin/bash

    # RT ordinality (only 1 and 3 are supported ATM):
    # size=1 - use with docker-compose-one-node.yaml
    # size=3 - use with docker-compose.yaml
    export size=1

  2. Download docker-compose-one-node.yaml and bring it up with:

    docker compose --env-file env -f docker-compose-one-node.yaml up

  3. If you wish to use external clients, download and run the enrol_json_one_node.sh script, which generates a client.json containing the single RT endpoint:

    cat client.json | jq .
    {
      "name": "client-name",
      "topics": {
        "insert": "data",
        "query": "requests"
      },
      "ca": "<ca>",
      "cert": "<cert>",
      "key": "<key>",
      "insert": {
        "insert": [
          ":127.0.0.1:5000"
        ],
        "query": []
      },
      "query": []
    }

Connecting to a one node RT from outside the docker-compose requires:

  1. Setting up the port forwards as follows:

    services:
      rt-data-0:
        ports:
        - 127.0.0.1:5001:5001/tcp
        - 127.0.0.1:5002:5002/tcp

  2. Configuring your /etc/hosts with the following:

    127.0.0.1 rt-data-0