Quickstart (Docker)

End-to-end workflow examples using the kdb Insights Machine Learning functionality within a Docker deployment

This quickstart shows how to use this component in real-world applications. It is not exhaustive: see the API documentation for the full functionality provided by this Docker image.

The following Docker-based workflows are supported for the kxi-ml functionality:

- Use the kxi-ml Docker image as a development environment for machine learning.
- Deploy a kxi-ml Docker image as a Worker in a Stream Processor workflow.

Image description

The kxi-ml Docker image ships with the following base components:

- Base Image: rockylinux/rockylinux:8
- Python 3.9.6
- Python modules:
  - numpy
  - scipy
  - scikit-learn
  - statsmodels
  - matplotlib
  - pandas

Development Environment

Pull the kxi-ml image required to run the kdb Insights ML functionality:
- Log in to the Docker registry
```
docker login registry.dl.kx.com -u <insert> -p <insert>
```
- Pull the kxi-ml image
```
docker pull registry.dl.kx.com/kxi-ml
```
Base64-encode your kdb+ license and store it in an environment variable:
```
$ export KDB_LICENSE_B64=$(base64 path-to/kc.lic)
```
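
GNU `base64` wraps its output at 76 columns unless `-w 0` is given, so the variable set above can contain line breaks. If a single-line value is preferred, the encoding step can be sketched in Python instead. The script name `encode_license.py` and the function name `encode_license` are hypothetical; they are not part of the kxi-ml image.

```python
import base64
import sys
from pathlib import Path


def encode_license(path):
    """Return the contents of the license file as one base64 string."""
    raw = Path(path).read_bytes()
    # b64encode does not insert line breaks into its output
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage:
    #   export KDB_LICENSE_B64=$(python encode_license.py path-to/kc.lic)
    print(encode_license(sys.argv[1]))
```

The result decodes to the same bytes as the output of the `base64` command; only the line wrapping differs.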

Start the docker image

```
$ docker run -it -p 5000:5000 \
  -e "KDB_LICENSE_B64=$KDB_LICENSE_B64" \
  registry.dl.kx.com/kxi-ml
```

Extensibility

The startup command above can be modified in several ways for additional flexibility. For example, users who have followed the cloud storage setup instructions can use this image to deploy models to an ML Registry held in cloud storage. For AWS, start the container against an S3 bucket and then save a model to the registry from the q prompt:

```
$ docker run -it -p 5000:5000 \
    -e "KDB_LICENSE_B64=$KDB_LICENSE_B64" \
    -e "AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID" \
    -e "AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY" \
    -e "AWS_REGION=$AWS_REGION" \
    registry.dl.kx.com/kxi-ml \
    -aws s3://my-aws-storage -p 5000
q).ml.registry.set.model[::;::;{x};"mymodel";"q";::]
```
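
The `docker run` command above passes each value with `-e "NAME=$NAME"`, so a variable that is unset in the calling shell reaches the container as an empty string. A small pre-flight check, sketched below, can catch that before the container starts. The helper name `missing_variables` is hypothetical and the check is a convenience, not part of the kxi-ml tooling.

```python
import os

# Variables passed to the container in the AWS example above
REQUIRED = [
    "KDB_LICENSE_B64",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
]


def missing_variables(required=REQUIRED, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Unset or empty:", ", ".join(missing))
    else:
        print("All required variables are set")
```

Run the script in the same shell session that will run `docker run`, so that it sees the same environment.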

Deploying ML models to the Stream Processor

The following instructions make this workflow explicit for the Machine Learning use case. They closely follow the Stream Processor quickstart, with a focus on deploying a specification that requires access to the Machine Learning Operators.

Each of the examples below uses the kxi-ml Docker image available on the registry.dl.kx.com Docker registry. They also require the kxi-sp-controller image. Download both as follows:

Log in to the Docker registry
```
docker login registry.dl.kx.com -u <insert> -p <insert>
```

Pull the kxi-ml image
```
docker pull registry.dl.kx.com/kxi-ml
```

Pull the kxi-sp-controller image

```
docker pull registry.dl.kx.com/kxi-sp-controller
```

Pipelines

q Pipeline:

Define a pipeline specification spec.q which uses some SP ML Operators:

```q
// spec.q

.qsp.run
  .qsp.read.fromCallback[`publish]
  .qsp.transform.replaceInfinity[::]
  .qsp.ml.registry.fit[
    `x`x1`x2;
    `y;
    .ml.online.sgd.linearRegression;
    "q";
    `yhat;
    .qsp.use (!) . flip(
      (`model;"sgd");
      (`modelArgs;(1b;`maxIter`gTol`seed!(100;-0w;42)));
      (`predict;1b)
      )
    ]
  .qsp.write.toConsole[]
```

Python Pipeline:

```py
# spec.py

import kxi.sp as sp
import pykx as kx

sp.run(sp.read.from_callback('publish')
    | sp.transform.replace_infinity()
    | sp.ml.registry.fit(['x', 'x1', 'x2'],
                         'y',
                         kx.ml.online.sgd.linearRegression,
                         'q',
                         'yhat')
    | sp.write.to_console())
```

Docker Compose

Create an appropriate Docker Compose file:

Note

To switch the docker-compose.yaml below between the q and Python specifications, change the KXI_SP_SPEC environment variable in the worker section of the file.

```yaml
# docker-compose.yaml

version: "3.3"
services:
  controller:
    image: registry.dl.kx.com/kxi-sp-controller
    ports:
      - 6000:6000
    environment:
      - KDB_LICENSE_B64                        # Which kdb+ license to use, see note below
    command: ["-p", "6000"]

  worker:
    image: registry.dl.kx.com/kxi-ml
    ports:
      - 5000
    volumes:
      - .:/app                                 # Bind the current directory, containing spec.[q|py], to /app
    environment:
      - KXI_SP_SPEC=/app/spec.q                # Point to the bound spec.[q|py] file
      - KXI_SP_PARENT_HOST=controller:6000     # Point to the parent Controller
      - KDB_LICENSE_B64
      - AWS_ACCESS_KEY_ID                      # Use AWS_ACCESS_KEY_ID defined in process
      - AWS_SECRET_ACCESS_KEY                  # Use AWS_SECRET_ACCESS_KEY defined in process
      - AWS_REGION                             # Use AWS_REGION defined in process
    command: ["-p", "5000", "-aws", "s3://path-to-bucket"]
```

Start the containers:
```
$ docker-compose up -d
```
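
Once the containers are started, a quick reachability check on the Controller's published port (6000 in the Compose file above) can be sketched as follows. The helper names `port_open` and `wait_for_port` are hypothetical. A successful connection only shows that something is accepting connections on the published port; it does not confirm that the pipeline itself is healthy, so the container logs remain the authoritative source.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=10, delay=1.0):
    """Poll host:port until it accepts a connection or the attempts run out."""
    for attempt in range(attempts):
        if port_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause before the next attempt
    return False


if __name__ == "__main__":
    # 6000 is the Controller port published in docker-compose.yaml
    print("controller reachable:", wait_for_port("localhost", 6000, attempts=3))
```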