S3 Historical Data
This example deploys the service gateway, with Historical Databases (HDBs) that mount S3 buckets.
You will need:
- An existing cloud storage bucket
- An existing Kubernetes cluster or permission to create one
- A local copy of the sym file, and a par.txt for the cloud storage bucket
- Kubernetes secrets for a Kx license and image pull secrets
- A Kubernetes service account with S3 read access, named kx-s3-read-access
Note
Image pull secrets and Kx license info secrets are named kx-repo-access and kx-license-info respectively. You may need to rename these references.
For reference on starting a new cluster, see:
Introducing fine-grained IAM roles for service accounts
Uploading your database to S3
Skip this section if you already have uploaded your database to S3.
To upload a database to S3, use aws s3 cp. Note that --recursive expects a directory source:
aws s3 cp "/path/to/db" s3://kxinsights-marketplace-data/ --recursive
For reference, if you want to try out this example and have no database in mind, a very simple database of n rows per date can be created with:
// generate data for today + last few days
n:1000000;
d:asc .z.d - til 3;
{[d;n]sv[`;.Q.par[`:data/;d;`trade],`]set .Q.en[`:data/;([]sym:`$'n?.Q.A;time:("p"$d)+til n;price:n?100f;size:n?50f)];}[;n] each d;
Note the sym file at the top of the database directory. You will need this file later. Do not confuse it with the sym column inside the trade table.
data
├── 2022.05.02
│ └── trade
│ ├── price
│ ├── size
│ ├── sym
│ └── time
├── 2022.05.03
│ └── trade
│ ├── price
│ ├── size
│ ├── sym
│ └── time
├── 2022.05.04
│ └── trade
│ ├── price
│ ├── size
│ ├── sym
│ └── time
└── sym
Creating a cluster
An example Kubernetes configuration file is available here.
This example describes the canonical trade and quote schemas used in many KX proofs of concept. If you are not using trade or quote data, you will need to modify the assembly section within s3Deployment.yml to correct the schema.
If you do not already have one, create a service account named kx-s3-read-access to allow read access to Amazon S3. This can be done using eksctl.
eksctl create iamserviceaccount --name kx-s3-read-access \
    --namespace <your namespace> \
    --region <your region> \
    --cluster <your cluster> \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
    --approve
If not using the default namespace, find and replace the string default within s3Deployment.yml.
# Examples of find and replace using sed
sed -i 's|default|yourNamespace|g' s3Deployment.yml     # Linux (GNU sed)
sed -i "" 's|default|yourNamespace|g' s3Deployment.yml  # macOS (BSD sed)
Set up license and repository secrets
If this is your first time using a Kx Insights example, you will need to create license and repository secrets:
kubectl create secret docker-registry kx-repo-access \
--docker-username=${NEXUS_USER} \
--docker-password=${NEXUS_PASSWORD} \
--docker-server=registry.dl.kx.com
kubectl create secret generic kx-license-info \
--from-literal=license=$(base64 -w 0 < $QLIC/kc.lic)
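The license value is the base64-encoded contents of kc.lic with line wrapping disabled (-w 0 is a GNU base64 flag; macOS base64 does not wrap by default and lacks it). A quick local sanity check of the encoding step, using a stand-in file rather than a real license:

```shell
# Encode a stand-in license file the same way the secret command does
printf 'licensebytes' > /tmp/kc.lic
base64 -w 0 < /tmp/kc.lic   # → bGljZW5zZWJ5dGVz
```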
If you already have existing secrets, take care to update the names within the reference file to use your names.
Reference these license secrets with container env variables:
env:
- name: KDB_LICENSE_B64
valueFrom:
secretKeyRef:
name: kx-license-info
key: license
Upload par.txt and sym as configmap
Create a ConfigMap with par.txt and your bucket's sym file.
If you do not already have a par.txt, you can create one. It should be a single-line text file containing the S3 bucket location of your database.
The sym file is located at the top of the partitioned database and was generated when you first created it.
An example par.txt looks like:
s3://kxinsights-marketplace-data/zd1726/db
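If you prefer to create the file from the shell, the example above can be written with a single printf; the bucket path shown is the example one and should be replaced with your own:

```shell
# par.txt must be exactly one line: the S3 location of the database
printf 's3://kxinsights-marketplace-data/zd1726/db\n' > par.txt
cat par.txt
```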
Upload both the par.txt and the sym file as a ConfigMap using kubectl:
kubectl create configmap kxinsights-s3-configmap \
    --from-file=sym=/path/to/sym \
    --from-file=par.txt=/path/to/par.txt
Set volumeMounts and mounts
Within the reference file, modify the volumeMounts and mounts sections to reference the sym and par.txt from the ConfigMap above.
Ensure that the paths within volumeMounts reference the kxinsights-s3-configmap you created above.
Ensure that the mountPath values match those in elements.dap.instances.HDB.sym and elements.dap.instances.HDB.par.
volumeMounts:
- name: s3config
mountPath: /opt/kx/data/hdb/par.txt
subPath: par.txt
- name: s3config
mountPath: /opt/kx/data/hdb/sym
subPath: sym
...
elements:
dap:
instances:
HDB:
mountName: hdb
sym: /opt/kx/data/hdb/sym
par: /opt/kx/data/hdb/par.txt
Ensure that the mount is of type object, and that the baseURI is the mount location that sym and par.txt will be placed into.
mounts:
hdb:
type: object
baseURI: file:///opt/kx/data/hdb
partition: none
If the baseURI is set to a folder that does not contain sym and par.txt, or you place sym and par.txt in different folders, they will be copied into the baseURI location. This requires that the pod has write access to that directory.
Set environment variables
Set AWS_REGION as an environment variable:
env:
- name: KX_TRACE_OBJSTR
value: "1"
- name: AWS_REGION
value: "us-east-2"
Make sure that KXI_SG_RC_ADDR uses the same namespace you are using. If your namespace is hello, the value should be:
- name: KXI_SG_RC_ADDR
value: kxinsights-resource-coordinator.hello.svc:5060
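The sed find-and-replace from earlier handles this, but you can check the substitution on the relevant snippet in isolation; /tmp/snippet.yml and the hello namespace are only illustrations:

```shell
# Write the default-namespace value, substitute the namespace, and confirm
cat > /tmp/snippet.yml <<'EOF'
- name: KXI_SG_RC_ADDR
  value: kxinsights-resource-coordinator.default.svc:5060
EOF
sed -i 's|default|hello|g' /tmp/snippet.yml
cat /tmp/snippet.yml
```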
You may also want to create secrets for AWS credentials if your bucket exists outside your cluster's reach.
kubectl create secret generic aws-access-secret \
    --from-literal=AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
    --from-literal=AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
You can reference those secrets as environment variables with:
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: aws-access-secret
key: AWS_ACCESS_KEY_ID
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: aws-access-secret
key: AWS_SECRET_ACCESS_KEY
Tip
You should not need to set credentials if kx-s3-read-access was used earlier.
Configure the schema
You will need to modify the schema to match the shape of your database.
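As a sketch only: for the trade table generated earlier, a schema entry in the assembly might look like the following. The field names (type, prtnCol, attrMem) are assumptions based on typical KX Insights assembly schemas; check them against the schema section already present in s3Deployment.yml for your release:

```yaml
tables:
  trade:
    description: Example trade data   # matches the generated database above
    type: partitioned
    prtnCol: time                     # assumed partitioning column
    columns:
      - {name: time,  type: timestamp}
      - {name: sym,   type: symbol, attrMem: grouped}   # attrMem is an assumption
      - {name: price, type: float}
      - {name: size,  type: float}
```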
Deploy the Assembly
Install resources and run the deployment with:
kubectl apply -f s3Deployment.yml
kubectl get pods can be used to view all of the running pods.
# kubectl get pods
NAME READY STATUS RESTARTS AGE
kxinsights-aggregator-786d9bc674-fz94d 1/1 Running 0 11s
kxinsights-aggregator-786d9bc674-p4k6f 1/1 Running 0 11s
kxinsights-aggregator-786d9bc674-rxg99 1/1 Running 0 11s
kxinsights-hdb-da-0 1/1 Running 1 10s
kxinsights-hdb-da-1 1/1 Running 1 6s
kxinsights-hdb-da-2 1/1 Running 0 4s
kxinsights-resource-coordinator-5664d4f898-m6plb 1/1 Running 0 11s
kxinsights-sg-gateway-54596c8fc7-dwcnv 1/1 Running 0 11s
kxinsights-sg-gateway-54596c8fc7-jm8fc 1/1 Running 0 11s
kxinsights-sg-gateway-54596c8fc7-kqg4f 1/1 Running 0 11s
kubectl get services can be used to print the IP address of the gateway LoadBalancer, shown below with the example IP of 192.0.2.127.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kxinsights-aggregator ClusterIP 10.0.0.111 <none> 5070/TCP 2m59s
kxinsights-hdb-da ClusterIP None <none> 5080/TCP 2m59s
kxinsights-resource-coordinator ClusterIP 10.0.0.112 <none> 5060/TCP 2m59s
kxinsights-sg-gateway LoadBalancer 10.0.0.113 192.0.2.127 8080:31881/TCP,5050:31943/TCP 2m59s
Query
Using the example IP of 192.0.2.127, we can submit queries over q-IPC or HTTP.
h:hopen `:192.0.2.127:5050
x:(`.kxi.getData;
`table`region`startTS`endTS`filter!(`trade;`Canada;-0wp;0wp;"sym=`ODLI, size within 50 100");
`callback;
(0#`)!());
// Sync, callback not used
show h x;
// async
callback:{[x] show (`callback; x)};
neg[h] x
Note
By default, a classic Amazon Elastic Load Balancer has an idle timeout of 60 seconds. You may wish to increase this beyond 60 seconds if you will not be querying frequently.
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-idle-timeout.html
Using HTTP, with curl:
curl -X POST \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --data '{ "table": "trade", "startTS":"2021.08.31D00:00:00.000000000", "endTS":"2021.09.01D00:00:00.000000000", "region": "Canada"}' \
    "http://192.0.2.127:8080/kxi/getData"