S3 Historical Data
This example deploys the service gateway with Historical Databases (HDBs) that mount S3 buckets.
You will need:
- An existing cloud storage bucket, with a date partitioned database
- A Kubernetes Cluster with read access to the above bucket
- A local copy of the sym file and par.txt for the cloud storage bucket
- Kubernetes secrets for a Kx license and image pull secrets
- A Kubernetes service account with S3 read access, named kx-s3-read-access.
Note
Image pull secrets and Kx license info variables are named kx-repo-access and kx-license-info respectively. You may need to rename these references.
For reference on starting a new cluster, see:
Introducing fine-grained IAM roles for service accounts
Deployment
An example Kubernetes configuration file is available here.
If you do not already have one, create a Service Account named kx-s3-read-access to allow read access to Amazon S3. This can be done using eksctl.
eksctl create iamserviceaccount --name kx-s3-read-access \
  --namespace <your namespace> \
  --region <your region> \
  --cluster <your cluster> \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve
If you are not using the default namespace, find and replace the string default within s3Deployment.yml.
# Examples of Find and Replace using sed
sed -i 's|default|yourNamespace|g' s3Deployment.yml # Linux
sed -i "" 's|default|yourNamespace|g' s3Deployment.yml # Mac OSX
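Because `sed -i` takes different arguments on GNU and BSD sed, a single portable form can be handy in scripts. The following is a minimal sketch, not part of the example deployment: `replace_namespace` is a hypothetical helper that writes the edited copy to a temporary file and then copies it back. Note that a blanket replace of `default` also changes any other occurrence of that string in the file, so previewing the matches first is worthwhile.

```shell
# Hypothetical helper: replace the string "default" in a manifest without
# relying on the platform-specific `sed -i` syntax.
replace_namespace() {
    file=$1
    namespace=$2
    tmp=$(mktemp) || return 1
    # Edit into a temp file, then copy back so the original file's
    # permissions are preserved.
    sed "s|default|${namespace}|g" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Preview the lines that would change:
#   grep -n 'default' s3Deployment.yml
# Then apply the replacement:
#   replace_namespace s3Deployment.yml yourNamespace
```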
Create a ConfigMap with par.txt and your bucket's sym file.
kubectl create configmap kxinsights-s3-configmap \
  --from-file=sym=/path/to/sym \
  --from-file=par.txt=/path/to/par.txt
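Before creating the ConfigMap, it can help to confirm that both input files are in place. The sketch below assumes that par.txt lists the database location in the bucket, one path per line. The bucket path `s3://my-bucket/db` and the helper `check_configmap_inputs` are hypothetical and only illustrate the idea.

```shell
# Hypothetical bucket location; replace with the path to your own database.
BUCKET_PATH="s3://my-bucket/db"

# Assumption: par.txt lists the database location(s), one per line.
workdir=$(mktemp -d)
printf '%s\n' "$BUCKET_PATH" > "$workdir/par.txt"

# Hypothetical helper: fail early if any ConfigMap input is missing or empty.
check_configmap_inputs() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
}

check_configmap_inputs "$workdir/par.txt" && echo "par.txt looks usable"
```

In practice you would pass both paths, for example `check_configmap_inputs /path/to/sym /path/to/par.txt`, before running the `kubectl create configmap` command.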
The assembly configuration in the example is set up to describe the canonical trade and quote schemas used in many Kx proofs of concept. If you are not using trade or quote data, you will need to modify the assembly section in s3Deployment.yml.
Install resources and run the deployment with:
kubectl apply -f s3Deployment.yml
kubectl get pods can be used to view all of the running pods.
# kubectl get pods
NAME READY STATUS RESTARTS AGE
kxinsights-aggregator-786d9bc674-fz94d 1/1 Running 0 11s
kxinsights-aggregator-786d9bc674-p4k6f 1/1 Running 0 11s
kxinsights-aggregator-786d9bc674-rxg99 1/1 Running 0 11s
kxinsights-hdb-da-0 1/1 Running 1 10s
kxinsights-hdb-da-1 1/1 Running 1 6s
kxinsights-hdb-da-2 1/1 Running 0 4s
kxinsights-resource-coordinator-5664d4f898-m6plb 1/1 Running 0 11s
kxinsights-sg-gateway-54596c8fc7-dwcnv 1/1 Running 0 11s
kxinsights-sg-gateway-54596c8fc7-jm8fc 1/1 Running 0 11s
kxinsights-sg-gateway-54596c8fc7-kqg4f 1/1 Running 0 11s
kubectl get services can be used to print the IP address of the Gateway LoadBalancer, shown below with the example IP of 192.0.2.127.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kxinsights-aggregator ClusterIP 10.0.0.111 <none> 5070/TCP 2m59s
kxinsights-hdb-da ClusterIP None <none> 5080/TCP 2m59s
kxinsights-resource-coordinator ClusterIP 10.0.0.112 <none> 5060/TCP 2m59s
kxinsights-sg-gateway LoadBalancer 10.0.0.113 192.0.2.127 8080:31881/TCP,5050:31943/TCP 2m59s
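When scripting against the deployment, the gateway address can be pulled out of that listing rather than copied by hand. This is a minimal sketch: `gateway_ip` is a hypothetical helper that reads `kubectl get services` output on standard input and prints the EXTERNAL-IP column for the kxinsights-sg-gateway service. Depending on the load balancer, that column may hold a DNS name rather than an IP address.

```shell
# Hypothetical helper: print the EXTERNAL-IP (4th column) of the gateway
# service from `kubectl get services` output supplied on stdin.
gateway_ip() {
    awk '$1 == "kxinsights-sg-gateway" { print $4 }'
}

# Usage against a live cluster:
#   GATEWAY_IP=$(kubectl get services | gateway_ip)
```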
Query
Using the example IP of 192.0.2.127, we can submit queries over q IPC or HTTP.
h:hopen `:192.0.2.127:5050
x:(`.kxi.getData;
`table`region`startTS`endTS`filter!(`trade;`Canada;-0wp;0wp;"sym=`ODLI, size within 50 100");
`callback;
(0#`)!());
// Sync, callback not used
show h x;
// async
callback:{[x] show (`callback; x)};
neg[h] x
Note
By default, a classic Amazon Elastic Load Balancer has an idle timeout of 60 seconds.
If you expect gaps of more than 60 seconds between queries, you may wish to increase this timeout.
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-idle-timeout.html
Using HTTP, with curl:
curl -X POST \
  --header "Content-Type: application/json" \
  --header "Accept: application/json" \
  --data '{ "table": "trade", "startTS":"2021.08.31D00:00:00.000000000", "endTS":"2021.09.01D00:00:00.000000000", "region": "Canada"}' \
  "http://192.0.2.127:8080/kxi/getData"
Note
When using HTTP, timestamps must include all digits. Nulls and infinities are not yet supported.
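To avoid hand-typing all nine fractional digits, the timestamp string can be generated. The following sketch uses a hypothetical helper, `kx_timestamp`, that takes an ISO date and an optional time and pads the fractional seconds to nanosecond precision, matching the format used in the curl example above.

```shell
# Hypothetical helper: build a full-precision timestamp string of the form
# YYYY.MM.DDDhh:mm:ss.nnnnnnnnn from an ISO date and an optional time.
kx_timestamp() {
    date_part=$(printf '%s' "$1" | sed 's/-/./g')   # 2021-08-31 -> 2021.08.31
    time_part=${2:-00:00:00}
    case $time_part in
        *.*) frac=${time_part#*.}; time_part=${time_part%%.*} ;;
        *)   frac= ;;
    esac
    # Right-pad the fractional seconds with zeros to nine digits.
    while [ "${#frac}" -lt 9 ]; do
        frac="${frac}0"
    done
    printf '%sD%s.%s\n' "$date_part" "$time_part" "$frac"
}

kx_timestamp 2021-08-31              # 2021.08.31D00:00:00.000000000
kx_timestamp 2021-08-31 09:30:00.5   # 2021.08.31D09:30:00.500000000
```

The output can then be substituted into the startTS and endTS fields of the JSON request body.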