Release Notes - kdb Insights Enterprise 1.1.0

Release for kdb Insights Enterprise.

Release Date

2022-06-07

Features

[NEW] Storage and Query

Late and out-of-order data handling
Data received out of order is recorded with an appropriate ingest timestamp
Late data is recorded, irrespective of where its correct destination is in the lifecycle (RDB, IDB, HDB)
Late data is included by default for all requests, q (getData) and ANSI SQL
Late data can be filtered out (for example, to see only data present in the system at a particular point in time)
Performance is not significantly impacted by late data arrival, including for object storage tiers
Reference data can be queried in the Explore Window, in a data pipeline and as part of the getData() API
Support for non time-series reference data in Database
Data in reference tables can be joined to time-series tables in free-form q queries
Improved SQL support for queries (joins and complex queries, except for reference tables)
Compression support, including object storage
Improved resiliency and performance

[NEW] User Interface

Scratchpads
Data can be pulled into the scratchpad using either SQL or getData()
Freeform q, SQL or Python code can be executed on that data to further filter, aggregate and join data to explore specific records
Python/q functions or a model can be applied to that data to gain insights
A model can be fitted using that data
Machine Learning plugins can be added to a Pipeline created from the user interface
More intuitive creation of data ingest pipelines
Observability embedded into the UI (events and logs tab)
Log forwarding to each provider's log aggregator is now enabled by default in the Terraform scripts

[NEW] Reports

Visualization is available for ingested data using the getData() API

[NEW] Observability

View Database and Pipeline logs and events from the UI
Ability to Export Diagnostics, including the logs, events and state for further diagnosis
Further integration with cloud monitoring tools
Additional performance metrics collected

[NEW] Reliable Transport Stream

Subscribers have the option to override the detail truncation options for the stream logs that they have processed
Improved recovery and performance

Artifacts

type	location
Infrastructure	kxi-terraform-1.1.0.tgz
Platform	insights-1.1.0.tgz
Operator	kxi-operator-1.1.0.tgz
CLI	kxicli-0.9.0-py3-none-any.whl
ODBC Driver	kodbc 1.1.0
Java SDK	java-sdk 1.1.0

Upgrade notes

Data loss of UI assembly definitions

Assembly definitions built from the Insights UI will be lost as part of an uninstall of the application. The assembly components (assemblies, databases, pipelines, reports, schemas, streams) are stored on a PVC in the application. This storage is incorrectly torn down as part of an uninstall, resulting in data loss on re-install or upgrade.

To prevent this data loss, run the following command to back up the data. Run it before the uninstall, with kubectl configured for the target cluster and namespace.

kubectl exec insights-kxi-controller-0 -- tar -C /kxic/data -czf - . > kxi-controller-data.tgz
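
Because the uninstall removes the underlying storage, it may be worth checking that the archive is readable before proceeding. The following is a minimal Python sketch of such a check; the helper name `backup_looks_usable` is hypothetical and not part of any kdb Insights tooling, and the check only confirms that the archive opens and contains at least one file, not that its contents are complete.

```python
import tarfile


def backup_looks_usable(archive_path):
    """Return True if the gzip-compressed tar archive opens and holds at least one file."""
    try:
        with tarfile.open(archive_path, "r:gz") as archive:
            return any(member.isfile() for member in archive.getmembers())
    except (tarfile.TarError, OSError, EOFError):
        # Missing file, not a gzip archive, or a truncated/corrupt archive.
        return False
```

For example, `backup_looks_usable("kxi-controller-data.tgz")` could be run on the machine where the backup command wrote its output.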

To restore, the user should run the following command post-install.

cat kxi-controller-data.tgz | kubectl exec insights-kxi-controller-0 --stdin -- tar -C /kxic/data -xzf -

Known Issues

Service Gateway, User Interface, Licensing, SDKs

If you request too much data in a single getData() call, the request will fail. To work around this, request less data per getData() call. The error message returned will look like the example below:

   {"header":{"http":"json","corr":"6dd8f0c5-1895-49c9-a87a-636a834af370","logCorr":"6dd8f0c5-1895-49c9-a87a-636a834af370","client":":10.0.10.76:5050","api":".kxi.getData","protocol":"gw","numRP":1,"ogRcID":"10.0.10.100:5060","to":"2022-04-07T11:41:35.717000000","retryCount":0,"rc":42,"ac":10,"ai":"Agg died"},"payload":[]}
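
One way to request less data per call is to split the overall time range into smaller windows and issue one getData() request per window. The sketch below shows only the windowing step in Python; the helper name `split_time_range` and the six-hour window are hypothetical choices for illustration, and the request itself is deliberately not shown because its exact form depends on how you call the API.

```python
from datetime import datetime, timedelta


def split_time_range(start, end, window):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + window, end)
        yield cursor, window_end
        cursor = window_end


# Hypothetical example: one day of data requested in six-hour windows.
windows = list(
    split_time_range(
        datetime(2022, 4, 7, 0, 0),
        datetime(2022, 4, 8, 0, 0),
        timedelta(hours=6),
    )
)
for window_start, window_end in windows:
    # Issue one getData request per window here (request code not shown).
    print(window_start.isoformat(), "->", window_end.isoformat())
```

A suitable window size depends on the data volume per table, so it is best found by reducing the window until requests stop failing.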

getData does not support reference joins where the foreign key column name differs from the name of the column it refers to; in that case the getData reference join will fail. For example, if the trade table has a column sym that is a foreign key to the msym column in the market table, the join fails because the column names differ.

User-defined variables and functions in the Explore tab are shared across users. We recommend that you prefix all your variables with characters that are unique to your user.
Assemblies applied directly through kubectl cannot be edited in the User Interface. However, they can be queried via the Explore tab.
Passwords used for authenticating to third-party services are currently stored as part of the pipeline configuration. This will be corrected in the next release. We recommend using a Kubernetes secret instead of a plain password whenever possible.
Currently, when users are logged out, they are not notified via the UI. If users consistently encounter errors, we suggest they log out and in again.
The Explore tab allows users to execute arbitrary code within the cluster. To lock down the environment, this can be restricted with role-based authentication using Keycloak. Restrict the roles relating to insights.scratch.*
The Explore tab will not print results to the console for multi-line Python code. Results are only printed if there is just one line of Python code.

On startup of pods, the following error might be observed once, roughly three minutes after a pod starts: "no acct for 3x period, exiting". This stems from a temporary startup job not shutting down correctly. It is independent of the main processes and doesn't indicate any application fault.
On initial startup of kdb Insights Enterprise, some noise may be printed in the logs while the system initialises: "unable to flush accounting logs". This relates to the capturing of consumption-based license logs and is thrown while all pods get into a running state. It does not indicate any fault in the application, and all data should be flushed correctly after a short period.

The KX ODBC driver is not currently available for use on Windows. Because the Java SDK requires the ODBC driver to be installed, neither SDK is currently available on Windows.
When either the Java SDK or the ODBC driver is in use, if the Insights Information Service fails to respond to the SDK for at least 1 minute, the SDK deems its own configuration too old, disconnects, and stops publishing into Insights. This can be observed when the control plane is inaccessible (for example, during the default GCP maintenance window at 5am UTC, if it hasn't been disabled). Any application using the SDKs needs to account for this and introduce restart logic that restarts the SDK and forces a reconnect attempt.

Sample extract from the application log:

2022-05-24 20:30:48.103035      INFO    1       Started replicator to :34.142.40.180:5000, pid 41697
2022-05-24 20:31:48.127821      INFO    1       Process 41697 terminated
2022-05-24 20:31:48.128196      INFO    1       Started replicator to :34.142.40.180:5000, pid 41714
2022-05-24 20:36:12.700002      INFO    1       Process 41714 terminated
2022-05-24 20:36:12.700467      INFO    1       Started replicator to :34.142.40.180:5000, pid 41772
2022-05-25 05:04:50.077286      INFO    0       New configuration fetched
2022-05-25 05:04:50.092259      WARN    0       Config parse error at offset 0: Expected opening bracket
2022-05-25 05:05:50.046686      ERROR   0       Config is too old
2022-05-25 05:05:50.046813      INFO    0       Waiting for the driver to sync the data
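
The restart logic mentioned above could take many forms. The following Python sketch shows one possible shape, under stated assumptions: `run_publisher` is a hypothetical callable that starts your SDK-based publisher, blocks while it runs, and returns the log lines it produced once it stops. Neither it nor `supervise` is part of the Java SDK or ODBC driver, and the only detail taken from the log extract is the "Config is too old" error text.

```python
import time

# Marker taken from the sample log extract above.
STALE_CONFIG_MARKER = "Config is too old"


def needs_restart(log_line):
    """Return True if a log line reports the stale-configuration error."""
    return "ERROR" in log_line and STALE_CONFIG_MARKER in log_line


def supervise(run_publisher, max_restarts=5, backoff_seconds=30.0, sleep=time.sleep):
    """Run a publisher and restart it whenever it reports a stale configuration.

    `run_publisher` is a hypothetical callable (an assumption of this sketch)
    that returns the publisher's log lines once the publisher has stopped.
    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        log_lines = run_publisher()
        if not any(needs_restart(line) for line in log_lines):
            return restarts  # stopped for another reason; do not loop
        if restarts >= max_restarts:
            raise RuntimeError("publisher kept failing with a stale configuration")
        restarts += 1
        sleep(backoff_seconds)  # give the control plane time to recover
```

Capping the number of restarts and pausing between attempts avoids a tight restart loop while the control plane is still unavailable; the limits shown are arbitrary.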

Backward Compatibility

If you're upgrading from a previous version or the CRD is already installed, Helm will not update the definition automatically. See Operator Overview Helm Warning for details on how to rectify this.

License changes

From 1.1.0 kdb Insights has moved to consumption-based licensing using a new license file, kx.lic. The user should generate the license themselves using the klic self-service tool. The documentation for this is available here. After following the klic docs, a new kx.lic should be available for your cluster.

Then you can upload it as a Kubernetes secret using the instructions here.

Alternatively, delete your existing license secret and re-run the CLI install (kxi install ..), choosing N when prompted whether you have an existing license secret. Assuming the license secret is called kxi-license, the command below will delete it.

kubectl delete secret kxi-license

To revert to the pre-1.1.0 behaviour, the following Helm values should be set for the deployment. For details on this file see here.

global:
  license:
    ..
    onDemand: true
    asFile: false

kxi-acc-svc:
  enabled: false

Single pod Storage Manager

By default, when upgrading to 1.1.0, the Storage Manager will no longer create independent containers for the various processes used to manage data persistence. These processes now all run as child processes in a single container. As part of the upgrade, we encourage (but do not require) upgrading any deployed assembly data workflows to align their resources with this change.

The change requires removing the resources configuration from the eoi, eod and dbm sections of an assembly and updating sm.k8sPolicy with a resources configuration covering all of these processes, which run as a single container within the sm pod.

For example:

  sm:
    ..
    k8sPolicy:
      resources:
        requests:
          memory: "14Gi"
          cpu: "2000m"
        limits:
          memory: "14Gi"
          cpu: "4000m"

However, if no action is taken, kdb Insights Enterprise remains backwards compatible with the existing configuration: it sums all CPU and memory resources under the sm element and uses that sum to populate the sm.k8sPolicy.resources section.
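
To estimate what a combined resources section might need to contain, the per-process values can be added up by hand. The Python sketch below illustrates that arithmetic only; the per-process figures are hypothetical, the helper names are invented for this example, and exactly which sections the platform includes in its own sum is not specified here.

```python
def cpu_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_mebibytes(quantity):
    """Convert a memory quantity with a Mi or Gi suffix to mebibytes."""
    if quantity.endswith("Gi"):
        return int(float(quantity[:-2]) * 1024)
    if quantity.endswith("Mi"):
        return int(float(quantity[:-2]))
    raise ValueError(f"unsupported memory quantity: {quantity}")


# Hypothetical pre-1.1.0 resource requests for the sm processes.
legacy_requests = {
    "eoi": {"cpu": "500m", "memory": "4Gi"},
    "eod": {"cpu": "1", "memory": "8Gi"},
    "dbm": {"cpu": "500m", "memory": "2Gi"},
}

total_cpu = sum(cpu_millicores(r["cpu"]) for r in legacy_requests.values())
total_memory = sum(memory_mebibytes(r["memory"]) for r in legacy_requests.values())

# 14336Mi is the same amount as 14Gi.
combined_requests = {"cpu": f"{total_cpu}m", "memory": f"{total_memory}Mi"}
print(combined_requests)
```

With these illustrative inputs the totals come to 2000m of CPU and 14336Mi (14Gi) of memory; the same calculation can be repeated for limits.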