# Upgrading
As new features are added to the kdb Insights Database, APIs and state may change. This document outlines considerations when upgrading between different versions of the kdb Insights Database.
## Upgrade Compatibility
For any change that would otherwise break compatibility, an upgrade-compatible path is always provided within minor release versions. Features may be deprecated in a minor release and removed in a major release upgrade.
## Upgrading to 1.8
### Configuring error trapping in 1.8
The environment variables controlling error trapping, called `KXI_SM_ERROR_TRAP` and `KXI_DA_ERROR_TRAP` in previous versions, have been replaced with a single variable, `KXI_ERROR_TRAP`. For single-container DAP and SM, error trapping in child processes is controlled by `KXI_WORKER_ERROR_TRAP`.
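For example, where a deployment previously exported separate variables, both can be replaced with the single setting. The trap level value shown here is illustrative only; consult the configuration reference for your release for the supported values.

```bash
# Before 1.8 (deprecated):
export KXI_SM_ERROR_TRAP=2
export KXI_DA_ERROR_TRAP=2

# 1.8 onwards, one variable for both DAP and SM:
export KXI_ERROR_TRAP=2
# Child processes of single-container DAP and SM:
export KXI_WORKER_ERROR_TRAP=2
```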
## Upgrading to 1.7
### IPC in 1.7
In 1.7, new security measures have been added for IPC connections to database services, restricting ad-hoc connections. Attempting to send an ad-hoc IPC request by directly sending an IPC message to a database process may result in the error `IPC execution restricted. Rejecting function`. All IPC communication must use a supported API. To disable this feature and allow arbitrary connections, set `KXI_SECURE_ENABLED=false` in the environment variable configuration of each target process.
!!! warning "Disabling IPC security"
    It is recommended that IPC security remain enabled for all production deployments. Disabling this level of security can allow users to modify internal state or access data they are not privileged to see.
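As a sketch of the behaviour, assuming `h` is a handle opened directly to the Service Gateway (the hostname and port are illustrative):

```q
h:hopen `:insights-qe-gateway:5050;    / illustrative hostname and port

/ ad-hoc execution is rejected when IPC security is enabled:
h"select from trace"                   / 'IPC execution restricted. Rejecting function

/ supported APIs remain callable:
args:`table`startTS`endTS!(`trace; .z.p-0D00:05:00; .z.p);
h(`.kxi.getData; args; `; ()!())
```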
### Sandboxes in 1.7
When configuring a sandbox process, the environment variable `SBX_MAX_ROWS` could be provided to set the maximum number of rows for a sandbox process. This variable has been renamed with a `KXI_` prefix and is now `KXI_SBX_MAX_ROWS`. The old name will be supported for one additional release but is deprecated.
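For example (the row limit shown is hypothetical):

```bash
# Deprecated name, supported for one more release:
export SBX_MAX_ROWS=100000
# Preferred from 1.7 onwards:
export KXI_SBX_MAX_ROWS=100000
```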
## Upgrading to 1.5
### Labels in 1.5
Querying with labels changed in 1.5: labels are now a distinguished top-level parameter. Previously, labels could be included directly as parameters in requests to query APIs such as `getData` or `sql`. Labels must now be nested under a `labels` object, or prefixed with `label_` in SQL. This change resolves unresolvable collisions between labels and custom API parameters or table columns. For example, if you had a table with a column called `region` and an assembly with a label called `region`, referencing `region` in a query was ambiguous and could produce undesired results. With this change, `region` in a query always refers to the table column, and the `labels` object is used to refer to the label called `region`.
!!! note "Deprecation notice"
    In the 1.5 release, the old label style is still supported but is deprecated and results in a warning log in the Resource Coordinator. An environment variable, `ALLOW_OLD_LABEL_STYLE`, has been added to the Resource Coordinator to preserve the pre-1.5 behaviour; it defaults to `"true"`. While enabled, both the old and new label parameter styles are allowed in the same query. In 1.6, this defaults to `"false"` but can be re-enabled by overriding the environment variable setting. In 2.0, this feature will be removed entirely.
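    For example, to keep the pre-1.5 behaviour on a 1.6 Resource Coordinator deployment:

    ```bash
    export ALLOW_OLD_LABEL_STYLE=true
    ```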
#### Upgrading
A slight modification is required to convert queries to the new format. Requests must now specify labels as a top-level parameter for `getData` requests or custom APIs. For SQL requests, labels in the query must carry the `label_` prefix.
#### Get Data
For full API details, see the `getData` reference page.
!!! note "Gateway URL"
    The `GATEWAY` variable below is defined as an IPC connection to the Service Gateway. For example, `` `:insights-qe-gateway:5050 `` would connect to the query environment gateway within an `insights` namespace.
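    A minimal sketch of opening such a handle (the hostname and port are illustrative):

    ```q
    GATEWAY:hopen `:insights-qe-gateway:5050
    ```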
Prior to 1.5, the `region` label in a request would be top level in the argument field.
```q
args: (!) . flip (
    (`table  ; `trace);
    (`region ; `$"us-east-1");
    (`startTS; .z.p - 0D00:05:00); // Select the last 5 minutes of data
    (`endTS  ; .z.p)
    )
GATEWAY (`.kxi.getData; args; `; ()!())
```
In 1.5, this should now be nested under a `labels` object.
```q
args: (!) . flip (
    (`table  ; `trace);
    (`labels ; enlist[`region]!enlist`$"us-east-1");
    (`startTS; .z.p - 0D00:05:00); // Select the last 5 minutes of data
    (`endTS  ; .z.p)
    )
GATEWAY (`.kxi.getData; args; `; ()!())
```
!!! note "Gateway URL"
    The `$GATEWAY` variable should point at your kdb Insights install. For a microservice install, this is the hostname of the install, using port 8080. For an enterprise install, this is your `$INSIGHTS_HOSTNAME` with `/servicegateway` as the URL prefix.
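    For example (hostnames are illustrative):

    ```bash
    # Microservice install:
    GATEWAY=http://insights-host:8080
    # Enterprise install:
    GATEWAY=https://$INSIGHTS_HOSTNAME/servicegateway
    ```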
Prior to 1.5, the `region` label in a request would be top level in the argument field.
```bash
curl -X POST "$GATEWAY/kxi/getData" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -H "Authorization: Bearer $INSIGHTS_TOKEN" \
    -d "$(jq -n \
        --arg startTS "$(date -u '+%Y.%m.%dD%H:00:00')" \
        --arg endTS "$(date -u '+%Y.%m.%dD%H:%M:%S')" \
        '{
            table   : "trace",
            region  : "us-east-1",
            startTS : $startTS,
            endTS   : $endTS
        }' | jq -cr .)"
```
In 1.5, this should now be nested under a `labels` object.
```bash
curl -X POST "$GATEWAY/kxi/getData" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -H "Authorization: Bearer $INSIGHTS_TOKEN" \
    -d "$(jq -n \
        --arg startTS "$(date -u '+%Y.%m.%dD%H:00:00')" \
        --arg endTS "$(date -u '+%Y.%m.%dD%H:%M:%S')" \
        '{
            table   : "trace",
            labels  : { region: "us-east-1" },
            startTS : $startTS,
            endTS   : $endTS
        }' | jq -cr .)"
```
#### SQL
For full API details, see the `sql` reference page.
!!! note "Gateway URL"
    The `GATEWAY` variable below is defined as an IPC connection to the Service Gateway. For example, `` `:insights-qe-gateway:5050 `` would connect to the query environment gateway within an `insights` namespace.
Prior to 1.5, the `exchange` label in a query would be referenced directly.
```q
query: "select date,sym,avg(price) from trade ",
    "where (date between '2021.01.01' and '2021.01.07') ",
    "and (exchange='nyse') group by date,sym";
GATEWAY (`.kxi.sql; enlist[`query]!enlist query; `; ()!())
```
In 1.5, this should now have a `label_` prefix.
```q
query: "select date,sym,avg(price) from trade ",
    "where (date between '2021.01.01' and '2021.01.07') ",
    "and (label_exchange='nyse') group by date,sym";
GATEWAY (`.kxi.sql; enlist[`query]!enlist query; `; ()!())
```
!!! note "Gateway URL"
    The `$GATEWAY` variable should point at your kdb Insights install. For a microservice install, this is the hostname of the install, using port 8080. For an enterprise install, this is your `$INSIGHTS_HOSTNAME` with `/servicegateway/qe` as the URL prefix.
Prior to 1.5, the `exchange` label in a query would be referenced directly.
```sql
select date,sym,avg(price) from trade
where (date between '2021.01.01' and '2021.01.07') and (exchange='nyse')
group by date,sym
```
In 1.5, this should now have a `label_` prefix.
```sql
select date,sym,avg(price) from trade
where (date between '2021.01.01' and '2021.01.07') and (label_exchange='nyse')
group by date,sym
```
This example uses the above query, set as a variable called `$QUERY`.
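For example, using the 1.5-style query above:

```bash
QUERY="select date,sym,avg(price) from trade where (date between '2021.01.01' and '2021.01.07') and (label_exchange='nyse') group by date,sym"
```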
```bash
curl -X POST "$GATEWAY/kxi/sql" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -H "Authorization: Bearer $INSIGHTS_TOKEN" \
    -d "$(jq -n --arg query "$QUERY" '{ query: $query }' | jq -cr .)"
```
### Batch ingest in 1.5
To enable fault-tolerant batch ingest, a new `_batchIngest` signal has been added to the Storage Manager. To facilitate this change, any tickerplant-based streams require an additional table definition to support this signal.
(`$"_batchIngest")set ([] time:"n"$(); sym: `$(); session:`$(); address:`$(); callback:());
Additionally, if your install uses the `rt_tick_client_lib.q` provided with a previous install, it must be upgraded to the newly supplied version to support dictionary signals.
#### TP client code
```{.q title="rt_tick_client_lib.q"}
// === internal tables without time/sym columns ===
.rt.NO_TIME_SYM:`$("_prtnEnd";"_reload";"_batchIngest";"_batchDelete";"_schemaChange")
.rt.IS_DICT:`$("_batchIngest";"_batchDelete";"_schemaChange")

// === rt publish and push functions ===
.rt.push:{'"cannot push unless you have called .rt.pub first"}; // will be overridden

.rt.pub:{[topic]
    if[not 10h=type topic;'"topic must be a string"];
    h:neg hopen hsym `$getenv `KXI_RT_NODES;
    .rt.push:{[nph;payload]
        x:$[98h=type x:last payload; value flip x; 99h=type x; enlist each value x; x];
        if[(t:first payload)in .rt.NO_TIME_SYM; x:(count[first x]#'(0Nn;`)),x];
        nph(`.u.upd;t;x);}[h;];
    .rt.push }

// === rt update and subscribe ===
if[`upd in key `.; '"do not define upd: rt+tick will implement this"];
if[`end in key `.u; '"do not define .u.end: rt+tick will implement this"];

if[not type key`.rt.upd; .rt.upd:{[payload;idx] '"need to implement .rt.upd"}];

.rt.sub:{[topic;startIdx;uf]
    if[not 10h=type topic;'"topic must be a string"];

    //connect to the tickerplant
    h:hopen hsym `$getenv `KXI_RT_NODES;

    //initialise our message counter
    .rt.idx:0;

    // === tick.q will call back to these ===
    upd::{[uf;t;x]
        if[not type x; x:flip(cols .rt.schema t)!x]; // for log replay
        if[t in .rt.NO_TIME_SYM; x:`time`sym _x];
        if[t in .rt.IS_DICT; x:first x];
        uf[(t;x);.rt.idx];
        .rt.idx+:1; }[uf];

    .com_kx_secure.addAPI`upd;

    .u.end:{.rt.idx:.rt.date2startIdx x+1};

    //replay log file and continue the live subscription
    if[null startIdx;startIdx:0W]; // null means follow only, not start from beginning

    //subscribe
    res:h "(.u.sub[`;`]; .u `i`L; .u.d)";
    .rt.schema:(!/)flip res 0; // used to convert arrays to tables during log replay

    //if start index is less than current index, then recover
    if[startIdx<.rt.idx:(.rt.date2startIdx res 2)+res[1;0]; .rt.recoverMultiDay[res[1];startIdx]]; }

//100 billion records per day
.rt.MAX_LOG_SZ:"j"$1e11;
.rt.date2startIdx:{("J"$(string x) except ".")*.rt.MAX_LOG_SZ};

.rt.recoverMultiDay:{[iL;startIdx]
    //iL - index and Log (as can be fed into -11!)
    i:first iL; L:last iL;
    //get all files in the same folder as the tp log file
    files:key dir:first pf:` vs last L;
    //get the name of the logfile itself
    fileName:last pf;
    //get all the lognameXXXX.XX.XX files (logname is sym by default - so usually the files are of the form sym2021.01.01, sym2021.01.02, sym2021.01.03, etc)
    files:files where files like (-10_ string fileName),"*";
    //from those files, get those with dates in the range we are interested in
    files:` sv/: dir,/:asc files where ("J"$(-10#/:string files) except\:".")>=startIdx div .rt.MAX_LOG_SZ;
    //set up upd to skip the first part of the file and revert to regular definition when you hit start index
    upd::{[startIdx;updo;t;x] $[.rt.idx>=startIdx; [upd::updo; upd[t;x]]; .rt.idx+:1]}[startIdx;upd];
    //read all of all the log files except the last, where you read up to 'i'
    files:0W,/:files; files[(count files)-1;0]:i;
    //reset .rt.idx for each new day and replay the log file
    {.rt.idx:.rt.date2startIdx "D"$-10#string x 1; -11!x}each files;
    };
```