SQL2
Introduction
SQL2 is a significant enhancement to the SQL functionality in kdb Insights, it is designed to address the memory, performance, and implementation limitations of its predecessor, SQL. The new implementation is now fully integrated with kdb Insights, eliminating previous compatibility issues and enabling a more seamless experience. SQL2 is built to work natively within the kdb Insights query architecture, ensuring that it evolves alongside the product, reducing the likelihood of compatibility issues that were common in previous versions.
The primary goal for SQL2 is to achieve substantial compliance with the PostgreSQL standard. While we may not reach 100% compliance due to the underlying kdb database architecture, this objective serves as our guiding principle for ongoing development and improvements.
The SQL2 API allows for the specification of ad-hoc SQL queries that can be distributed and aggregated across the system. However, the SQL2 API only supports a subset of the full SQL operators and functions specified here.
Supported Syntax
The SQL2 implementation supports the following query structures:
SELECT [DISTINCT] ... FROM TABLE [WHERE ...] [GROUP BY ...] [ORDER BY ...] [LIMIT ...] [OFFSET ...]
Warning
The SQL2 API does not support INSERT
, CREATE
, or DELETE
statements.
Query Examples
The following examples demonstrate the supported SQL query syntax:
SELECT * FROM quote
SELECT count(*) FROM trade
SELECT DISTINCT country,currency FROM exchange
SELECT sym,price FROM trade WHERE date='2024.01.26' ORDER BY price DESC LIMIT 10
SELECT min(ask) as min_price,max(bid) as max_price FROM quote WHERE label_exchange = 'nyse' AND time >= '2024.01.26 12:00:00'
Calling SQL vs SQL2
SQL2 is enabled by default in query environments. If you want to enable SQL2 in environments not primarily designed for queries, include .kxi.sql2
in the comma-separated list of allowed sandbox APIs in the KXI_ALLOWED_SBX_APIS
environment variable.
For example:
KXI_ALLOWED_SBX_APIS=".kxi.sql,.kxi.sql2"
-
If you don’t activate SQL2, all SQL queries continue to use the existing SQL API, as in versions earlier than 1.9.
-
Once enabled, the new SQL2 API aims to phase out the old SQL API over time as we implement new features. You don’t need to take any extra steps to use it, just call
.kxi.sql
as you normally would. If the SQL2 API can handle the query, it does. If it can’t, the system automatically falls back to the old SQL API.
Target SQL or SQL2 Explicitly in the Following ways
- To call SQL2 explicitly, use
.kxi.sql2
instead of.kxi.sql
. - To target a specific version in the API request, use the
opts
parameter and specify version=2 (for SQL2) or version=1 (for SQL).
For example:
Gateway URL
The GATEWAY
variable below is defined as an IPC connection to the Service Gateway. For example, `:insights-qe-gateway:5050
connects to the query environment gateway within an insights
namespace.
/ Target SQL2
GATEWAY (`.kxi.sql2;enlist[`query]!enlist("SELECT sym,realTime,price,size FROM trade LIMIT 5");`;()!())
/ Target SQL2
GATEWAY (`.kxi.sql;enlist[`query]!enlist("SELECT sym,realTime,price,size FROM trade LIMIT 5");`;enlist[`version]!(enlist 2))
/ Target SQL (version=1)
GATEWAY (`.kxi.sql;enlist[`query]!enlist("SELECT sym,realTime,price,size FROM trade");`;enlist[`version]!(enlist 1))
The body of the response is a tuple with a header and the payload.
Response Header
rcvTS | 2024.03.14D21:31:00.414000000
corr | 554ed9a1-e138-4948-8aa0-33bf371bbfe1
protocol | `gw
logCorr | "554ed9a1-e138-4948-8aa0-33bf371bbfe1"
client | `:10.1.1.1:5050
api | `.kxi.sql2
userName | `
version | 2
userID | 00000000-0000-0000-0000-000000000000
retryCount| 0
to | 2024.03.14D21:31:00.414000000
agg | `:10.1.1.1:5070
pvVer | 8
rpID | 0
refVintage| -0W
startTS | 2024.03.14D20:25:59.000000001
endTS | 0Wp
rc | 0h
ac | 0h
ai | ""
Inspecting the payload of the response, we see the query result.
sym realTime price size
----------------------------------------------------
JBMH.CAN 2024.01.15D00:00:04.500000000 127.9574 880
FNLH.CAN 2024.01.15D00:00:09.500000000 403.3123 853
DNDF.CAN 2024.01.15D00:00:14.500000000 357.7171 969
MGPJ.CAN 2024.01.15D00:00:19.500000000 87.40093 753
JBMH.CAN 2024.01.15D00:00:24.500000000 127.994 984
..
Gateway URL
The $GATEWAY
variable should point at your kdb Insights install. For a microservice install, this is the hostname using port 8080. For an enterprise install, this is your $INSIGHTS_HOSTNAME
with /servicegateway/qe
as the URL prefix.
SELECT sym,realTime,price,size FROM trade
This example uses the above query set as a variable called $QUERY
.
# Target SQL2
curl -X POST "$GATEWAY/kxi/sql2" \
--header "Content-Type: application/json" \
--header "Accepted: application/json" \
--data "{query: \"$QUERY\"}"
# Target SQL2
curl -X POST "$GATEWAY/kxi/sql" \
--header "Content-Type: application/json" \
--header "Accepted: application/json" \
--data "{query: \"$QUERY\" , opts: { version : 2 }}"
# Target SQL (version=1)
curl -X POST "$GATEWAY/kxi/sql" \
--header "Content-Type: application/json" \
--header "Accepted: application/json" \
--data "{query: \"$QUERY\" , opts: { version : 1 }}"
The body of the response is an object with a header and the payload.
{ "header": { ... }, "payload": { ... } }
Response Header
"header":{
"rcvTS":"2024-03-14T22:05:47.177000000",
"corr":"a284ade8-b710-430f-a071-a2d01d8bcc55",
"protocol":"gw",
"logCorr":"a284ade8-b710-430f-a071-a2d01d8bcc55",
"client":":10.1.1.1:30000",
"http":"json",
"api":".kxi.sql2",
"userName":"",
"userID":"00000000-0000-0000-0000-000000000000",
"retryCount":0,
"to":"2024-03-14T22:07:47.177000000",
"agg":":10.1.1.1:5070",
"pvVer":12,
"rpID":0,
"refVintage":-9223372036854775807,
"startTS":"2024-03-14T20:25:59.000000001",
"endTS":"2292-04-10T23:47:16.854775807",
"rc":0,
"ac":0,
"ai":""}
The query result is displayed in the response payload.
[
{
"sym":"JBMH.CAN",
"realTime":"2024-01-15T00:00:04.500000000",
"price":127.9574,
"size":880
},
{
"sym":"FNLH.CAN",
"realTime":"2024-01-15T00:00:09.500000000",
"price":403.3123,
"size":853
},
{
"sym":"DNDF.CAN",
"realTime":"2024-01-15T00:00:14.500000000",
"price":357.7171,
"size":969
},
{
"sym":"MGPJ.CAN",
"realTime":"2024-01-15T00:00:19.500000000",
"price":87.40093,
"size":753
},
{
"sym":"JBMH.CAN",
"realTime":"2024-01-15T00:00:24.500000000",
"price":127.994,
"size":984
}
]
Query Structure
Query Structure | SQL | SQL2 | Differences |
---|---|---|---|
SELECT ... FROM TABLE | Supported | Supported | Both SQL and SQL2 support basic SELECT queries. SQL2 has enhanced performance for queries. |
WHERE ... | Supported | Supported | Both SQL and SQL2 support WHERE clause for filtering. SQL2 has improved performance for large datasets. |
GROUP BY ... | Supported | Supported | Both SQL and SQL2 support GROUP BY clause for aggregation. SQL2 has added support for multiple aggregations in a single query. |
ORDER BY ... | Unsupported | Supported | SQL2 supports ORDER BY clause for sorting. |
LIMIT ... | Unsupported | Supported | SQL2 supports LIMIT clause returning a subset of query results. |
OFFSET ... | Unsupported | Supported | SQL2 supports OFFSET clause which skips the specified number of rows before returning query results. |
Differences SQL vs SQL2 vs PostgreSQL
Feature | SQL | SQL2 | PostgreSQL | Differences |
---|---|---|---|---|
Compliance with PostgreSQL standard | Partial | Substantial | Full | SQL2 has substantial compliance with the PostgreSQL standard. |
Support for ad-hoc SQL queries | Yes | Yes | Yes | Both SQL and SQL2 support ad-hoc SQL queries. SQL2 has improved performance for complex queries. |
Automatic fallback to old SQL API | No | Yes | No | SQL2 supports automatic fallback to the old SQL API, ensuring backward compatibility. |
Virtual column placement in SELECT * |
Last | First | N/A | When doing SELECT * FROM ... , virtual columns (date and labels) are the first columns in SQL2, rather than the last. |
Label column naming convention | Prefixed with label_{label_name} |
Prefixed with label_{label_name} |
No prefix | SQL supports both the old label style and the new label style. Therefore, virtual label columns are always prefixed with label_{label_name} in SQL. |
Logic for unspecified output column names | Follows kdb Insights Core SQL behavior | Default kdb+/q behavior | PostgreSQL behavior | SQL2 has slightly different logic when returning columns without explicitly specifying the output column name. Refer to the examples below. |
Trailing zeros in floating-point values | Retained | Not retained | Retained | SQL2 does not retain trailing zeros in floating-point values. Refer to the Precision documentation for more information on how kdb+ handles precision. |
Default column names behavior | Standard SQL | kdb+ default | Standard SQL | If you don’t explicitly specify column names in SQL2 queries, the default column names follow the default kdb+ behavior, not PostgreSQL. |
Column Name Examples
The default column names follow the default kdb+ behavior, not PostgreSQL. Refer to the following examples:
=== "Postgres"
```sql
postgres=# SELECT count(*) FROM trade;
count
-------
100
postgres=# SELECT min(price),max(price) FROM trade;
min | max
-----+-----
10 | 25
postgres=# SELECT price*size FROM trade;
?column?
----------
250
700
1350
2200
```
=== "kdb Insights"
```q
// Use the last referenced column (or `x` if none), suffixed with 1, 2, 3, ... for repeated names.
q)SELECT count(*) FROM trade
x
---
100
q)SELECT min(price),max(price) FROM trade
price price1
------------
10 25
q)SELECT price*size FROM trade
size
----
250
700
1350
2200
```
- kdb+ has the concept of typed nulls. However, there’s no
String
type null (equivalent of char or varchar), so the untyped null(::)
is the appropriate equivalent. More information on this will be provided as string functions are added. - Symbols and strings are not interchangeable. More information on this will be provided as string functions are added.
- Table and column names are case-sensitive. For example,
SELECT * FROM MYTABLE
is Not equivalent toSELECT * FROM mytable
.
Known Limitations
Please note the following limitations when using SQL2:
-
Sorting Direction: You must specify all
ORDER BY
columns in theSELECT
clause. -
String Comparisons: The
in
and=
operators in SQL2 currently do not support string comparisons. -
Date and Time Restrictions: When restricting the date or partition time column in
WHERE
clauses in SQL2, irreversible operations are not allowed. Here are some examples to illustrate this:// This is acceptable in SQL2. SELECT * FROM myTable WHERE date+1 = '2024-01-26' // This is not supported in SQL2. SELECT * FROM myTable WHERE date % 2 = 0
-
Selects on Set Operations: Currently only
LIMIT
andORDER BY
clauses are supported on the results ofUNION
,EXCEPT
andINTERSECT
queries. For example:// This is acceptable in SQL2. (SELECT sym , price FROM TRADE) UNION (SELECT sym , bid FROM QUOTE) ORDER BY sym LIMIT 5 // This is not supported in SQL2. SELECT sym, price as test FROM (SELECT sym , price FROM TRADE) UNION (SELECT sym , bid FROM QUOTE) WHERE sym='ABC"
-
Join Restrictions: Currently a single join between two tables is supported with a single join
on
clause. The join clause must be=
and both the left and right-hand columns must be the same kdb+ datatype. For more information see the Joins section. For example:// This is acceptable in SQL2. SELECT * FROM (SELECT sym , price FROM TRADE) t INNER JOIN (SELECT sym , bid FROM QUOTE) q ON t.sym=q.sym // This is not supported in SQL2. SELECT * FROM (SELECT sym , price FROM TRADE) t INNER JOIN (SELECT sym , bid FROM QUOTE) q ON t.price>q.bid
Routing and Labels
This section provides information on how to use routing and labels with the SQL2 API.
Enabling the API
By default, this API is only enabled in query environments. If you want to enable it in non query environments, you need to set the KXI_ALLOWED_SBX_APIS
in the RCs and DAPs where you want to enable it. For more information, see the Database configuration section.
Routing Decisions
Routing decisions across multiple assemblies are determined by specifying the desired assembly labels in the WHERE
clause of the SQL query. Alternatively, you can use the scope.assembly
parameter to target an individual assembly. For more information, see the Scope section.
Examples
Consider the following assemblies with their respective name
and labels
sections:
Labels in Queries
In a query, labels should be referenced with a label_
prefix. For example, the label class
would be referenced as label_class
in the query.
# Assembly tsx-equity.
name: tsx_eq
labels:
exchange: tsx
class: equity
# Assembly tsx-futures.
name: tsx_fut
labels:
exchange: tsx
class: futures
# Assembly nyse-equity.
name: nyse_eq
labels:
exchange: nyse
class: equity
# Assembly nyse-futures.
name: nyse_fut
labels:
exchange: nyse
class: futures
# Assembly lse-equity.
name: lse_eq
labels:
exchange: lse
class: equity
# Assembly lse-futures.
name: lse_fut
labels:
exchange: lse
class: futures
The following table shows how different queries target different assemblies:
Query | scope.assembly | Assemblies | Explanation |
---|---|---|---|
SELECT ... |
All | No where clause defaults to all. | |
SELECT ... WHERE label_exchange='tsx' |
tsx-equity , tsx-futures |
One exchange, all classes. | |
SELECT ... WHERE label_exchange='nyse' and class='equity' |
nyse-equity |
Specify both exchange and class. | |
SELECT ... WHERE label_exchange in ('tsx', 'lse') or class<>'futures' |
All except nyse-futures |
Logically equivalent to not(label_exchange='nyse' and class='futures') . |
|
SELECT ... |
tsx_eq |
tsx_eq |
Uniquely target assembly with specified name. |
SELECT ... WHERE label_exchange='tsx' |
tsx_eq |
tsx_eq |
As above. The labels of tsx-eq are more than required, but they are consistent with the labels of tsx . |
SELECT ... WHERE label_exchange='tsx' |
nyse_eq |
None (failure) | Label does not match nyse_eq . |
Labels can be used to target as many or as few assemblies as required. They also don’t need to be consistent across assemblies, allowing the creation of new assemblies with different labels within a running system. For example, if we deploy the following assembly to the above set:
# New assembly with different labels.
labels:
dev: john_doe
Then a query of the form SELECT ... WHERE label_dev='john_doe'
targets only this new assembly.