Linking columns
A link column is similar to a foreign key – it links the values of a column in a table to the values in a column in a second table.
The difference between them is that a foreign-key column is an enumeration over the key column of a keyed table, while a link column consists of indexes into an arbitrary column of an arbitrary table.
A link column is useful where a key column is not available. For example:
- A table can contain a link to itself to represent a parent-child relationship.
- Links can represent ‘foreign-key’ relationships between splayed tables, which cannot be keyed.
Overview
- Simple linked columns
- Linked columns in splayed tables
- Compound linked columns
- Linked columns in partitioned tables
- Linking across multiple KDB-X databases
- Summary
Simple linked columns
Using the financial data table as before, and a table of equity position data, we can establish a mapping similar to that used for complex foreign keys.
In this case, since the financials table remains unkeyed, the mapping is achieved by creating an index of integers that serve as lookup references. We use the Enumeration operator ! to establish the connection once we have mapped each row in the referencing table equityPositions to the corresponding row number in the referenced table financials:
q)equityPositions:([]sym:5#`A`B`C`D`E;size:5?10000;mtm:5?2.)
q)//Look up where the entries in the symbol column correspond to the
q)//rows in the now unkeyed ‘financials’ table, then store the
q)//references in the column ‘finLink’
q)financials:0!financials
q)update finLink:`financials!financials.sym?sym from `equityPositions
Much as before, the finLink column in the equityPositions table is identified as a foreign key to the financials table within the table metadata (even though it is not strictly a foreign key as before) and select, exec, update, and delete statements incorporating dot notation may again be used.
Appending additional rows to the equityPositions table must maintain the link to the financials table by providing the finLink column with the row index in the financials table that will be mapped to in each case. In contrast to a foreign-key mapping, no enumeration is present and there are therefore no restrictions on what row numbers are inserted:
q)`equityPositions insert (`A;200;2.;`financials!0)
,5
q)equityPositions
sym size mtm finLink
---------------------------
A 6927 1.266082 0
B 3700 1.150539 1
C 5588 0.01802349 2
D 5607 0.2896114 3
E 1666 1.541226 3
A 200 2 0
q)//Insert a sym not in the financials table, link to column 0
q)`equityPositions insert (`S;200;2.;`financials!0)
,6
q)//Insert the same sym again, link to an index not in the financials table
q)`equityPositions insert (`S;200;2.;`financials!6)
,7
q)equityPositions
sym size mtm finLink
------------------------------
A 6927 1.266082 0
B 3700 1.150539 1
C 5588 0.01802349 2
D 5607 0.2896114 3
E 1666 1.541226 3
A 200 2 0
S 200 2 0
S 200 2 6
q)//Unmapped data results in missing entries
q)select sym,finLink.earningsPerShare from equityPositions
sym earningsPerShare
--------------------
A
B 2.3
C 1.5
D 1.3
E
A 2.3
S 2.3
S
Linked columns in splayed tables
It is possible to create linked columns on tables that have already been splayed to disk.
//Create two tables and splay to disk
companyInfo:([]
sym:`a`b`c`d;
exchange:`NYSE`NDQ`NYSE`TSE;
sector:4?("Banking";"Retail";"Food Producers");
MarketCap:30000000+4?1000000 )
q)`:db/companyInfo/ set .Q.en[`:db] companyInfo
`:db/companyInfo/
q)t:([]sym:`a`b`c`a`b; ex:`NYSE`NDQ`LSE`NYSE`NDQ; price:5?100.)
q)`:db/t/ set .Q.en[`:db] t
`:db/t/
q)//Create a new column in ‘t’ linking to ‘companyInfo’ via the sym column
q)`:db/t/cLink set `companyInfo!(companyInfo`sym)?(t`sym)
`:db/t/cLink
//Update the .d file on disk so that it picks up the new column
q).[`:db/t/.d;();,;`cLink]
`:db/t/.d
q)get `:db/t/.d
`sym`ex`price`cLink
q)//Load the table to update the changes in memory
q)\l db/t
//Sample query
q)select sym,cLink.sector,cLink.MarketCap from t
sym sector MarketCap
-----------------------
a "Retail" 30886470
b "Banking" 30230906
c "Retail" 30352036
a "Retail" 30886470
Compound linked columns
Only a small adjustment to the single-column case is required to link tables together based on multiple columns. We demonstrate this by continuing with the above tables:
q)//We need to reload companyInfo otherwise the link example below will
q)//not execute properly since the enumerated sym columns from the
q)//splayed table ‘t’ have type 20h rather than 11h
q)\l db/companyInfo
q)//Initiate the mapping by flipping the columns to lists and searching
q)//on each sym/exchange combination
q)`:db/t/cLink2 set `companyInfo!(flip companyInfo`sym`exchange)?flip t`sym`ex
`:db/t/cLink2
//Update the .d file and reload the table once again
q).[`:db/t/.d;();,;`cLink2]
q)\l db/t
//Sample query, double link means sym 'c' does not map this time
q)select sym,cLink2.sector,cLink2.MarketCap from t
sym sector MarketCap
-----------------------
a "Banking" 30450974
b "Retail" 30909716
c ""
a "Banking" 30450974
b "Retail" 30909716
Linked columns in partitioned tables
Partitioned tables can have link columns provided the links do not span partitions. In particular, you cannot link across days for a table partitioned by date.
Creating a link column in a partitioned table is best done as each partition is written. The process then reduces to that for splayed tables.
Create a link between non-symbol columns in the simple partitioned tables t1 and t2.
First, create the first day’s tables with the link and save them to a partition.
q)t1:([] id:101 102 103; v:1.1 2.2 3.3)
q)t2:([] t1link:`t1!t1[`id]?103 101 101 102; n:10 20 30 40)
q)`:temp/db/2019.01.01/t1/ set t1
`:temp/db/2019.01.01/t1/
q)`:temp/db/2019.01.01/t2/ set t2
`:temp/db/2019.01.01/t2/
Do the same for the second day.
q)t1:([] id:104 105; v:4.4 5.5)
q)t2:([] t1link:`t1!t1[`id]?105 104 104; n:50 60 70)
q)`:temp/db/2019.01.02/t1/ set t1
`:temp/db/2019.01.02/t1/
q)`:temp/db/2019.01.02/t2/ set t2
`:temp/db/2019.01.02/t2/
Finally, restart KDB-X, map the tables and execute a query across the link.
$ q
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
q)\l temp/db
q)select date,n,t1link.v from t2 where date within 2019.01.01 2019.01.02
date n v
-----------------
2019.01.01 10 3.3
2019.01.01 20 1.1
2019.01.01 30 1.1
2019.01.01 40 2.2
2019.01.02 50 5.5
2019.01.02 60 4.4
2019.01.02 70 4.4
The final example is similar except that it creates a link over enumerated symbol columns.
q)/ day 1
q)t1:([] c1:`c`b`a; c2: 10 20 30)
q)`:temp/db/2019.01.01/t1/ set .Q.en[`:temp/db/; t1]
`:temp/db/2019.01.01/t1/
q)t2:([] c3:`a`b`a`c; c4: 1. 2. 3. 4.)
q)`:temp/db/2019.01.01/t2/ set .Q.en[`:temp/db/; update t1link:`t1!t1[`c1]?c2 from t2]
`:temp/db/2019.01.01/t2/
q)/ day 2
q)t1:([] c1:`d`a; c2: 40 50)
q)`:temp/db/2019.01.02/t1/ set .Q.en[`:temp/db/; t1]
`:temp/db/2019.01.02/t1/
q)t2:([] c3:`d`a`d; c4:5. 6. 7.)
q)`:temp/db/2019.01.02/t2/ set .Q.en[`:temp/db/; update t1link:`t1!t1[`c1]?c2 from t2]
`:temp/db/2019.01.02/t2/
q)/ remap
q)\l temp/db
q)select c3,t1link.c2,c4 from t2 where date within 2019.01.01 2019.01.02
c3 c2 c4
--------
a 30 1
b 20 2
a 30 3
c 10 4
d 40 5
a 50 6
d 40 7
A link column with domain of a partitioned table requires the encompassing table to be partitioned too.
Signals a par error since 4.1t 2022.04.15.
q).Q.dd[`:/tmp/db1;`2022.01.01`ecfmapping`] set .Q.en[`:/tmp/db1] ([]firmName:enlist "DUMMY")
q)\l /tmp/db1
q)select ecfmap.firmName from ([id:1 2];ecfmap:`ecfmapping!1 1)
'par
[0] select ecfmap.firmName from ([id:1 2];ecfmap:`ecfmapping!1 1)
Since 4.1t 2023.08.04,4.0 2023.08.11 references to linked columns under group by no longer require remapping the foreign column for every group.
Linking across multiple KDB-X databases
For practical purposes, KDB-X allows only one on-disk database to be memory-mapped to a process at any given time. Occasionally, it may be necessary to perform analytics across multiple databases simultaneously. Although it is possible to aggregate data from from various sources to a centralized location via IPC, this approach becomes impractical when working with large datasets spanning multiple days or weeks. In Unix-based operating systems such as Linux and macOS an alternative is to use symbolic links in conjunction with KDB-X linked columns. This method enables efficient retrieval and analysis of vast amounts of data while maintaining acceptable levels of RAM usage.
The following section outlines how to link a trade table in one partitioned database to a quote table in a separate database located on the same file network. You can easily generalize this method to link multiple tables across multiple databases. To do so, follow these three main steps:
- Create a partitioned database that contains the relevant tables across multiple dates. Include utility functions to support the creation of database links.
- Map the entries in the
tradetable to those in thequotetable for each date using a standard as-of join acrosstimeandsymcolumns. - Append a column linking the
tradetable to thequotetable based on this mapping and save this to disk.
Use .Q.ens and .Q.dpfts that accept an additional sym table argument instead of .Q.en and .Q.dpft to ensure that no sym file clashes occur across the two databases.
The following code defines the trade and quote tables and writes them to disk in databases db1 and db2 respectively:
//Table schemas, being with fresh trade and quote schemas
trade:([]
time:`time$();
sym:`$();
price:`float$();
size:`int$() );
quote:([]
time:`time$();
sym:`$();
bid:`float$();
bsize:`int$();
ask:`float$();
asize:`int$() )
//Number of entries in trade table
n:10000
//Start and end of day
st:08:00:00.000
et:17:00:00.000
syms:`A`B`C`D
tdata:(asc st+n?et-st;n?syms;n?100f;n?1000)
insert[`trade;tdata]
//Generate 10x number of quotes
n*:10
qdata:(asc st+n?et-st;n?syms;n?100f;n?1000;n?100f;n?1000)
insert[`quote;qdata]
//Historical database builder function
buildHDB:{[dir;dt;t] .Q.dpft[dir;dt;`sym;t];}
//Partition the tables to disk
buildHDB[`:db1;;`trade] each .z.D-til 3
buildHDB[`:db2;;`quote] each .z.D-til 3
The first utility function creates a symlink in a directory basePath to a table rTab that exists in a remote directory remotePath. The symlink name will be the same name as rTab. For our purposes
basePathis the path to eachtradetableremotePaththe path to eachquotetablerTabis thequotetable name
All arguments are passed as strings.
//Check if the symlink exists and use Unix command ‘ln –s’ to create
//the symlink if not
.lc.createSymLink:{[basePath;remotePath;rTab]
(baseTablePath;remoteTablePath):{[path;rTab]
raze system "realpath ",path,"/",rTab}[;rTab] each (basePath;remotePath);
if[not(`$rTab) in key hsym `$basePath;
system "ln -s ",remoteTablePath," ",baseTablePath]; }
The next utility function maps the entries in the trade table to those in the quote table using an as-of join, constructs a link between the two tables and saves to disk using .Q.dpfts. The as-of join columns (usually time and sym) are passed in as a symbol list whereas the file path and table names are passed in as strings.
.lc.joinSaveTables:{[ajCols;basePath;dt;baseTable;remoteTable]
//Cast table names to syms for convenience
remoteTable:`$remoteTable;
baseTable:`$baseTable;
//Force-load the asof join columns from remote table into memory
remoteFileHandle:` sv (hsym `$string dt),remoteTable;
//We have to load the db to correctly get the sym file
pwd:raze system "pwd";
system "l ",basePath;
remoteTable set select sym,time from (get remoteFileHandle);
//Re-apply attributes of original table to the in-memory copy
![remoteTable;();0b;a[`c]!{(#;enlist x;y)} .'
flip value a:exec a,c from meta get remoteFileHandle where c in ajCols];
//Join tables and set the link column to be the point at which the
//tables map together
baseTable set aj[
ajCols;
select from value baseTable;
?[value remoteTable; (); 0b; (ajCols!ajCols),(enlist `id)!enlist `i] ];
update link:remoteTable!
(exec i from select i from value remoteTable)?id from baseTable;
// Splay base table to disk but use different, independent sym file `tsym
system "cd ",pwd;
.Q.dpfts[hsym `$basePath;dt;`sym;baseTable;`tsym]; }
Now that we have all the prerequisite utility functions defined, if we create the partitioned databases and run something like the following, a link will be created across all database partitions.
basePath:"db1"
remotePath:"db2"
baseTable:"trade"
remoteTable:"quote"
{.lc.createSymLink[raze basePath,"/",string x;raze remotePath,"/",string x;remoteTable]} each .z.D-til 3
.lc.joinSaveTables[`sym`time;basePath;;baseTable;remoteTable]each asc .z.D-til 3
Loading the database from memory, we achieve a trade table with an embedded link to the remote quote table.
q)\l db1
q)tables[]
`quote`trade
q)meta trade
c | t f a
-----| ---------
date | d
sym | s p
time | t
price| f
size | i
id | j
link | i quote
Aggregations may be carried out using a single table; queries will be very efficient especially if repeated due to caching:
// First run
q)\ts res:select size wavg price,bsize wavg bid,asize wavg ask by sym,10 xbar time.minute from select time,sym,size,price,link.ask,link.asize,link.bid,link.bsize from trade where date=max date
7 8917312
// Second run
q)\ts res:select size wavg price,bsize wavg bid,asize wavg ask by sym,10 xbar time.minute from select time,sym,size,price,link.ask,link.asize,link.bid,link.bsize from trade where date=max date
6 8915888
q)res
sym minute| price bid ask
----------| --------------------------
A 08:00 | 50.87719 49.76909 48.63532
A 08:10 | 48.9346 49.99889 52.01281
A 08:20 | 47.48985 49.68657 53.01129
A 08:30 | 50.03407 51.82779 47.96814
Summary
In this document we learned about linked columns. We read about
- what linked columns are
- how to create them in different kinds of tables
- how to have multiple linked columns in a single table
- how to link different databases together
Foreign keys
Q for Mortals
§8.5 Foreign Keys and Virtual Columns