Splayed tables
Medium-sized tables (up to 100 million rows) are best stored on disk splayed: each column is stored as a separate file, rather than using a single file for the whole table.
quotes/
├── .d
├── price
├── sym
└── time
The hidden file .d lists the columns in the order they appear in the table.
Tables that have many columns are good candidates for splaying, as most queries access only a small subset of those columns.
To save a table splayed, use set with the file handle in the left operand having a trailing slash to indicate saving to a directory.
q)`:dirname/ set table
`:dirname/
The table must be
- fully enumerated
- simple, not keyed
Example:
q)m1:1000000
q)q:86 91 81 55 / country dialling codes
q)edpn:{10000000+x?80000000} / eight-digit phone numbers
q)/a million calls with times, origin and destination codes and numbers
q)3#calls:([]tim:m1?23:59; occ:m1?q; ono:edpn m1; dcc:m1?q; dno:edpn m1)
tim occ ono dcc dno
-------------------------------
09:30 91 56809216 55 10241715
18:16 81 31860713 81 19766096
19:46 55 22477903 55 19899746
q)`:calls/ set calls
`:calls/
calls/
├── dcc
├── dno
├── occ
├── ono
└── tim
$ ls -a calls
. .. .d dcc dno occ ono tim
Nested columns
Nested columns are those where the column isn't a vector. In a splayed table, a nested column is represented in the filesystem by at least two files: one bears the column name, while the second file uses the same name suffixed with a #.
If the nested column contains symbols, a third file will be present with a ## suffix. This file is used to enumerate the symbols referenced in the column.
The type of the column, when read back in, is going to be anymap (type 77h), used for memory-mapped nested columns.
A common question in designing a table schema is whether to represent text as strings (nested) or symbols (vector). Where many values are repeated, as in stock-exchange codes, symbols have important advantages:
- atomic semantics in code
- compact storage
- fast execution
With fewer repeated values, these advantages dwindle, but the penalty of a bloated sym list remains. Fields such as comments or notes should always be strings.
Enumerating symbol columns
If a table contains columns of type symbol where the column is not enumerated, trying to save it splayed will result in a type error.
q)syms:`AMD`MSFT
q)ndates:5
q)dates:2019.10.03+til ndates
q)inits:24.5 27.37
q)(ends1;ends2):{c1:-0.25+ndates?0.5; c2:0f, -0.05+(ndates-1)?0.1; x+\raze flip(c2;c1)}each inits
q)((opens1;closes1);(opens2;closes2)):(2 0N)#/:(ends1;ends2)
q)tr:([]date:raze 2#enlist dates; open:opens1,opens2; close:closes1,closes2; sym:raze ndates#/:syms)
q)update high:((2*ndates)?1f)+open|close, low:(neg(2*ndates)?1f)+open&close from `tr
q)tr:`date xasc `date`open`high`low`close`sym xcols tr
q)tr
date open high low close volume sym
-------------------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 19087300 AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200 MSFT
2020.10.04 24.1 25.1 23.95 25.03 17869600 AMD
2020.10.04 27.39 27.96 27.37 27.94 82191200 MSFT
..
q)`:tr/ set tr
'type
Tables splayed across a directory must be fully enumerated and not keyed.
The solution is to enumerate symbol columns before saving the table splayed. This is done with the function .Q.en.
q).Q.en[`:dir] tr
date open high low close volume sym
-------------------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 19087300 AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200 MSFT
2020.10.04 24.1 25.1 23.95 25.03 17869600 AMD
2020.10.04 27.39 27.96 27.37 27.94 82191200 MSFT
..
This assigns to the variable sym the list of unique symbols in the table:
q)sym
`AMD`MSFT
It also creates the directory dir with a file in it, named sym, with the same contents:
q)\ls dir
"sym"
q)value `:dir/sym
`AMD`MSFT
Finally, it returns a table where columns of symbols are enumerated.
q)trenum: .Q.en[`:dir] tr
q)trenum.sym
`sym$`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT
The value of this column differs from that of the original table tr:
q)tr.sym
`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT
The enumerated table can now be saved splayed.
q)`:dir/tr/ set trenum
`:dir/tr/
The columns are saved separately, one per file:
q)\ls -a dir/tr
,"."
".."
".d"
"close"
"date"
"high"
"low"
"open"
"sym"
"volume"
This can also be done in a single step, without saving the enumerated table into a variable:
q)`:dir/tr/ set .Q.en[`:dir] tr
`:dir/tr/
Notice that the sym list is stored separately from the table. If the symbols are common to several tables it may be convenient to write the sym list in their common parent directory.
db/
├── sym
├── quotes/
└── tr/
Since table tr has a column sym, represented by file db/tr/sym, one place you cannot write the sym list is in db/tr.
Loading splayed tables
There are various ways to load or read a splayed table.
- Start q with the directory:
q calls
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..
q).z.f
`calls
q)calls
tim occ ono dcc dno
-------------------------------
17:26 55 13464625 55 47918725
10:26 86 15538442 91 72388386
23:23 81 50209480 91 88260320
..
- Load the directory:
q)\l calls
`calls
q)load`calls
`calls
‘Loading’ the table in fact maps it to memory. None of it is written into memory by the load itself.
Warning
Going into the directory of the splayed table and loading from there with \l . will not only not work, it will also delete all variables in the main namespace in the current session!
- Get the table:
q)get`:calls
tim occ ono dcc dno
-------------------------------
17:26 55 13464625 55 47918725
10:26 86 15538442 91 72388386
23:23 81 50209480 91 88260320
..
Symbol columns in a splayed table are stored as enumerations of a list sym, stored separately from the table.
Retrieving the table without the sym list leaves its symbol columns as bare enumerations.
q)get `:db/tr
date time vol sym price
-----------------------------------
2020.07.03 11:23 12378 0 23.57101
2020.07.02 11:36 63682 1 24.06836
..
q)load `:db/sym
`sym
q)get `:db/tr
date time vol sym price
------------------------------------
2020.07.03 11:23 12378 APPL 23.57101
2020.07.02 11:36 63682 MSFT 24.06836
..
When q loads a directory it loads everything in it that represents a KDB-X object. So it is convenient to store a shared sym list in the root directory of a database.
db/
├── sym
├── quotes/
└── tr/
q db
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..
q)sym
`APPL`MSFT`AMD`IBM
q)tr
date time vol inst price
------------------------------------
2020.07.03 11:23 12378 APPL 23.57101
2020.07.02 11:36 63682 MSFT 24.06836
..
Upserting into splayed tables
Sometimes we'd only like to grow the table by a few rows at a time. For this, we can use the built-in upsert keyword but with the same requirements as creating a new table from scratch, meaning we'll have to add the potential symbol columns to our enumeration:
q)\l .
q)trn:(date:2020.11.01;open:24.5 27.47;high:24.78 27.98;low:24.23 27.12;close:24.45 27.55;volume:20342454 63421434;sym:`AMD`MSFT)
q)`:tr/ upsert .Q.en[`:.;trn]
`:tr/
Changing the schema of splayed tables
Some keywords do not work with splayed tables, in particular those that would change the schema. We can still work around this limitation if needed by changing the underlying files that make up the table.
Consider the following table:
q)tr
date open high low close volume sym
------------------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 19087300 AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200 MSFT
..
Removing a column is a simple matter:
q)delete volume from tr
date open high low close sym
---------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 AMD
2020.10.03 27.37 27.48 27.21 27.37 MSFT
..
But that does not work with a splayed table:
q)trade
date open high low close volume sym
------------------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 19087300 AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200 MSFT
..
q)delete volume from `trade
'splay
q)trade: delete volume from trade
'splay
The .d file contains the schema of a splayed database:
q)value `:trade/.d
`date`open`high`low`close`volume`sym
To remove a column from the schema, we can simply remove the column name from this file:
q).[`:trade/.d;();:;`date`open`high`low`close`sym]
`:trade/.d
For the changes to take effect we need to reload the splayed table.
q dir
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..
q)\v
`s#`sym`trade
q)trade
date open high low close sym
---------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 AMD
..
Notice that the file with the volume column has not been deleted. It is just not being used.
q)\ls trade
"close"
"date"
"high"
"low"
"open"
"sym"
"volume"
Adding a new column is similar: save the column contents as a file in the directory of the splayed table and update the file .d.
q)@[`:trade;`newcol;:;til 8]
`:trade
q).[`:trade/.d;();,;`newcol]
`:trade/.d
Verify the changes by loading the splayed table into a q session.
q dir
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..
q)trade
date open high low close sym newcol
----------------------------------------------
2020.10.03 24.5 24.51 23.79 24.13 AMD 0
2020.10.03 27.37 27.48 27.21 27.37 MSFT 1
2020.10.04 24.1 25.1 23.95 25.03 AMD 2
..
Splaying large files
Flat tables
Partitioned tables
Segmented databases
Q for Mortals
§11.3 Splayed Tables
Q for Mortals
§14.2 Splayed Tables