Send Feedback
Skip to content

Splayed tables

Medium-sized tables (up to 100 million rows) are best stored on disk splayed: each column is stored as a separate file, rather than using a single file for the whole table.

quotes/
├── .d
├── price
├── sym
└── time

The hidden file .d lists the columns in the order they appear in the table.

Tables that have many columns are good candidates for splaying, as most queries access only a small subset of those columns.

To save a table splayed, use set with the file handle in the left operand having a trailing slash to indicate saving to a directory.

q)`:dirname/ set table
`:dirname/

The table must be

  • fully enumerated
  • simple, not keyed

Example:

q)m1:1000000
q)q:86 91 81 55 / country dialling codes
q)edpn:{10000000+x?80000000} / eight-digit phone numbers

q)/a million calls with times, origin and destination codes and numbers
q)3#calls:([]tim:m1?23:59; occ:m1?q; ono:edpn m1; dcc:m1?q; dno:edpn m1)
tim   occ ono      dcc dno
-------------------------------
09:30 91  56809216 55  10241715
18:16 81  31860713 81  19766096
19:46 55  22477903 55  19899746

q)`:calls/ set calls
`:calls/
calls/
├── dcc
├── dno
├── occ
├── ono
└── tim
$ ls -a calls
.   ..  .d  dcc dno occ ono tim

Nested columns

Nested columns are those where the column isn't a vector. In a splayed table, a nested column is represented in the filesystem by at least two files: one bears the column name, while the second file uses the same name suffixed with a #.

If the nested column contains symbols, a third file will be present with a ## suffix. This file is used to enumerate the symbols referenced in the column.

The type of the column, when read back in, is going to be anymap (type 77h), used for memory-mapped nested columns.

A common question in designing a table schema is whether to represent text as strings (nested) or symbols (vector). Where many values are repeated, as in stock-exchange codes, symbols have important advantages:

  • atomic semantics in code
  • compact storage
  • fast execution

With fewer repeated values, these advantages dwindle, but the penalty of a bloated sym list remains. Fields such as comments or notes should always be strings.

Enumerating symbol columns

If a table contains columns of type symbol where the column is not enumerated, trying to save it splayed will result in a type error.

q)syms:`AMD`MSFT
q)ndates:5
q)dates:2019.10.03+til ndates
q)inits:24.5 27.37
q)(ends1;ends2):{c1:-0.25+ndates?0.5; c2:0f, -0.05+(ndates-1)?0.1; x+\raze flip(c2;c1)}each inits
q)((opens1;closes1);(opens2;closes2)):(2 0N)#/:(ends1;ends2)
q)tr:([]date:raze 2#enlist dates; open:opens1,opens2; close:closes1,closes2; sym:raze ndates#/:syms)
q)update high:((2*ndates)?1f)+open|close, low:(neg(2*ndates)?1f)+open&close from `tr
q)tr:`date xasc `date`open`high`low`close`sym xcols tr

q)tr
date       open  high  low   close volume    sym
-------------------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 19087300  AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200  MSFT
2020.10.04 24.1  25.1  23.95 25.03 17869600  AMD
2020.10.04 27.39 27.96 27.37 27.94 82191200  MSFT
..
q)`:tr/ set tr
'type

Tables splayed across a directory must be fully enumerated and not keyed.

The solution is to enumerate symbol columns before saving the table splayed. This is done with the function .Q.en.

q).Q.en[`:dir] tr
date       open  high  low   close volume    sym
-------------------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 19087300  AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200  MSFT
2020.10.04 24.1  25.1  23.95 25.03 17869600  AMD
2020.10.04 27.39 27.96 27.37 27.94 82191200  MSFT
..

This assigns to the variable sym the list of unique symbols in the table:

q)sym
`AMD`MSFT

It also creates the directory dir with a file in it, named sym, with the same contents:

q)\ls dir
"sym"
q)value `:dir/sym
`AMD`MSFT

Finally, it returns a table where columns of symbols are enumerated.

q)trenum: .Q.en[`:dir] tr
q)trenum.sym
`sym$`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT

The value of this column differs from that of the original table tr:

q)tr.sym
`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT`AMD`MSFT

The enumerated table can now be saved splayed.

q)`:dir/tr/ set trenum
`:dir/tr/

The columns are saved separately, one per file:

q)\ls -a dir/tr
,"."
".."
".d"
"close"
"date"
"high"
"low"
"open"
"sym"
"volume"

This can also be done in a single step, without saving the enumerated table into a variable:

q)`:dir/tr/ set .Q.en[`:dir] tr
`:dir/tr/

Notice that the sym list is stored separately from the table. If the symbols are common to several tables it may be convenient to write the sym list in their common parent directory.

db/
├── sym
├── quotes/
└── tr/

Since table tr has a column sym, represented by file db/tr/sym, one place you cannot write the sym list is in db/tr.

Working with sym files

Loading splayed tables

There are various ways to load or read a splayed table.

  • Start q with the directory:
q calls
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..

q).z.f
`calls
q)calls
tim   occ ono      dcc dno
-------------------------------
17:26 55  13464625 55  47918725
10:26 86  15538442 91  72388386
23:23 81  50209480 91  88260320
..
  • Load the directory:
q)\l calls
`calls

q)load`calls
`calls

‘Loading’ the table in fact maps it to memory. None of it is written into memory by the load itself.

Warning

Going into the directory of the splayed table and loading from there with \l . will not only not work, it will also delete all variables in the main namespace in the current session!

  • Get the table:
q)get`:calls
tim   occ ono      dcc dno
-------------------------------
17:26 55  13464625 55  47918725
10:26 86  15538442 91  72388386
23:23 81  50209480 91  88260320
..

Symbol columns in a splayed table are stored as enumerations of a list sym, stored separately from the table.

Retrieving the table without the sym list leaves its symbol columns as bare enumerations.

q)get `:db/tr
date       time  vol   sym price
-----------------------------------
2020.07.03 11:23 12378 0   23.57101
2020.07.02 11:36 63682 1   24.06836
..
q)load `:db/sym
`sym
q)get `:db/tr
date       time  vol   sym  price
------------------------------------
2020.07.03 11:23 12378 APPL 23.57101
2020.07.02 11:36 63682 MSFT 24.06836
..

When q loads a directory it loads everything in it that represents a KDB-X object. So it is convenient to store a shared sym list in the root directory of a database.

db/
├── sym
├── quotes/
└── tr/
q db
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..

q)sym
`APPL`MSFT`AMD`IBM
q)tr
date       time  vol   inst price
------------------------------------
2020.07.03 11:23 12378 APPL 23.57101
2020.07.02 11:36 63682 MSFT 24.06836
..

Upserting into splayed tables

Sometimes we'd only like to grow the table by a few rows at a time. For this, we can use the built-in upsert keyword but with the same requirements as creating a new table from scratch, meaning we'll have to add the potential symbol columns to our enumeration:

q)\l .
q)trn:(date:2020.11.01;open:24.5 27.47;high:24.78 27.98;low:24.23 27.12;close:24.45 27.55;volume:20342454 63421434;sym:`AMD`MSFT)
q)`:tr/ upsert .Q.en[`:.;trn]
`:tr/

Changing the schema of splayed tables

Some keywords do not work with splayed tables, in particular those that would change the schema. We can still work around this limitation if needed by changing the underlying files that make up the table.

Consider the following table:

q)tr
date       open  high  low   close volume   sym
------------------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 19087300 AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200 MSFT
..

Removing a column is a simple matter:

q)delete volume from tr
date       open  high  low   close sym
---------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 AMD
2020.10.03 27.37 27.48 27.21 27.37 MSFT
..

But that does not work with a splayed table:

q)trade
date       open  high  low   close volume   sym
------------------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 19087300 AMD
2020.10.03 27.37 27.48 27.21 27.37 39386200 MSFT
..
q)delete volume from `trade
'splay
q)trade: delete volume from trade
'splay

The .d file contains the schema of a splayed database:

q)value `:trade/.d
`date`open`high`low`close`volume`sym

To remove a column from the schema, we can simply remove the column name from this file:

q).[`:trade/.d;();:;`date`open`high`low`close`sym]
`:trade/.d

For the changes to take effect we need to reload the splayed table.

q dir
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..
q)\v
`s#`sym`trade
q)trade
date       open  high  low   close sym
---------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 AMD
..

Notice that the file with the volume column has not been deleted. It is just not being used.

q)\ls trade
"close"
"date"
"high"
"low"
"open"
"sym"
"volume"

Adding a new column is similar: save the column contents as a file in the directory of the splayed table and update the file .d.

q)@[`:trade;`newcol;:;til 8]
`:trade
q).[`:trade/.d;();,;`newcol]
`:trade/.d

Verify the changes by loading the splayed table into a q session.

q dir
KDB-X 0.1.2 2025.10.18 Copyright (C) 1993-2025 Kx Systems
..

q)trade
date       open  high  low   close sym  newcol
----------------------------------------------
2020.10.03 24.5  24.51 23.79 24.13 AMD  0
2020.10.03 27.37 27.48 27.21 27.37 MSFT 1
2020.10.04 24.1  25.1  23.95 25.03 AMD  2
..

Splaying large files
Flat tables
Partitioned tables
Segmented databases
Q for Mortals §11.3 Splayed Tables Q for Mortals §14.2 Splayed Tables