8. Tables
8.0 Overview
Tables are first class entities in q, meaning they are data structures that live in memory just like lists or dictionaries. A q table is a collection of named columns implemented as a dictionary. Consequently, q tables are column-oriented, in contrast to the row-oriented tables in relational databases.
Moreover, since lists are ordered so are columns, in contrast to SQL where the order of rows is undefined. The fact that q tables comprise ordered column lists makes kdb+ very efficient at storing, retrieving and manipulating sequential data. One important example is time series data.
kdb+ handles relational and time series data in the unified environment of q tables. There is no separate data definition language, no separate stored procedure language and no need to map internal representations to a separate form for persistence. Just q tables, expressions and functions.
Tables are built from dictionaries, so it behooves the cursory reader to review Chapter 5 before proceeding.
8.1 Table Definition
8.1.1 Review of Table as Column Dictionary
We summarize our findings on column dictionaries from 5.4. We began with a rectangular collection of named column lists.
q)show dc:([name:`Dent`Beeblebrox`Prefect; iq:98 42 126])
name| Dent Beeblebrox Prefect
iq | 98 42 126
q)dc[`iq;]
98 42 126
q)dc[;2]
name| `Prefect
iq | 126
q)dc[`iq; 2]
126
Transpose it with flip to get a table.
show t:flip ([name:`Dent`Beeblebrox`Prefect; iq:98 42 126])
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
q)t[;`iq]
98 42 126
q)t[2;]
name| `Prefect
iq | 126
q)t[2;`iq]
126
We repeat our findings about a table defined this way.
-
It is a two dimensional data structure that uses an integer index in the first slot and a symbolic column name in the second slot.
-
Specifying only an integer in the first slot retrieves a section dictionary across that index – i.e., a record.
-
Specifying only a column name in the second slot retrieves that column.
-
Specifying both an integer and a column name retrieves the "field" at that row in that column.
-
A table is also logically a list of section dictionaries.
-
The only effect of flipping the column dictionary is to reverse the order of its indices; no data is rearranged under the covers.
All tables have type 98h.
type t
98h
The easy way to extract a table column is by eliding the row index. For table names only, you can omit the leading semicolon in the syntax.
t[;`iq]
98 42 126
q)t[`iq]
98 42 126
q)t[`name`iq]
Dent Beeblebrox Prefect
98 42 126
Since it is possible to retrieve values from a dictionary using dot notation, this is also true for tables.
q)t.name
`Dent`Beeblebrox`Prefect
Tip
Unfortunately this notation doesn't work inside functions so we recommend not using it.
8.1.2 Table Display
Observe that the row and column display of a table is indeed the transpose of the dictionary display. This reflects the transposed indices, even though the internal data layout is the same.
q)dc
name| Dent Beeblebrox Prefect
iq | 98 42 126
q)t
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
Tip
For Luddites who prefer the more compact table display of q2.3, use .Q.s1 or 0N!. The latter suppresses nil items in its output for the truly fastidious.
q).Q.s1 ([] c1:`a`b`c; c2:10 20 30)
"+`c1`c2!(`a`b`c;10 20 30)"
q)0N!([] c1:`a`b`c; c2:10 20 30); / note trailing ;
+`c1`c2!(`a`b`c;10 20 30)
8.1.3 Table Definition Syntax
Constructing a table by flipping a column dictionary is useful to build the table on the fly since everything can be parameterized, including column names. There is an alternative syntax to define tables that makes things more convenient for simple tables.
([] c1:L1; ...; cn:Ln)
Here ci is a column name and Li is the corresponding list of column values. The Li are expressions that reduce to lists of equal count; they can be atoms as long as at least one is a list. The brackets are used to specify a primary key and will be explained in §8.3.3. For readability, in this tutorial we shall often include optional whitespace after the closing square bracket and to the right of semicolon separators.
Important
The semicolons in table definition syntax are not assignment. They are markers in the syntactic sugar that separate column names from column values.
Tip
Do not omit the square brackets. The interpreter will not complain and you will end up with a list, rather than a table, and the colons will be assignment.
Here is how to define our table using table definition syntax, which is arguably simpler.
q)t:([] name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
q)t~flip `name`iq!(`Dent`Beeblebrox`Prefect;98 42 126)
1b
As with dictionary definition syntax, you can also make a table from variables and the column names will be taken from the variable names.
q)c1:`Dent`Beeblebrox`Prefect
q)c2:98 42 126
q)([] c1; c2)
c1 c2
--------------
Dent 98
Beeblebrox 42
Prefect 126
Tip
To see a sorted list of the names of all tables and keyed tables in the current workspace issue the commend \a. In a fresh q session:
q)t1:([] c1:`a`b; c2:10 20)
q)kt1:([k:`a`b] v:10 20)
q)\a
`kt1`t1
Any valid q expression can appear to the right of the colon in table definition syntax; it is evaluated as part of the table construction. You must ensure that the resulting column lists all have the same length.
q)([] c1:1+til 5; c2:5#42)
c1 c2
-----
1 42
2 42
3 42
4 42
5 42
Provided you specify at least one column as a list, atoms will be extended to match.
q)([] c1:`a`b`c; c2:42; c3:98.6)
c1 c2 c3
----------
a 42 98.6
b 42 98.6
c 42 98.6
Using an atom in this way might appear to assign a default value for subsequent insertions. It does not.
Tip
You cannot define a single row table using atoms. You must make a list of at least one of the atoms.
q)([] c1:`a; c2:100)
'rank
q)([] c1:enlist `a; c2:100)
c1 c2
------
a 100
q)([] c1:1#`a; c2:100)
c1 c2
------
a 100
Alternatively you can create a single row table by enlisting a record dictionary.
q)enlist ([c1:`a; c2:10])
c1 c2
-----
a 10
If you create a table as the flip of a column dictionary, item-wise extension of at atom is performed when the column dictionary is flipped into a table.
q)([c1:`a`b`c; c2:42; c3:1.1])
c1| `a`b`c
c2| 42
c3| 1.1
q)flip ([c1:`a`b`c; c2:42; c3:1.1])
c1 c2 c3
---------
a 42 1.1
b 42 1.1
c 42 1.1
If you omit column names(s) default name(s) will be created as `x then
`x1, `x2 ... as necessary.
q)([] 1 2; `a`b)
x x1
----
1 a
2 b
8.1.4 Table Metadata
The column names of a table can be retrieved as a list of symbols with cols.
q)cols t
`name`iq
The function meta can be applied to a table to retrieve its metadata. The result is a keyed table (see 8.4.1) with one record for each column in the original table.
- The key column
cof the result contains the column names. - The column
tcontains the type char of the column. - The column
fcontains the domain for a foreign key or link column. - The column
acontains attributes associated with the column.
q)meta t
c | t f a
----| -----
name| s
iq | j
Advanced
When (meta) displays an upper case type char for a column, this indicates that column is a compound list in which each
field is a simple list of the indicated type. Such tables arise, for example, when you group without aggregating in a query. Here is one created manually. Observe closely the upper case 'J' in the t column for column c2.
q)meta ([] c1:1 2 3; c2:(1 2; enlist 3; 4 5 6))
c | t f a
--| -----
c1| j
c2| J
The function tables takes a symbol that is the name of a context (see 11.1) and returns a sorted symbolic list with the names of tables in that context. For example, we list all tables in the root context in a fresh q session.
q)t2:([] c1:1 2 3; c2:(1 2; enlist 3; 4 5 6))
q)t:([] name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
q)tables `.
`s#`t`t2
Alternatively, the command \a provides the same result. If no argument is provided, it returns the result for the current context.
8.1.5 Records
Since a table is logically a list of records, count returns the number of records. In our example,
q)count t
3
Now let's inspect the sequence of records.
q)t
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
q)t[0]
name| `Dent
iq | 98
q)t[1]
name| `Beeblebrox
iq | 42
q)t[2]
name| `Prefect
iq | 126
Since a record slices across the table display horizontally, this motivates calling each record a table row. The record associates column names with the values in a physical row. To retrieve the naked values of a row - i.e., without the column names, simply apply value to the record dictionary.
q)value t[1]
`Beeblebrox
42
8.1.6 Flipped Column Dictionary vs. List of Records
Is a table a flipped column dictionary or a list of records? Logically it is both, but physically it is stored as a column dictionary. In fact, q dynamically recognizes a conforming list of dictionaries that could be formed into a table and reorganizes the data into columnar form automatically. It doesn't ask for permission nor seek forgiveness.
To verify this, we first create a list of non-conforming dictionaries that differ in the italicized key names. As expected it has type 0.
q)type (`name`iq!(`Dent;98); `nome`iq!(`Beeblebrox;42))
0h
Once we make the names agree so that the records conform, type tells us that q has indeed converted the list to a table.
q)type (`name`iq!(`Dent;98); `name`iq!(`Beeblebrox;42))
98h
q)(`name`iq!(`Dent;98); `name`iq!(`Beeblebrox;42))
name iq
-------------
Dent 98
Beeblebrox 42
This is no mere illusion. The data have been reorganized into columns.
Tip
Here is a useful special case of this phenomenon. Start with a simple dictionary and enlist it to create a singleton. What is this?
q)enlist `a`b!10 20
a b
-----
10 20
q)enlist ([a:10; b:20])
a b
-----
10 20
It is a table because it is a singleton list of records that conform. (This is a good q test question).
You might think that you don't want q to do this automatic reorganization, but you would be wrong. In general, column retrieval and manipulation will be significantly faster than operations on a list of dictionaries, especially if the columns are simple - i.e., they are stored contiguously.
Tip
A downside of tables being stored as columns is that row deletion is an expensive operation because all the column lists must be compressed to close the resulting gap. The best way to deal with this in large tables is not to do it. Instead of deleting a row, use a separate column that holds a flag indicating whether the row has been deleted and then exclude the "deleted" rows in queries. You can reorg the table and remove the deleted records during down time.
8.1.7 Pattern Matching
The table pattern can deconstruct a table into unadorned columns. See 10.4 for the full treatment of pattern matching.
q)t:([] c1:10 20 30; c2:98.6 99 99.5)
q)([]c1:qty;c2:px):t
q)qty
10 20 30
q)px
98.6 99 99.5
As with dictionaries, not all columns need be present in the pattern to match. Here is how to extract an naked column vector without using
exec.
q)([]c1:px):t
q)px
10 20 30
8.2 Empty Tables and Schema
We saw in the previous section that a table can be defined and populated in one step using table definition syntax.
q)t:([] name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
Fully listing columns with literal values is usually done only with smaller tables - e.g., lookup tables. Large tables are usually created programmatically from computed data or data read from files or received over the wire.
In these circumstances, it is useful to create an empty table initially and then populate it later by appending in place. You could do this with general empty lists.
q)t:([] name:(); iq:())
All columns here are lists of general type, so data of any type can be appended in the first record. Should the first item being appended to an empty column be an atom (a common case), the column becomes a singleton list with that atom - i.e., a simple list of that type. As with all simple lists, type checking will thereafter be enforced for all inserts and updates to that column. Consequently all subsequent appends must match the initial one.
q)`t insert (`nuba; 150)
,0
q)meta t
c | t f a
----| -----
name| s
iq | j
This situation is fine provided you are guaranteed that all data will be match the initial record. Unfortunately, real world data can be unpredictable. Should the first record be bad, type checking will reject all subsequent good data.
q)`t insert (`devi; 155.5)
'type
Recommendation: It is good practice to specify the types of all columns in an empty table. In the table definition, cast an empty list to the appropriate type.
q)meta ([] name:`symbol$(); iq:`long$())
c | t f a
----| -----
name| s
iq | j
A shorter, and slightly less obvious technique is the following.
q)([] name:0#`; iq:0#0)
name iq
-------
q)([] name:0#`; iq:0#0)~ ([] name:`symbol$(); iq:`long$())
1b
Whichever form you use, it ensures that only data of the appropriate type can be appended.
Tip
When someone asks for the schema of your table, simply reply with meta t and hand them 0#t.
q)meta 0#([] c1:1 2; c2:`a`b)
c | t f a
--| -----
c1| j
c2| s
8.3 Basic select and update
We shall cover qSQL in depth in Chapter 9, but we provide an introduction here in order to retrieve and display data in our examples. We use the following sample table in this section.
q)t:([] name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
8.3.1 Syntax of select
Basic select is a template that takes the form,
select *cols* from *table*
where table is either a table or a keyed table and cols is a comma-separated list of columns. This expression results in a list of all records for the specified columns. This corresponds to the SQL statement,
SELECT * FROM *table*
In q you do not write the wildcards when you want all columns in the table; simply omit the column list and you get them all.
This basic select syntax may seem comfortably familiar from SQL, but it should seem odd to the q newbie who is just getting accustomed to parsing expressions right-to-left. Neither 'select' nor 'from' represent functions that can stand alone. Instead, they are part of a template and must appear together. There are many options for the basic select template whose elements appear between the 'select' and 'from' or after the table element.
8.3.2 Displaying the Result
The result of select is always a table. You can think of it as a list of (conforming) records but it is constructed as a column dictionary.
q)select from t
name iq
--------------
Dent 42
Beeblebrox 98
Prefect 126
8.3.3 Selecting Columns
To select specific columns, list them comma-separated, sequenced in left-to-right order between select and from. You can optionally provide result column names using the same colon format as table definition syntax.
q)select name from t
name
----------
Dent
Beeblebrox
Prefect
q)select c1:name, c2:iq from t
c1 c2
--------------
Dent 98
Beeblebrox 42
Prefect 126
8.3.4 Basic update
The syntax of update is the same as select, but specifying columns indicates replacement with the values to the right of the colon. In our example everyone gets smarter.
q)update iq:iq+5 from t
name iq
--------------
Dent 103
Beeblebrox 47
Prefect 131
8.4 Primary Keys and Keyed Tables
In SQL, one can declare one or more column(s) of a table as a primary key. This means that the values in the column(s) are unique over the domain of the rows, making it possible to identify and retrieve a row via its key value. These two features motivate how q implements a keyed table.
8.4.1 Keyed Table
We begin with a simple key - i.e., the key is a single column of simple type. The approach is to place the key column in a separate table parallel to a table containing the remaining columns. So we have a table of keys aligned with a table of values. Since a table is logically a list of records, this is the same as a list of key records and list of value records. We establish a positional correspondence between these lists via a dictionary mapping.
A keyed table is a dictionary mapping a table of key records to a table of value records. This represents a mapping from each row in a table of (presumably unique) keys to a corresponding row in a table of values - i.e., a positional correspondence of key rows to value rows. Using dictionary lookup on a key (record) retrieves the corresponding value record in the remaining columns. This is just what a primary key should do. Note that the key mapping assumes that the key records and value records are in corresponding order.
Important
A keyed table is not a table - it is a dictionary and so has type 99h.
Keys should be unique but (sadly) this is not enforced. As we have already noted, dictionary creation using (!) does not enforce key
uniqueness. A value row associated with a duplicate key is not accessible via key lookup, but it can be retrieved via a select on the key column.
8.4.2 Simple Example
Let's see how this works for our previous example. We begin with a flipped column dictionary to make things explicit.
q)v:flip ([name:`Dent`Beeblebrox`Prefect; iq:98 42 126])
Now say we want to add a key column eid containing employee identifiers.
q)k:flip ([eid:1001 1002 1003])
Now establish the mapping between the two tables.
q)kt:k!v
Voilà! The console display of a keyed table is the combination of dictionary display and table display. It lists the key column(s) on the left, separated by a vertical bar from the value columns on the right.
q)kt
eid | name iq
----| --------------
1001| Dent 98
1002| Beeblebrox 42
1003| Prefect 126
8.4.3 Keyed Table Definition Syntax
Fundamentalists might insist on constructing a keyed table as a dictionary of flipped dictionaries. Most folks prefer to use table definition syntax.
Note
Due to space limitations on the printed page we cannot fit our keyed table examples on a single line, so we wrap the lines, with column alignment for readability. You cannot enter such a multiline statement into the q console, but you can place it in a script. Alternatively, while you are following this tutorial line by line, you can copy/paste the individual lines onto a single line in the q console.
Here is the fundamental form of our keyed table.
q)kt:(flip ([eid:1001 1002 1003])!
flip ([name:`Dent`Beeblebrox`Prefect; iq:98 42 126])
It is arguably simpler to use table definition syntax. This is a generalization of simple table definition in which key column(s) are placed between the square brackets and the value columns are after the square brackets as in normal table definition.
q)kt:([eid:1001 1002 1003]
name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
To define an empty keyed table, use empty key and value columns.
q)ktempty:([eid:()] name:(); iq:())
As with regular tables, to ensure data integrity empty columns should be typed with either of the following constructs.
q)ktempty:([eid:`int$()] `$name:(); iq:`long$())
q)ktempty:([eid:0#0] name:0#`; iq:0#0)
8.4.4 Accessing Records of a Keyed Table
Since a keyed table is a dictionary mapping, it provides access to records in the value table via key lookup. Remember that the records in the key table and value table are section dictionaries.
q)kt ([eid:1002])
name| `Beeblebrox
iq | 42
This is a cumbersome way to retrieve by key.
Fortunately, you can abbreviate the full dictionary specification of a key record to its key value. Our example reduces to,
q)kt 1002
name| `Beeblebrox
iq | 42
Now we can lookup the value for an individual column.
q)kt[1002][`iq]
42
Or we can use the equivalent index at depth notation.
q)kt[1002;`iq]
42
After the customary moment of q Zen meditation, we realize that the net effect of placing a key on a table has been to convert indexing of the rows from row number to key value.
8.4.5 Retrieving Multiple Records
We have seen the shortcut to lookup a single record in a keyed table by the naked key value.
q)kt[1001]
name| `Dent
iq | 98
You might think it is possible to retrieve multiple records from a keyed table via a simple list of keys. You would be wrong.
q)kt[1001 1002]
`length
Tip
This does work for a compound key - i.e., a multi-column key - just not for a single column key.
To lookup multiple key values in a keyed table, we can use a list of enlisted keys.
q)kt (enlist 1001; enlist 1002)
name iq
-------------
Dent 98
Beeblebrox 42
q)kt flip enlist 1001 1002
name iq
-------------
Dent 98
Beeblebrox 42
This is still pretty cumbersome, so back to the drawing board. Since we are supposed to provide a list of key records for lookup, we just create an anonymous table whose records are precisely the form we need.
q)kt ([] eid:1001 1002)
name iq
-------------
Dent 98
Beeblebrox 42
Now that's slick!
With such a nifty way to create a list of key records in hand, we recall from 5.2.3 that a sub-dictionary can be extracted using a list of keys as the left operand of (#). Applying this to our keyed table,
q)([] eid:1001 1002)#kt
eid | name iq
----| -------------
1001| Dent 98
1002| Beeblebrox 42
The Zen of keyed tables.
8.4.6 Reverse Lookup
Because a keyed table is a dictionary, it is possible to perform reverse lookup from value records to key records. Let's start with an example having a single value column.
q)kts:([eid:1001 1002 1003] name:`Dent`Beeblebrox`Prefect)
As in the case of key lookup, we can use an anonymous table with a list of value records.
q)kts?([] name:`Prefect`Dent)
eid
----
1003
1001
8.4.7 Components of a Keyed Table
Since a keyed table is a dictionary mapping the table of keys to the table of values, the functions key and value retrieve its
constituent tables.
q)key kt
eid
----
1001
1002
1003
q)value kt
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
The function keys returns a list of the key column name(s).
q)keys kt
,`eid
Observe that (cols) retrieves all columns of the keyed table - i.e., from both the key and value tables.
q)cols kt
`eid`name`iq
8.4.8 Tables vs. Keyed Tables
It is possible to convert dynamically between a regular table having column(s) of potential key values and the corresponding keyed table using binary primitive xkey. The right operand is the source table/keyed table and the left operand is a symbol (or list of symbols) with the column name(s) to be used as the key.
q)t:([] eid:1001 1002 1003;
name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
q)`eid xkey t
eid | name iq
----| --------------
1001| Dent 98
1002| Beeblebrox 42
1003| Prefect 126
Conversely, to convert a keyed table to a regular table, use xkey with an empty general list as the left operand.
q)kt:([eid:1001 1002 1003]
name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
q)() xkey kt
eid name iq
-------------------
1001 Dent 98
1002 Beeblebrox 42
1003 Prefect 126
The xkey conversions above work on copies and do not affect the original table. Use call-by-name to modify the original.
q)`eid xkey `t
`t
q)t
eid | name iq
----| --------------
1001| Dent 98
1002| Beeblebrox 42
1003| Prefect 126
q)() xkey `kt
`kt
q)kt
eid name iq
-------------------
1001 Dent 98
1002 Beeblebrox 42
1003 Prefect 126
If xkey is applied with a column that does not contain unique values, the result is not an error but rather a keyed table that does not have a true primary key.
q)tdup:([] eid:1001 1002 1003 1001;
name:`Dent`Beeblebrox`Prefect`Dup)
q)ktdup:`eid xkey t
q)ktdup
eid | name
----| ----------
1001| Dent
1002| Beeblebrox
1003| Prefect
1001| Dup
Duplicate key values are not accessible via key lookup.
q)ktdup 1001
name| Dent
They are accessible via select.
q)select from ktdup where eid=1001
eid | name
----| ----
1001| Dent
1001| Dup
8.4.9 Compound Primary Key
The q implementation of a keyed table as a dictionary mapping between a pair of tables carries over unchanged to compound keys. Recall that a compound key in SQL is a collection of multiple columns that together provide a unique value for each row. A compound key in q is simply a table association k!v in which k has multiple columns. Presumably each record in the key table has a unique combination of field values, but this is not checked.
Here is our galactic travelers table redone to replace the employee ID with a compound key comprising the last and first names.
q)ktc:([lname:`Dent`Beeblebrox`Prefect;
fname:`Arthur`Zaphod`Ford];
iq:98 42 126)
Observe that the console displays a compound keyed table with the key columns on the left of the vertical bar and the value columns to the right.
q)ktc
lname fname | iq
-----------------| ---
Dent Arthur| 98
Beeblebrox Zaphod| 42
Prefect Ford | 126
Here is lookup by a compound key record,
q)ktc[`lname`fname!`Beeblebrox`Zaphod]
iq| 42
q)ktc ([lname:`Beeblebrox; fname:`Zaphod])
iq| 42
As with a simple key, we can abbreviate a full key record to the key value for key lookup.
q)ktc[`Dent`Arthur]
iq| 98
An empty keyed table can be typed using table definition syntax with either of the following,
q)ktc:([lname:`symbol$();fname:`symbol$()] iq:`long$())
q)ktc:([lname:0#`;fname:0#`] iq:0#0)
Note
Placing key column(s) inside the square brackets is consistent with the notation for a (regular) table since a regular table has no keys. You will also observe that dictionary definition syntax looks like a keyed table with no non-key columns.
For the fundamentalists, here is the same compound keyed table built from its constituent column dictionaries constructed from scratch.
q)ktc:(flip `lname`fname!
(`Dent`Beeblebrox`Prefect;`Arthur`Zaphod`Ford))!
flip (enlist `iq)!enlist 98 42 126
Most will agree that the table definition syntax is simpler.
8.4.10 Retrieving Records with a Compound Primary Key
Unlike a simple key, we can lookup multiple value records with a naked list of compound keys.
q)ktc (`Dent`Arthur;`Prefect`Ford)
iq
---
98
126
Of course the nifty construct with an anonymous table works with compound keys too.
q)ktc ([] lname:`Dent`Prefect; fname:`Arthur`Ford)
iq
---
98
126
As does the use of (#) to retrieve a sub keyed table for a list of keys.
q)K:([] lname:`Dent`Prefect; fname:`Arthur`Ford)
q)K#ktc
lname fname | iq
--------------| ---
Dent Arthur| 98
Prefect Ford | 126
8.4.11 Extracting Column Data
In this section we use the following example tables. The first has a simple key of long and the second a compound key of symbols.
q)show kts:([k:101 102 103] v1:`a`b`c; v2:1.1 2.2 3.3)
k | v1 v2
---| ------
101| a 1.1
102| b 2.2
103| c 3.3
q)show ktc:([k1:`a`b`c; k2:`x`y`z] v1:`a`b`c; v2:1.1 2.2 3.3)
q_ktc
k1 k2| v1 v2
-----| ------
a x | a 1.1
b y | b 2.2
c z | c 3.3
In the previous section we saw how to retrieve all the value columns using an anonymous table of keys.
q)kts ([] k:101 103)
v1 v2
------
a 1.1
c 3.3
q)ktc ([] k1:`a`c; k2:`x`z)
v1 v2
------
a 1.1
c 3.3
Often we need to extract the naked column data from some value columns. No problem for the fundamentalist. Since the result of the lookup is a sub-table of the value table, we can index into it to get the column list(s) using indexing at depth.
q)kts[([] k:101 103); `v1]
`a`c
q)ktc[([] k1:`a`c; k2:`x`z); `v1`v2]
`a 1.1
`c 3.3
8.4.12 Pattern Matching
In this section show the basic pattern for pattern matching with a keyed table. See 10.4 for a full treatment of pattern matching.
Keyed tables can be deconstructed using the table pattern.
q)kt:([k:`aa`bb`cc] c1:10 20 30; c2:98.6 99 99.5)
q)([k:sym]c2:px):kt
q)sym
`aa`bb`cc
q)px
98.6 99 99.5
Observe that the key column must be present in the pattern for a keyed table match.
q)([]c1:qty;c2:px):kt
'nyi
[0] ([]c1:qty;c2:px):kt
^
8.5 Foreign Keys and Virtual Columns
A foreign key in SQL is a column in one table whose values are members of a primary key column in another table. Foreign keys are the mechanism for establishing relations between tables.
An important feature of a foreign key is that the RDBMS enforces referential integrity, meaning that values in the foreign key column are required to be in the related primary key column. Before you can insert a row having a foreign key field value that is not there, you must first ensure there is a row with that primary key in the related table.
8.5.1 Definition of Foreign Key
A foreign key in q should provide a relation with referential integrity (in one direction, at least). Can we implement this with a construct we already know in q? The setting is a collection of fields in a column list that are drawn from the (supposedly) unique field values in another column list. Sound familiar? It is precisely the situation of an enumeration!
A foreign key is one or more table columns whose values are defined as an enumeration over the key column(s) of a keyed table. As in the case of symbol enumeration (`sym$), the enumeration restricts foreign key values to be in the list of primary key values, hence establishing referential integrity.
8.5.2 Example of Simple Foreign Key
We return to the galactic travelers keyed table.
q)kt:([eid:1001 1002 1003]
name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
Suppose we have a table with the detail records of the results of repeated IQ tests of the travelers.
([] eid:1003 1001 1002 1001 1002 1001; sc:126 92 36 98 42 101)
This table has no restriction on eid other than it be a long. To ensure that only eid values for actual travelers can be entered, we make the eid column in this table a foreign key related to the eid column in kt. This is done by enumerating over the name of the keyed table - i.e., `kt$.
Tip
When q sees the name of a keyed table in an enumeration domain it knows to use the list of key records.
Here is the enumeration of the column in isolation.
q)`kt$1003 1001 1002 1001 1002 1001
`kt$1003 1001 1002 1001 1002 1001
As in the case of symbol enumeration, q looks up the index of each foreign key value in the list of key records and, under the covers, replaces the field value with that index. As with symbols, the enumeration is displayed in reconstituted form instead of as the underlying indices. To see the underlying indices, cast to an integer.
q)`long$`kt$1003 1001 1002 1001 1002 1001
2 0 1 0 1 0
As usual, the enumeration can be substituted for the original in normal operations.
q)1001=`kt$1003 1001 1002 1001 1002 1001
010101b
And now, the moment of truth. Does the enumeration provide referential integrity?
q)`kt$1004
'cast
It does. Attempting to enumerate a value that is not in the primary key column causes an error.
We put this together with table definition syntax to define a tdetails table with a foreign key over kt.
q)tdetails:([] eid:`kt$1003 1001 1002 1001 1002 1001;
sc:126 92 36 98 42 101)
Observe that a foreign key is denoted by the name of the target keyed table in the f column in the output of (meta).
q)meta tdetails
c | t f a
---| ------
eid| j kt
sc | j
The built-in function fkeys applied to a table (or keyed table) returns a dictionary in which each foreign key column name is mapped to its primary key table name.
q)fkeys tdetails
eid| kt
8.5.3 Resolving a Foreign Key
When you wish to resolve a foreign key - i.e., get the actual values instead of enumerated values - apply (value) to the enumerated column.
q)meta update value eid from tdetails
c | t f a
---| -----
eid| j
sc | j
Observe that there is no longer an entry in the f column.
8.5.4 Foreign Keys and Relations
In SQL, joins are used to splice back together data that has been normalized via relations. The splice is done along a foreign key that establishes a relation to the primary key. In the result, columns from both tables are available using dot notation.
This effect for an inner join is achieved in q using foreign keys but you don't need to perform the join explicitly. The notation is similar, but the operation is different enough to warrant close attention.
Let tf be a table having a foreign key column f enumerated over a keyed table kt. All columns in kt are available via dot notation in a select expression whose from domain is tf. To access a column c in kt, use the notation f.c in the select expression. This column takes the name c by default in the result.
In our galactic travelers example, we can access columns in kt via a select on tdetails. Here is a query that retrieves a
"foreign" column in addition to one from tdetails.
q)select eid.name, sc from tdetails
name sc
--------------
Prefect 126
Dent 92
Beeblebrox 36
Dent 98
Beeblebrox 42
Dent 101
There is an implicit left join between tdetails and kt here.
8.6 Link Columns
Link columns are infrequently used in q, largely because they have been scantily documented and are encountered only as a fallback when it is discovered that foreign key columns can't be splayed. We hope this section helps to remedy that.
A link column is similar to a foreign key, in that its entries are indices of rows in a table The advantages of a link column are:
-
The target can be a table or keyed table.
-
The target can even be the table containing the link column.
-
Link columns can be splayed or partitioned, whereas foreign keys cannot.
The disadvantage is:
- You must create the indices by performing the lookup manually. This can be onerous if you are doing many appends.
A foreign key requires the target be a keyed table but a link column does not. Here we show how to do the manual lookup for a single column.
q)tk:([] id:101 102 103; s:`a`b`c; v:66.6 77.7 88.8)
q)t:([]; id:`tk!(exec id from tk)?102 101 103 101; q:1.1 2.2 3.3 4.4)
q)meta t
c | t f a
--| ------
id| j tk
q | f
Note
As of q3.6 both integers and enumerations are 64 bit.
You can access columns in the target table with dot notation just as you can with a foreign key.
q)select value id, id.id, id.s, q from t
id id1 s q
------------
1 102 b 1.1
0 101 a 2.2
2 103 c 3.3
0 101 a 4.4
Here is an example with a compound column link. We have split the creation of the link column into two steps due to space limitations on the page but it could be done in one step. Note that anonymous tables make the lookup a piece of cake.
q)tk:([] fname:`nuba`devi`lorenzo; lname:`cat`cat`dog; age:11 11 4)
q)vax:([]fn:`nuba`lorenzo; ln:`cat`dog; dt:2025.01.01 2025.07.01)
q)update fln:`tk!(`fname`lname#tk)?([]fname:fn; lname:ln), dt from `vax
q)select fln.fname, fln.lname,dt from vaxlnk where dt>2025.01.01
fname lname dt
------------------------
nuba cat 2025.04.01
lorenzo dog 2025.07.01
Here is an example that uses a link column to implement a hierarchical structure in a table. The column pid is a link column that relates a row to its parent row.
q)tree:([] id:0 1 2 3 4; pid:`tree!0N 0 0 1 1; v:100 200 300 400 500)
q)select from tree where pid=0 / find children of root
id pid v
----------
1 0 200
2 0 300
8.7 Working with Tables and Keyed Tables
In this section, we use the galactic travelers tables.
q)t:([] name:`Dent`Beeblebrox`Prefect; iq:98 42 126)
q)kt:([eid:1001 1002 1003] name:`Dent`Beeblebrox`Prefect;
iq:98 42 126)
8.7.1 Appending Records
The fundamental way to append a record to a table is to view the table as a list of records and join in place with ,:. Note that the fields in the record do not need to be in column order.
q)t,:([name:`W; iq:26])
q)t,:`iq`name!(200; `Albert)
q)t
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
W 26
Albert 200
You can also append naked field values with ,: in which case the fields must be in column order.
q)t,:(`H; 142)
q)t,:(97;`J)
'type
8.7.2 First and Last Records
Because a table is logically a list of records, the functions first and last retrieve the initial and final records, respectively.
q)first t
name| `Dent
iq | 98
q)last t
name| `H
iq | 142
Because tables and keyed tables are ordered, these functions can be used for aggregation in queries without any need of the SQL clause ORDER BY, provided your table was created in order.
You can retrieve the first or last n records of a table or keyed table using the take operator #. Why does this work? Tables are lists and keyed tables are dictionaries and # works on both. Since # always returns a list and the extracted records conform, q recognizes the result of # as a table or keyed table with the same schema as the input.
q)2#t
name iq
-------------
Dent 98
Beeblebrox 42
q)-3#kt
eid | name iq
----| --------------
1001| Dent 98
1002| Beeblebrox 42
1003| Prefect 126
q).Q.s1 0#t
"+`name`iq!(`symbol$();`long$())"
Also see 9.3.2.3 for another way to achieve this result using select[n].
8.7.3 Find
The Find operator ? used with a table returns the index of a record - i.e., its row number.
q)t?([name:`Dent; iq:98])
0
As usual, the record can be abbreviated to a row list provided the fields have exactly the right type and order.
q)t?(`Dent;98)
0
Since Find is atomic in the right operand, you can determine multiple row indices.
q)t?((`Dent;98);(`Prefect;126))
0 2
Since a keyed table is a dictionary, Find performs a reverse lookup of a value record/row and returns the first associated key record.
q)kt?([name:`Dent; iq:98])
eid| 1001
q)kt?(`Dent;98)
eid| 1001
As with key lookup, a single column expects enlisted values, or you can use the anonymous table construct.
q)t1:([] eid:1001 1002 1003)
q)t1?enlist each 1001 1002
0 1
q)t1?([] eid:1001 1002)
0 1
8.7.4 Union with ,
The join operator , is defined for tables and keyed tables since they both comprise lists of records. It is essentially the same as UNION in SQL.
You can use , to append a record to a copy of a table, but no type checking is performed.
q)t,([name:`Slaartibartfast; iq:110])
name iq
-------------------
Dent 98
Beeblebrox 42
Prefect 126
Slaartibartfast 110
Tip
Using a row with , will not yield a table. Instead you get a general list.
q)t,(`Slaartibartfast; 110)
`name`iq!(`Dent;98)
`name`iq!(`Beeblebrox;42)
`name`iq!(`Prefect;126)
`Slaartibartfast
110
Tables having exactly the same meta result can be joined. Since a table is a list of records, the result is obtained by appending the records of the right operand to those of the left.
q)t,([] name:1#`W; iq:1#26)
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
W 26
q)t,t
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
Dent 98
Beeblebrox 42
Prefect 126
In contrast to earlier versions of q, two tables with the same columns in different order can be joined.
q)t,([] iq:1#42; name:`W)
name iq
--------------
Dent 98
Beeblebrox 42
Prefect 126
W 42
Two keyed tables with the same meta can be joined with ,. Because a keyed table is a dictionary whose keys and values are record lists, the operation has upsert semantics. Keys in the right operand that are not in the left operand are treated as append (i.e., insert), whereas the right operand acts as an update on common key values.
q)kt,([eid:1003 1004] name:`Prefect`T; iq:150 26)
eid | name iq
----| --------------
1001| Dent 98
1002| Beeblebrox 42
1003| Prefect 150
1004| T 26
8.7.5 Coalesce (^)
Coalesce (^) can be used to merge two keyed tables having the same columns. Its behavior derives from its behavior on dictionaries. For a common key value and a common column, the value of the column in the right keyed table prevails over that of column in the left keyed table, except where the right column is null, in which case the left column value survives. On non-common keys the individual values carry through.
Tip
The behavior of ^ is the same as , when there are no nulls in a column in the right table.
k| v
-| ---
a| 100
b| 200
c| 30
q)([k:`a`b`c`x] v:10 0N 30 40)^([k:`a`b`c`y]; v:100 200 0N 0N)
k| v
-| ---
a| 100
b| 200
c| 30
x|
The performance of (^) is slower than that of , since fields of the right operand must be checked for null.
8.7.6 Column Join
Two tables with the same number of records can be joined sideways with join-each ,' to create a column join in which the columns are aligned in parallel
q)([] c1:`a`b`c),'([] c2:100 200 300)
c1 c2
------
a 100
b 200
c 300
When the column lists of the tables are not disjoint, the operation on the common columns has upsert semantics because each record is a dictionary.
q)([] c1:`a`b`c; c2:1 2 3),'([] c2:100 200 300)
c1 c2
------
a 100
b 200
c 300
A sideways join on keyed tables requires that the key records conform, meaning that the key columns must have identical meta. The columns from the right operand are aligned along common keys and appended elsewhere.
q)([k:1 2 3] v1:10 20 30),'([k:3 4 5] v2:1000 2000 3000)
k| v1 v2
-| -------
1| 10
2| 20
3| 30 1000
4| 2000
5| 300
8.8 Complex Column Data
8.8.1 Simple Example
There is no restriction on the column lists of a table other than they are rectangular. In practice, simple lists are preferable because they are faster, more storage efficient and are easier to process. There are situations in which it may be convenient to use nested column lists, although far less often than most q programmers seem to think. In general, it is more efficient to flatten columns and then use joins or grouping to create structure on the way out. But for those who just must have nested column structure, we provide examples here.
Suppose we want to keep track of a pair of daily observations, say a low temperature and a high temperature for Hawaii in Fahrenheit. The normal form stores the low and high values in separate columns with flat lists for rows and columns.
q)tf:([] d:2025.01.01 2025.01.02; l:67.9 72.8; h:82.1 88.4)
q)tf 0
d| 2025.01.01
l| 67.9
h| 82.1
q)tf `l
67.9 72.8
Alternatively, we can store pairs in a single column, resulting in nested lists for records and columns.
q)tp:([] d:2025.01.01 2025.01.02; lh:(67.9 82.10; 72.8 88.4))
q)tp 0
d | 2025.01.01
lh| 67.9 82.1
q)tp `lh
67.9 82.1
72.8 88.4
This example can easily be generalized to the situation of n-tuples. For example, we could store the daily values of a yield curve. Further, the fields can have different length or even different type, although the latter is discouraged as it will be slower in kdb.
8.8.2 Operations on Compound Column Data
The case of a nested column in which all items are simple lists of the same type is handled specially in kdb. We call this a compound column. There is no requirement that the simple lists all have the same length.
As an example, say we want to analyze the weekly gross revenues for movies and we don't care about the titles (we don't have room to display them here). Since there will be a different number of movies in release each week, the number of observations in each field will vary. An oversimplified time series that fits within the margins looks like the following in which the gross revenues are in millions.
tm:([] wk:2025.01.01 2025.01.08;
rv:(38.92 67.34; 16.99 5.14 128.23 31.69))
q)tm
wk rv
----------------------------------
2025.01.01 38.92 67.34
2025.01.08 16.99 5.14 128.23 31.69
Storing complex values in a single column in a table enables sophisticated operations to be performed in a single expression. The rule for complex data is:
Nested columns mean iterators. Lots of iterators.
Using our movie data, we can produce the sorted gross, the average and high gross for each week in one expression.
q)select wk, srt:desc each rv, avgr:avg each rv,
hi:max each rv from tm
wk srt avgr hi
-------------------------------------------------
2025.01.01 67.34 38.92 53.13 67.34
2025.01.08 128.23 31.69 16.99 5.14 45.5125 128.23
While sorts and aggregates such as MAX and AVG are standard SQL, think of how you'd produce the sorted sub list and the aggregates together. In your favorite traditional programming environment, you'll soon discover that you need a sordid list of rows and a loop to unravel.
Now let's compute the drops between the ranked revenue numbers within each week. Because the sorted detail items are rows in SQL, this requires a loop. In q,
q)select wk, drp:neg 1_'deltas each desc each rv from tm
wk drp
---------------------------
2025.01.01 ,28.42
2025.01.08 96.54 14.7 11.85
8.8.3 Compound Foreign Key
A nested column is how to make a foreign key on a compound primary key. We recast the galactic travelers to make a keyed table with common key of last and first name.
q)ktc:([lname: `Dent`Beeblebrox`Prefect;
fname:`Arthur`Zaphod`Ford]; iq:98 42 126)
We create a details table with a foreign key over ktc by placing the names in the foreign key column. Remember to copy/paste the individual lines together into the q console.
q)tdetails:([] name:`ktc$(`Beeblebrox`Zaphod;
`Prefect`Ford;`Beeblebrox`Zaphod);
sc:36 126 42)
The columns of ktc are available as virtual columns from tdetails using dot notation on the foreign key.
q)select name.lname, name.iq, sc from tdetails
lname iq sc
------------------
Beeblebrox 42 36
Prefect 126 126
Beeblebrox 42 42
When defining the schema of a table with a compound key column, specify the foreign key as the cast of an empty list.
q)meta ([] name:`ktc$(); sc:`long$())
c | t f a
----| -------
name| j ktc
sc | j
When the foreign key comprises multiple types you also cast the general list with the foreign table in its schema.
q)ktc:([k1:1 2 3; k2:2001.01.01 2001.01.02 2001.01.03]
v: 1. 2. 3.)
q)tfc:([] fk:`ktc$(); s:`symbol$())
q)tfc,:(`ktc$(1; 2001.01.01); `a)
q)tfc,:(`ktc$(1; 2025.01.01); `a)
'cast
The failure of the cast in the second append is correct; it is enforced referential integrity.
8.9 Attributes
Attributes are metadata that you attach to lists of special forms. They are used on a dictionary domain or a table column to speed retrieval for some operations. The q interpreter can make certain optimizations based on the structure of the list implied by the attribute.
Attributes (other than `g#) are descriptive rather than prescriptive. By this we mean that by applying an attribute you are asserting that the list has a special form, which q will check. It does not instruct q to (re)make the list into the special form; that is your job. A list operation that respects the form specified by the attribute leaves the attribute intact (other than `p#), while an operation that breaks the form results in the attribute being removed in the result.
The syntax for applying an attribute is (yet) another overload of #, whose left operand is a symbol specifying the attribute and whose right operand is the target list.
Note
The attribute `s# is applied to the target list in place but not for the others.
KX says not to expect significant benefit from an attribute for less than a million items; you should test thoroughly with your own data. This is why attributes are not automatically applied in mundane situations. You should test your use case to see whether applying an attribute provides performance benefit. Do not just apply attributes blindly, as they consume resources.
8.9.1 Sorted (`s#)
Applying the sorted attribute (`s#) to a simple list indicates that the items of the list are sorted in ascending order; there is no way to indicate a descending sort. When a list has the sorted attribute, linear search is replaced with binary search, which makes certain operations faster - for example find ?, equality =, in and within.
When an attribute is successfully applied to a list, it becomes part of the list and is displayed on the q console. Observe that q checks to see that the list is actually sorted.
q)`s#1 2 4 8
`s#1 2 4 8
q)`s#2 1 3 4
's-fail
The sort function asc automatically applies the sorted attribute to its result but til does not.
q)asc 2 1 8 4
`s#1 2 4 8
q)til 5
0 1 2 3 4
When a list with an attribute is amended with ,: the result is checked to see that the attribute is preserved; if not, it is removed.
q)L:`s#1 2 3 4 5
q)L,:6
q)L
`s#1 2 3 4 5 6
q)L,:0
q)L
1 2 3 4 5 6 0
One place to apply the sorted attribute is on the date or time column of a simple time series.
q)t:([] ti:`s#00:00:00 00:00:01 00:00:03; v:98 98 100.)
q)meta t
c | t f a
--| -----
ti| v s
v | f
Applying the sorted attribute to a table somewhat unintuitively applies the parted attribute (see next section) to the first column.
q)meta `s#([] ti:00:00:00 00:00:01 00:00:03; v:98 98 100.)
c | t f a
--| -----
ti| v p
v | f
Applying the sorted attribute to a dictionary makes the dictionary into a step function where lookup can be much faster because binary search is used. In this case `s# is applied to the keys but also to the dictionary as whole. In the following examples, we use the utility .Q.s1 to make things explicit in the display. In the first case we apply `s# only to the keys and we see that lookup does not function as a step function.
q).Q.s1(`s#1 3 5)!`a`b`c
"`s#1 3 5!`a`b`c"
q)((`s#1 3 5)!`a`b`c)2
`
In the next case we apply `s# to the dictionary. We see that it is actually applied to both the dictionary and the key list and that lookup does function as a step function.
q).Q.s1(`s#1 3 5!`a`b`c)
"`s#`s#1 3 5!`a`b`c"
q)(`s#1 3 5!`a`b`c)2
`a
Since a keyed table is a dictionary of tables, applying the sorted attribute to a keyed table applies to the table of keys that, in turn, applies to the initial key column.
q)meta `s#([k:1 2 3 4] v:`d`c`b`a)
c| t f a
-| -----
k| j s
v| s
q).Q.s1 `s#([k:1 2 3 4] v:`d`c`b`a)
"`s#(`s#+(,`k)!,`s#1 2 3 4)!+(,`v)!,`d`c`b`a"
8.9.2 Unique (`u#)
Applying the unique attribute (`u#) to a list indicates that the items of the list are to a couple. Knowing that the elements of a list are unique makes (distinct) the identity function and shortens some operations - i.e., if an operation has found one of what you're looking for, it's done.
Note
Applying (`u#) to a column essentially causes q to create a hash table, which uses storage and adds overhead. It is best to apply the attribute after the list has been created, if possible.
Observe that uniqueness is checked. Don't take the error message personally.
q)`u#2 1 4 8
`u#2 1 4 8
q)`u#2 1 4 8 2
'u-fail
An amend that does not preserve uniqueness causes the attribute to be lost.
q)L:`u#2 1 4 8
q)show L,:3
`u#2 1 4 8 3
q)show L,:2
2 1 4 8 3 2
The unique attribute can be applied to the domain of a dictionary, a column of a table, or the key column of a keyed table. It cannot be applied to a dictionary, a table or a keyed table directly.
8.9.3 Parted (`p#)
The parted attribute (`p#) indicates that all common occurrences of a value in a list are adjacent. Viewing the list as a map, the parted attribute says that its graph is a step function with distinct steps. The parted attribute can be applied to a simple list of any type whose underlying value is integral - e.g., the integer types, dates, times, timestamps, timespans. You can also apply parted on a list of enumerated symbols since the underlying index values are integral.
Note
The parted attribute causes q to create an underlying structure that keeps track of the steps. Because the attribute is not preserved under any operation (even if it can/should be), you should apply it only after the list is fully created.
Here is a simple example of a parted list. Notice that the items are neither sorted nor unique.
q)`p#2 2 2 1 1 4 4 4 4 3 3
`p#2 2 2 1 1 4 4 4 4 3 3
Notice that that applying parted in error does not say `p-fail; rather it generates a 'u-fail error.
q)`p#2 2 2 1 1 4 4 4 4 3 3 2
'u-fail
[0] `p#2 2 2 1 1 4 4 4 4 3 3 2
^
The parted attribute is not preserved under any operation on the list, even if the operation preserves the property.
q)L:`p#1 1 2 3 3
q)show L,:3
1 1 2 3 3 3
Note
Historical time series databases for ticker symbols are usually sorted by time within symbol with the parted attribute applied to the (enumerated) symbol column. This makes queries by ticker fast and guarantees that results for a given symbol are returned in time order.
8.9.4 Grouped (`g#)
The grouped attribute (`g#) differs from other attributes in that it can be applied to any list. It causes q to create and maintain an index – essentially a hash table. Grouped can be applied to a list when no other assumptions about its structure can be made.
For example,
q)`g#1 2 3 2 3 4 3 4 5 2 3 4 5 4 3 5 6
`g#1 2 3 2 3 4 3 4 5 2 3 4 5 4 3 5 6
q)show L:`g#100?100
`g#70 81 2 59 48 91 48 47 32 49 18 74 27 37 25 66 43 36 80 20 ..
q)L,:1 1 1 1
q)show -5#L
25 1 1 1 1
The grouped attribute is maintained as operations are performed on the list, which can cause significant processing overhead in addition to the storage required. Best to apply it after the entire list has been created or updated, if possible.
Applying the grouped attribute to a table column roughly corresponds to placing an index on a column in an RDBMS. As of q3.2, the maximum number of grouped attributes that can be placed on a single table is unlimited.