A Brief Introduction to q and KDB-X

Welcome! This page introduces the basics of q and KDB‑X. Through simple, workable examples, you’ll learn how to create data, run queries, and understand the core principles of q’s concise syntax and high‑performance design. No prior knowledge is required. You’ll pick up the essentials as you work through the exercises.

KDB-X is a powerful ecosystem built on top of q. The q language is a concise, expressive, Turing-complete, interpreted programming language with a built-in database engine optimized for streaming, real-time, and historical data.

Note: If you don't have KDB-X installed yet, follow the quick KDB-X install guide.

Launch q

In your terminal, type q to start an interactive session. When the q) prompt appears, the interpreter is ready.

$ q
KDB-X 5.0.20251113 2025.11.13 Copyright (C) 1993-2025 Kx Systems
...

q)

Standard constructs

Like most languages, q allows you to create scalars, lists, and dictionaries, assign them to variables (using the colon :), define functions, use execution controls (like if-then-else), and call built-in operators and functions. Below are some common q commands and their Python equivalents:

q)n:8                  / Assign an integer
q)n
8
q)b:0b                 / A boolean (0b for false, 1b for true)

q)show l:reverse til 5 / Create a list 0-4 and reverse it
4 3 2 1 0
q)(8; 3.14; ("Alice"; "Bob"; "Mike")) / A mixed list
8
3.14
("Alice";"Bob";"Mike")
/ Assign to multiple values
q)(n; pi; friends): (8; 3.14; ("Alice"; "Bob"; "Mike")) / pattern matching (aka unpacking)

q)show contacts:([Alice: "555-0101"; Bob: "555-0723"; Mike: "555-6666"])
Alice| "555-0101"
Bob  | "555-0723"
Mike | "555-6666"
q)callRandomFriend:{f: rand key contacts; show "Calling ", string[f], " at ", contacts f}
q)callRandomFriend[]
"Calling Bob at 555-0723"
q)f:{[r] 1 % r*r}               / Division is denoted by %
q)if[n<14; show "I'm a child!"] / if statement
"I'm a child!"

Python equivalent:

>>> n = 8
>>> n
8
>>> b = False

>>> l = list(reversed(range(5)))
>>> l
[4, 3, 2, 1, 0]
>>> [8, 3.14, ["Alice", "Bob", "Mike"]]
[8, 3.14, ['Alice', 'Bob', 'Mike']]
>>> n, pi, friends = [8, 3.14, ["Alice", "Bob", "Mike"]]

>>> contacts = {"Alice": "555-0101", "Bob": "555-0723", "Mike": "555-6666"}
>>> contacts
{'Alice': '555-0101', 'Bob': '555-0723', 'Mike': '555-6666'}
>>> import random
>>> def callRandomFriend():
...     key, value = random.choice(list(contacts.items()))
...     print(f"Calling {key} at {value}")
...
>>> callRandomFriend()
Calling Mike at 555-6666
>>> def f(r):
...     return 1 / (r * r)
...
>>> if n < 14:
...     print("I'm a child!")
...
I'm a child!

Use exit 0, \\ or Ctrl-D (i.e. EOF) to exit a q session.

You can put your q commands into a text file and run it:

$ q myscript.q

or load it into your q session:

q)\l myscript.q

The beauty of q

In this section, we collect some special features of q.

Minimalist syntax (no noise)

The q language descends from APL (A Programming Language), which was built around mathematical notation. Lists, dictionaries, and functions are unified concepts in q: they are all mappings. Because of this, you use the same square-bracket notation to access them.

q)l[2]              / Indexing a list
2
q)contacts[`Alice]  / Looking up a dictionary by a key
"555-0101"
q)f[5]              / Applying a function
0.04

Python equivalent:

>>> l[2]
2
>>> contacts["Alice"]
'555-0101'
>>> f(5)
0.04

This is polymorphism at its most fundamental level. To further reduce "noise", q allows you to omit brackets and use whitespace to separate list items.

q)l 2    / Equivalent to l[2]
2
q)4 1 7  / A list of integers (no commas or parentheses needed)
4 1 7

Reducing boilerplate code is a basic principle in q.
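
As a rough Python analogy of the "everything is a mapping" idea, the sketch below uses a hypothetical helper named at (it is not a q or Python built-in) that applies a list, a dictionary, or a function to an argument in the same way. The values mirror the earlier examples; this only illustrates the concept and says nothing about how q implements it.

```python
def at(mapping, x):
    """Apply a list, dict, or function to x, loosely mimicking q's m[x] and m x."""
    if callable(mapping):
        return mapping(x)                       # function application
    if isinstance(x, (list, tuple)):
        return [at(mapping, i) for i in x]      # indexing by a list of indices or keys
    return mapping[x]                           # list index or dictionary key

l = [4, 3, 2, 1, 0]
contacts = {"Alice": "555-0101", "Bob": "555-0723", "Mike": "555-6666"}
f = lambda r: 1 / (r * r)

print(at(l, 2))                 # 2
print(at(contacts, "Alice"))    # 555-0101
print(at(f, 5))                 # 0.04
print(at(l, [3, 0]))            # [1, 4]
```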

Right-to-left evaluation

Unlike most languages, q has no operator precedence. Expressions are evaluated strictly from right to left.

q)2*1+3 / 1+3 is 4, then 2*4
8
q)3+2>1 / True is converted to 1
4

You can use parentheses to override this order, but to keep the code clean, q developers often simply rearrange the expression:

q)3+2*1     / Instead of (2*1)+3
5
q)1<3+2     / Instead of (3+2)>1
1b

This encourages linear thinking: you chain operations together, much like a Linux pipe, processing data from right to left.
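
To make the rule concrete, here is a minimal Python sketch of a right-to-left evaluator for flat expressions. The function name eval_rtl and the small operator table are assumptions made for illustration; the sketch handles only non-negative numbers and a handful of binary operators, and it is not how the q interpreter works.

```python
import operator
import re

# Binary operators of the toy language; q uses % for division.
OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "%": operator.truediv, "<": operator.lt, ">": operator.gt,
}

def to_number(token):
    return float(token) if "." in token else int(token)

def eval_rtl(expr):
    """Evaluate non-negative numbers joined by binary operators, right to left."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[-+*%<>]", expr)
    result = to_number(tokens[-1])
    # Step leftwards one (operand, operator) pair at a time:
    # each operator receives everything already evaluated on its right.
    for i in range(len(tokens) - 2, 0, -2):
        result = OPS[tokens[i]](to_number(tokens[i - 1]), result)
    return result

print(eval_rtl("2*1+3"))   # 8
print(eval_rtl("3+2>1"))   # 4  (True counts as 1)
print(eval_rtl("3+2*1"))   # 5
print(eval_rtl("1<3+2"))   # True
```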

Vector operations

q is a vector programming language. Most operators work on entire lists automatically without the need for explicit loops (like for or list comprehension in Python).

q)show l:reverse til 5
4 3 2 1 0
q)2*l               / Scalar multiplication across a list
8 6 4 2 0
q)1 2 3 + 10 20 30  / Adding two lists element-wise
11 22 33
q)l 3 0             / Indexing by a list
1 4

Python equivalent:

>>> l = list(reversed(range(5)))
>>> l
[4, 3, 2, 1, 0]
>>> [2 * x for x in l]
[8, 6, 4, 2, 0]
>>> [x + y for x, y in zip([1, 2, 3], [10, 20, 30])]
[11, 22, 33]
>>> [l[i] for i in [3, 0]]
[1, 4]

Functional programming

The q language also treats functions as first-class citizens. Higher-order functions (called Iterators in q) make complex data manipulation extremely concise.

q)count each (1 2; 5 4 3; til 20)   / Apply 'count' to each sub-list
2 3 20
q)(+) scan 1 2 3                    / Running sum
1 3 6

Python equivalent:

>>> list(map(len, [[1, 2], [5, 4, 3], range(20)]))
[2, 3, 20]
>>> from itertools import accumulate
>>> list(accumulate([1, 2, 3]))
[1, 3, 6]

Interned strings: symbols

Symbols are atomic entities preceded by a backtick (for example, `AAPL). Internally, q stores these as integers in a lookup table (a process called interning). This makes comparing two symbols, such as checking whether a ticker in a billion-row table matches `AAPL, incredibly fast, because the computer only has to compare two integers rather than every letter in a word.

q)friends:`Alice`Bob`Mike   / List of symbols
q)friends?`Mike             / Reverse lookup: find the index of Mike
2
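
The interning idea can be sketched in Python as follows. The SymbolTable class is hypothetical and only illustrates the concept of replacing repeated strings with small integers; q's actual internal representation may differ.

```python
class SymbolTable:
    """Toy intern table: each distinct string gets one small integer id."""

    def __init__(self):
        self._ids = {}      # string -> id
        self._names = []    # id -> string

    def intern(self, name):
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def name(self, sym_id):
        return self._names[sym_id]

table = SymbolTable()
tickers = [table.intern(s) for s in ["AAPL", "MSFT", "AAPL", "TSLA", "AAPL"]]
aapl = table.intern("AAPL")

# Matching rows now needs only integer comparisons.
matches = [i for i, t in enumerate(tickers) if t == aapl]
print(tickers)   # [0, 1, 0, 2, 0]
print(matches)   # [0, 2, 4]
```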

Extreme terseness

The price of q's brevity is heavy overloading. q developers value minimal keystrokes, so many symbols carry several meanings. For example, the ? symbol can perform ten different operations depending on its arguments. In the previous section, we saw that it can denote reverse lookup; below we show three other uses (called roll, deal and permute) related to random number generation:

q)rand 10
9
q)4?10          / Four random integers
4 5 4 2
q)show l:-4?10  / Four random integers without duplicates
6 0 8 5
q)0N?l          / Permutation
8 6 0 5

Python equivalent:

>>> import random
>>> random.randint(0,9)
9
>>> [random.randint(0, 9) for _ in range(4)]
[6, 0, 8, 5]
>>> random.sample(range(0, 10), 4)
[8, 6, 0, 5]

Tables

Tables are treated as first-class citizens in q, which means they are a primary data type just like integers or lists. You can think of a table from two different perspectives:

  1. A list of rows: where each row is a dictionary.
  2. A list of columns: where each column is a named list of values.

While you can interact with a table as a list of rows, q stores them internally as a list of columns. This columnar structure is the secret to q's performance advantage in data analysis.

Creating tables

A list of dictionaries that share the same keys is represented as a table:

q)(([name: `Alice; phone: "555-0101"; age: 23]); ([name: `Bob; phone: "555-0723"; age: 32]); ([name: `Mike; phone: "555-6666"; age: 22]))
name  phone      age
--------------------
Alice "555-0101" 23
Bob   "555-0723" 32
Mike  "555-6666" 22

You can create a table by defining its columns directly. The syntax ([] ...) denotes an unkeyed table:

q)show t:([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
name  phone      age
--------------------
Alice "555-0101" 23
Bob   "555-0723" 32
Mike  "555-6666" 22

Because tables are integrated into the language, you can manipulate them with standard list and dictionary syntax:

q)t 1           / Get the second row
name | `Bob
phone| "555-0723"
age  | 32
q)avg t`age     / Get the average of the age column
25.66667
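
The two perspectives on a table can be illustrated with a small Python sketch. It assumes a table is modelled as a dictionary of equal-length lists; the helper names row and rows are hypothetical. The storage is columnar, and the row view is derived on demand.

```python
# Column-oriented storage: one list per column, all of equal length.
t = {
    "name":  ["Alice", "Bob", "Mike"],
    "phone": ["555-0101", "555-0723", "555-6666"],
    "age":   [23, 32, 22],
}

def row(table, i):
    """Row view: build a dictionary for row i from the column lists."""
    return {col: values[i] for col, values in table.items()}

def rows(table):
    """The same table seen as a list of row dictionaries."""
    n = len(next(iter(table.values())))
    return [row(table, i) for i in range(n)]

print(row(t, 1))   # {'name': 'Bob', 'phone': '555-0723', 'age': 32}

# A column aggregate touches a single list and ignores the other columns.
avg_age = sum(t["age"]) / len(t["age"])
print(round(avg_age, 5))   # 25.66667
```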

q-sql

While q is a functional language, it includes a built-in query language called q-sql. It looks similar to SQL but is more expressive and follows q's right-to-left evaluation rules.

To demonstrate, we will use synthetic capital markets data generated by the KDB-X datagen module:

q)([getInMemoryTables]): use `kx.datagen.capmkts    / Load the module
q)(trade; quote; nbbo; master; exnames): getInMemoryTables[]
q)trade
sym  time                 price size stop cond ex
-------------------------------------------------
SOFI 0D09:30:01.180477706 214   36   0    K
AMZN 0D09:30:01.490170061 92.11 90   1    T    A
SNAP 0D09:30:02.534750053 9     74   0    T
SNAP 0D09:30:05.617603533 9     84   0    L
TSLA 0D09:30:06.389750220 62.97 62   0    Z
PEP  0D09:30:08.910057414 22    23   0    U    Y
..
q)count quote / number of rows
13497

Simple queries

In q-sql, you don't need SELECT *. If you don't specify columns, q assumes you want all of them.

q)select from trade where size>90
sym  time                 price size stop cond ex
-------------------------------------------------
TXN  0D09:30:18.828937844 18.02 99   0    9
GOOG 0D09:30:22.425490937 72.02 92   0    P    M
T    0D09:30:40.218699347 18.01 97   0
XPEV 0D09:33:31.365513849 6.01  99   0
T    0D09:33:37.277742547 18.03 93   0    X
XPEV 0D09:35:00.264738568 6.01  92   0    9
SBUX 0D09:36:32.798154308 5.03  98   0    M
HPQ  0D09:36:37.699847666 36.17 98   0    I    N
..

For anyone coming from a traditional database background, KDB-X also provides a standard SQL interface:

q).s.init[]         / initialize SQL interface
q)s)SELECT * FROM trade WHERE size > 90     / use 's)' prefix for SQL
sym  time                 price size stop cond ex
-------------------------------------------------
TXN  0D09:30:18.828937844 18.02 99   0    9
GOOG 0D09:30:22.425490937 72.02 92   0    P    M
T    0D09:30:40.218699347 18.01 97   0
XPEV 0D09:33:31.365513849 6.01  99   0
T    0D09:33:37.277742547 18.03 93   0    X
XPEV 0D09:35:00.264738568 6.01  92   0    9
SBUX 0D09:36:32.798154308 5.03  98   0    M
HPQ  0D09:36:37.699847666 36.17 98   0    I    N
..

The real power of q-sql appears when you combine it with q's vector capabilities. For example, you can calculate total volume by exchange:

q)select sum size by ex from trade
ex| size
--| -----
  | 21579
A | 2512
B | 2191
C | 2482
D | 3227
I | 2811
J | 2368
K | 3097
..

Because q handles dictionaries and vectors natively, you can perform joins inline without complex syntax. In this example, the exnames dictionary maps exchange IDs to their full names directly:

q)exnames `A`B  / Indexing a dictionary by a list
"NYSE American"
"NASDAQ OMX BX"
q)select sum size by exnames ex from trade
ex                              | size
--------------------------------| -----
""                              | 21579
"Cboe BYX Exchange"             | 1796
"Cboe BZX Exchange"             | 2320
"Cboe EDGA Exchange"            | 2368
"Cboe EDGX Exchange"            | 3097
"Chicago Broad Options Exchange"| 2551
"Chicago Stock Exchange"        | 2203
"Consolidated Tape System"      | 2631
..

This demonstrates q’s "zero noise" principle. In SQL, this would require a formal JOIN statement; in q, it is a simple dictionary lookup applied across a vector. This advantage is more pronounced as queries get more complex, as the q implementation remains maintainable and readable.
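
For comparison, the same "lookup instead of join" idea can be sketched in plain Python. The sample data and the helper sum_by below are hypothetical and are not produced by the datagen module; the point is only that the grouping key is obtained by applying a dictionary across a list.

```python
from collections import defaultdict

# Hypothetical sample columns, not taken from the datagen module.
ex = ["A", "B", "A", "", "B"]
size = [10, 20, 30, 40, 50]
exnames = {"A": "NYSE American", "B": "NASDAQ OMX BX"}

def sum_by(keys, values):
    """Sum values per key and return the totals ordered by key."""
    totals = defaultdict(int)
    for k, v in zip(keys, values):
        totals[k] += v
    return dict(sorted(totals.items()))

by_code = sum_by(ex, size)
# The "join" is a dictionary lookup applied to every element of ex.
by_name = sum_by([exnames.get(e, "") for e in ex], size)

print(by_code)   # {'': 40, 'A': 40, 'B': 70}
print(by_name)   # {'': 40, 'NASDAQ OMX BX': 70, 'NYSE American': 40}
```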

Time-series support

q was built for time-series data. It treats temporal types (times, dates, timestamps, timespans) as first-class types. You can cast data types on the fly, for instance, using `minute$time to group data by the minute:

/ Average mid-price for TSLA between 1 PM and 2 PM, grouped by minute
q)select avgMid: avg (bid + ask)%2 by `minute$time from quote where sym=`TSLA, time within 13:00 14:00
time | avgMid
-----| --------
13:00| 64.4125
13:03| 64.66
13:04| 64.4875
13:07| 64.3425
13:08| 64.64833
13:09| 64.32
13:10| 64.5525
13:12| 64.695
..
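
The grouping step can be sketched in Python as follows. The quote values are hypothetical, and minute_bucket is an illustrative helper that truncates a time to the whole minute, which is roughly what the cast does for grouping purposes.

```python
from collections import defaultdict
from datetime import timedelta

# Hypothetical quotes: (time since midnight, bid, ask)
quotes = [
    (timedelta(hours=13, minutes=0, seconds=5),  64.0,  65.0),
    (timedelta(hours=13, minutes=0, seconds=40), 64.25, 64.75),
    (timedelta(hours=13, minutes=3, seconds=12), 64.5,  65.0),
    (timedelta(hours=15, minutes=1),             70.0,  71.0),   # outside the window
]

def minute_bucket(t):
    """Truncate a time to the start of its minute."""
    return t - (t % timedelta(minutes=1))

start, end = timedelta(hours=13), timedelta(hours=14)
mids = defaultdict(list)
for t, bid, ask in quotes:
    if start <= t <= end:                       # inclusive at both ends
        mids[minute_bucket(t)].append((bid + ask) / 2)

avg_mid = {str(k): sum(v) / len(v) for k, v in mids.items()}
print(avg_mid)   # {'13:00:00': 64.5, '13:03:00': 64.75}
```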

Joins

q supports standard relational joins like left join (lj) and inner join (ij), but is most famous for its specialized temporal joins.

To join metadata (like company descriptions) from the master table to the trade table based on the common field sym:

q)trade lj `sym xkey master
sym  time                 price size stop cond ex description                 issueprice
----------------------------------------------------------------------------------------
SOFI 0D09:30:01.180477706 214   36   0    K       SoFi Technologies, Inc.     214
AMZN 0D09:30:01.490170061 92.11 90   1    T    A  Amazon.com, Inc.            92
SNAP 0D09:30:02.534750053 9     74   0    T       Snap Inc.                   9
SNAP 0D09:30:05.617603533 9     84   0    L       Snap Inc.                   9
TSLA 0D09:30:06.389750220 62.97 62   0    Z       Tesla, Inc.                 63
..

You can run queries on the joined table:

q)select open: first price, close: last price by description from trade lj `sym xkey master
description                | open  close
---------------------------| -----------
ADVANCED MICRO DEVICES     | 33.05 34.62
AMERICAN INTL GROUP INC    | 27.03 28.9
APPLE INC COM STK          | 84.1  86.92
AT&T Inc.                  | 18.01 19.06
..

In financial data, trades and quotes rarely happen at exactly the same time (q has nanosecond precision). An as-of join (aj) aligns two tables by finding the "prevailing" value. For every trade, aj finds the most recent quote that occurred at or before that trade's time:

/ Matches each trade with the symbol's quote valid at that moment
q)aj[`sym`time; trade; quote]
sym  time                 price size stop cond ex bid    ask    bsize asize mode
--------------------------------------------------------------------------------
SOFI 0D09:30:01.180477706 214   36   0    K       213.37 214.45 13    39    Q
AMZN 0D09:30:01.490170061 92.11 90   1    T    A  91.56  92.14  17    32    E
SNAP 0D09:30:02.534750053 9     74   0    T       8.44   9.04   18    91    M
SNAP 0D09:30:05.617603533 9     84   0    L       8.17   9.66   80    68    4
..
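
The core of an as-of lookup is a binary search over sorted times. The Python sketch below shows the idea on hypothetical data; the helper asof and the per-symbol layout are assumptions for illustration and do not describe how aj is implemented.

```python
from bisect import bisect_right

# Hypothetical quotes per symbol: parallel lists of times (seconds, ascending) and bids.
quotes = {
    "SOFI": ([34200.0, 34201.0, 34203.5], [213.0, 213.4, 213.1]),
    "SNAP": ([34202.0], [8.4]),
}
trades = [("SOFI", 34200.5, 214.0), ("SNAP", 34201.0, 9.0), ("SOFI", 34203.5, 213.9)]

def asof(times, values, t):
    """Return the value with the latest time <= t, or None if no such time exists."""
    i = bisect_right(times, t)
    return values[i - 1] if i else None

joined = []
for sym, t, price in trades:
    times, bids = quotes.get(sym, ([], []))
    joined.append((sym, t, price, asof(times, bids, t)))

for record in joined:
    print(record)
```

The SNAP trade gets None because its only quote arrives after the trade; the second SOFI trade matches the quote with exactly the same time.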

A window join is a powerful generalization of the asof join. Instead of taking just the last value, it looks at a window of time around each record and performs an aggregation (like an average or max).

Example: calculate the size-weighted average ask and bid of the quotes in a window starting 1 minute before and ending 5 seconds after each trade:

q)wj[-00:01 00:00:05+\:trade.time; `sym`time; trade; (quote; (wavg;`asize;`ask); (wavg;`bsize;`bid))]
sym  time                 price size stop cond ex ask      bid
-------------------------------------------------------------------
SOFI 0D09:30:01.180477706 214   36   0    K       66.43636 65.54799
AMZN 0D09:30:01.490170061 92.11 90   1    T    A  65.21634 51.99918
SNAP 0D09:30:02.534750053 9     74   0    T       57.21473 52.7337
SNAP 0D09:30:05.617603533 9     84   0    L       50.89472 52.93455
..
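
A windowed, weighted aggregation can be sketched in Python like this. The data and the helper names wavg and window_wavg are hypothetical. Note one assumption: the sketch aggregates only quotes that fall inside the window, which is closer in spirit to the wj1 variant; wj additionally takes the quote prevailing at the start of the window into account.

```python
from bisect import bisect_left, bisect_right

# Hypothetical quotes for one symbol, sorted by time (seconds).
quote_times = [0, 30, 58, 61, 70]
ask = [10.0, 11.0, 12.0, 13.0, 14.0]
asize = [1, 1, 2, 4, 1]

def wavg(weights, values):
    """Weighted average, or None when there is nothing to average."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total if total else None

def window_wavg(t, before=60, after=5):
    """Size-weighted average ask over quotes with t-before <= time <= t+after."""
    lo = bisect_left(quote_times, t - before)
    hi = bisect_right(quote_times, t + after)
    return wavg(asize[lo:hi], ask[lo:hi])

print(window_wavg(60))    # 12.125  (quotes at 0, 30, 58 and 61)
print(window_wavg(200))   # None    (no quotes in the window)
```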

Persistence

This tutorial has worked exclusively with in-memory objects so far. If you close your q session, these objects vanish. To keep your data, use the set function to persist it to disk.

Simple persistence

In q, file paths are represented as symbols prefixed with a colon (e.g. `:kdbdata). You can save any q object — variables, dictionaries, or even functions — directly to a file.

q)contacts:([Alice: "555-0101"; Bob: "555-0723"; Mike: "555-6666"])
q)`:kdbdata/contacts set contacts   / Save a dictionary
q)`:kdbdata/callRandomFriend set {f: rand key contacts; show "Calling ", string[f], " at ", contacts f}
q)t: ([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
q)`:kdbdata/t set t                 / Save a table

These objects are saved in a high-performance binary format. From a new q session, you can bring them back using get:

q)get `:kdbdata/contacts
Alice| "555-0101"
Bob  | "555-0723"
Mike | "555-6666"
q)get `:kdbdata/t
name  phone      age
--------------------
Alice "555-0101" 23
Bob   "555-0723" 32
Mike  "555-6666" 22

If a directory contains multiple kdb+ files, you can load the entire directory at once using the \l command. This automatically assigns the file names as variable names in your session:

q)\l kdbdata    / Load everything in the 'kdbdata' folder
q)contacts      / 'contacts' is now available in the workspace
Alice| "555-0101"
Bob  | "555-0723"
Mike | "555-6666"
q)callRandomFriend[]
"Calling Alice at 555-0101"
q)t
name  phone      age
--------------------
Alice "555-0101" 23
Bob   "555-0723" 32
Mike  "555-6666" 22

Scaling up: splaying and partitioning

While the approach above is fine for small objects, it has an important limitation: it copies the entire file into RAM (with the exception of homogeneous list files). For analysts working with gigabytes or terabytes of data, this isn't feasible.

This is where q’s memory-mapping shines. Instead of loading a file, q "maps" it. It only reads the specific pieces of data from the disk when your query actually asks for them.

For better performance, we "splay" a table — meaning we save each column as its own individual file. This allows q to perform Columnar I/O: if you only want to calculate the average price, q only reads the price file and ignores size, time, and ex.

To handle massive datasets, tables are divided into partitions, typically by date.

This example uses the datagen module to build a multi-day, partitioned database on disk:

q)([getInMemoryTables; buildPersistedDB]): use `kx.datagen.capmkts  / Load the module
q)buildPersistedDB["/tmp/kdbdb"; 10000; ([start: 2026.02.01; end: 2026.02.02])]

If you look at the file system, you should see a clean, hierarchical structure:

$ tree /tmp/kdbdb
/tmp/kdbdb
├── 2026.02.01
│   ├── quote
│   │   ├── asize
│   │   ├── ask
│   │   ├── bid
│   │   ├── bsize
│   │   ├── ex
│   │   ├── mode
│   │   ├── sym
│   │   └── time
│   └── trade
│       ├── cond
│       ├── ex
│       ├── price
│       ├── size
│       ├── stop
│       ├── sym
│       └── time
├── 2026.02.02
│   ├── quote
│   │   ├── asize
│   │   ├── ask
│   │   ├── bid
│   │   ├── bsize
│   │   ├── ex
│   │   ├── mode
│   │   ├── sym
│   │   └── time
│   └── trade
│       ├── cond
│       ├── ex
│       ├── price
│       ├── size
│       ├── stop
│       ├── sym
│       └── time
├── daily
├── exnames
├── master
└── sym

4 directories, 25 files

When you load a partitioned database with \l, q does not "load" the data. It simply maps the directory structure:

q)\l /tmp/kdbdb

You can run q-sql and SQL queries on the mapped kdb+ database:

q)select sum size by 0D00:10 xbar time from trade where date=last date
time                | size
--------------------| ------
0D09:30:00.000000000| 105690
0D09:40:00.000000000| 53574
0D09:50:00.000000000| 48170
0D10:00:00.000000000| 41788
0D10:10:00.000000000| 36279
..

In the above query:

  • q only looks inside the 2026.02.02 folder (ignoring all other days)
  • q only reads the size and time files (ignoring price, ex, etc.)

This technique allows you to analyze datasets that are much larger than your physical memory. You can query a 10TB database on a laptop with 16GB of RAM, provided you perform an aggregation or only ask for a subset of the columns and dates at any one time.
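
The combination of date partitions and one file per column can be imitated in a few lines of Python. This is only a sketch under simplifying assumptions: the helpers save_splayed and read_column are hypothetical, the file format is a raw array of 64-bit integers rather than the kdb+ format, and the column is read into memory instead of being memory-mapped.

```python
import tempfile
from array import array
from pathlib import Path

def save_splayed(root, date, table, columns):
    """Write each column of one date partition to its own binary file."""
    d = Path(root) / date / table
    d.mkdir(parents=True)
    for name, values in columns.items():
        with open(d / name, "wb") as fh:
            array("q", values).tofile(fh)          # 64-bit signed integers

def read_column(root, date, table, name):
    """Read exactly one column file of one partition."""
    path = Path(root) / date / table / name
    col = array("q")
    with open(path, "rb") as fh:
        col.fromfile(fh, path.stat().st_size // col.itemsize)
    return col

with tempfile.TemporaryDirectory() as root:
    save_splayed(root, "2026.02.01", "trade", {"size": [10, 20], "price": [100, 101]})
    save_splayed(root, "2026.02.02", "trade", {"size": [5, 7, 9], "price": [102, 103, 104]})
    # Partition pruning: pick one date directory and ignore the others.
    last_date = max(p.name for p in Path(root).iterdir())
    # Columnar I/O: only the 'size' file of that partition is opened.
    total = sum(read_column(root, last_date, "trade", "size"))

print(last_date, total)   # 2026.02.02 21
```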

Performance

Kdb+ isn't just a database; it is fundamentally a vector processing engine. Its performance comes from its ability to treat data as contiguous blocks of memory, allowing it to leverage modern CPU features and massive parallelization.

Hardware acceleration (SIMD)

At its core, kdb+ is optimized for SIMD (Single Instruction, Multiple Data). This allows the CPU to perform the same operation (like addition or multiplication) on multiple data points with a single instruction. When you add two columns in q, you aren't just looping; you are engaging the hardware's vector lanes.

Parallel processing

Kdb+ can distribute workloads across multiple CPU cores. By starting your q process with the -s flag, you enable secondary threads:

$ q /tmp/kdbdb -s 4     # Enable 4 secondary threads for parallel execution

When you run an aggregation like sum or avg on a long vector, kdb+ automatically splits the vector into chunks, processes them in parallel across your cores, and combines the result (a "map-reduce" pattern). This also applies to partitioned data: kdb+ can scan multiple days of data simultaneously.
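
The split, process, and combine pattern described above can be sketched in Python. The function parallel_sum is a hypothetical illustration of the pattern only; because of CPython's global interpreter lock, this particular example will not show the speed-up that a native engine obtains from real parallel threads.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(values, workers=4):
    """Split values into chunks, sum each chunk in a worker, then combine."""
    size = max(1, -(-len(values) // workers))            # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))           # "map" step
    return sum(partials)                                 # "reduce" step

print(parallel_sum(list(range(1_000_001))))   # 500000500000
```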

For even larger scales, you can use segmented databases to spread data across multiple physical disks. This enables parallel I/O, allowing you to read terabytes of data at the speed of your hardware's combined throughput — all without changing a single line of your q-sql code.

Attributes: the "secret sauce"

In traditional databases, you create indexes. In q, you apply attributes. These are metadata labels that tell the q engine about the structure of your data, allowing it to choose the fastest possible algorithm for a query, as these two examples show:

  • Sorted (s#): Applied to columns like time. It enables binary search, O(log n), making lookups nearly instantaneous.
  • Parted (p#): Typically used for the sym (ID) column in on-disk databases. It tells q that all identical symbols are stored in contiguous blocks. This allows q to jump straight to the start of a symbol's data and read it in one burst.

You can check the attributes of a table using the meta command. The a column below shows the parted attribute for sym:

q)meta trade
c    | t f a
-----| -----
date | d
sym  | s   p
time | n
price| f
size | j
stop | b
cond | c
ex   | s

By using the parted (p) attribute on sym, a query for a single ticker like select from trade where sym=`AAPL doesn't need to scan the whole sym vector; it knows exactly where the AAPL data starts and ends on the disk. Less I/O means faster queries!
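
To see why these two attributes help, consider the following Python sketch. It assumes a sorted list for the first case and a column with contiguous identical values for the second; parted_index is a hypothetical helper that does not validate contiguity, and the sketch says nothing about how q stores attribute metadata.

```python
from bisect import bisect_left, bisect_right

# Sorted column: a range lookup uses binary search instead of a linear scan.
times = [1, 3, 3, 7, 9, 12]
lo, hi = bisect_left(times, 3), bisect_right(times, 7)
print(times[lo:hi])            # [3, 3, 7]

# Parted column: identical values are contiguous, so one (start, end) pair
# per distinct value is enough to locate all of its rows.
sym = ["AAPL", "AAPL", "AAPL", "MSFT", "MSFT", "TSLA"]
price = [84.1, 84.3, 86.9, 40.0, 40.2, 62.9]

def parted_index(column):
    """Map each value to the half-open row range [start, end) it occupies."""
    index = {}
    for i, v in enumerate(column):
        start, _ = index.get(v, (i, i))
        index[v] = (start, i + 1)
    return index

index = parted_index(sym)
start, end = index["MSFT"]
print(price[start:end])        # [40.0, 40.2]
```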

Acting as a database

While this guide so far has used q primarily as a standalone analysis tool, its true power lies in its ability to act as a high-performance database server. By specifying a port with the -p parameter, you can enable network connectivity:

$ q /tmp/kdbdb -s 4 -p 5100

Once the process is listening, anyone with network access can connect to your session and query your data. Common ways to connect include another q process and a web browser.

The following sections begin exploring these two options (but only scratch the surface of what is possible).

Connect from another q process

In a separate terminal, start a second q session. Use the hopen command to create a connection handle to the server:

q)h: hopen 5100     / Opens a connection to localhost:5100. 'h' is our "handle".

Now you can send commands through that handle. The simplest way is to pass a query as a string:

q)h "select nr: count i by sym from trade"
sym | nr
----| ----
AAPL| 1940
AIG | 1906
AMD | 1973
AMZN| 1934
..
q)

Sending strings is easy, but it can be insecure (code injection, the q counterpart of SQL injection) and inconvenient, especially when parameters are passed. Instead, kdb+ encourages a functional form: you define a function on the server, and the client calls it by passing the function name and arguments in a list.

On the Server:

/ Define a "Stored Procedure" to get basic stats for a specific symbol
q)getTradeStatOf: {[x] select nr: count i, sum size, avgprice: avg price from trade where sym=x}

On the Client:

q)h (`getTradeStatOf; `TSLA)    / Simpler and safer than string manipulation
nr   size   avgprice
--------------------
1914 103341 65.97574
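
The difference between sending code as a string and sending a function name with arguments can be sketched in Python. Everything here is hypothetical: the sample data, the API table, and the handle function. In particular, restricting calls to a fixed table of names is a design choice of this sketch, not something the text above claims q does by default.

```python
# Hypothetical in-memory "server" data: (sym, price, size)
trade = [("TSLA", 62.0, 10), ("AAPL", 84.0, 5), ("TSLA", 64.0, 30)]

def get_trade_stat_of(sym):
    """Basic stats for one symbol, similar in spirit to getTradeStatOf."""
    rows = [(p, s) for x, p, s in trade if x == sym]
    n = len(rows)
    return {
        "nr": n,
        "size": sum(s for _, s in rows),
        "avgprice": sum(p for p, _ in rows) / n if n else None,
    }

# Only functions listed here can be called remotely.
API = {"getTradeStatOf": get_trade_stat_of}

def handle(message):
    """Dispatch a (name, *args) request; arguments are data and are never evaluated as code."""
    name, *args = message
    if name not in API:
        raise KeyError(f"unknown function: {name}")
    return API[name](*args)

print(handle(("getTradeStatOf", "TSLA")))   # {'nr': 2, 'size': 40, 'avgprice': 63.0}
```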

Connect from a web browser

Every q process started with -p is also a lightweight web server. This is incredibly useful for quick inspections. If you navigate to http://localhost:5100 in your browser, you can see all the variables currently in memory. Click on a variable to see its content.

Connecting to q process from a browser

You can even execute queries directly from the URL bar by appending a ? followed by your q code:

Executing a query in a browser

Next steps