
11. I/O

11.0 Overview

I/O in q is a powerful and succinct feature of the language. As a word of warning, the names and behavior of the functions are idiosyncratic but the economy of expression is unrivaled.

I/O is realized via handles, which are symbolic names of resources such as files or machines on a network. One-and-done operations can be performed directly on the symbolic handle, e.g., you can read a file into memory in a single operation. For continuing operations, you open the symbolic handle to obtain an open handle. The open handle is a function that is applied to perform operations. When you have completed the desired operations, you close the open handle to free any allocated resources.

11.1 Data Files

In q, files come in two flavors: text and binary. Routines to process text data have 0 in their names, whereas routines to process binary data have 1. A text file is considered to be a list of strings – i.e., a list of char lists – and a binary file is a list of byte lists. While all text files can also be processed as binary data, not all binary data represents text. As mentioned above, file operations use handles.

11.1.1 File Handles

A file handle is a symbol or string that represents the name of a directory or file on persistent storage. A file handle starts with a colon : and has the form,

`:[path]name

":[path]name"

where the bracketed expression represents an optional path and name is a file or directory name. The combination should be recognized as valid by the underlying operating system. Some q operations require that you append a trailing slash / to indicate that you mean a directory. We will point these out.

Note

It is generally easier to work with paths and names as strings so that blanks and other special characters can be handled easily. Also keeping the file handles as strings avoids the interning of the symbols. While hopen now accepts a file string, other h prefix functions such as hcount do not as of the time of this writing (July 2025). Consequently we shall continue to use symbols in this text.

If using symbols, `$ converts a string to a symbol, but it can be awkward to include the leading : required in the symbolic handle. The built-in hsym, which inserts a leading colon into a symbol, serves this purpose.

q)hsym `$"/data/file name.csv"
`:/data/file name.csv

Note that q always represents separators in paths by the forward slash /, even when running on Windows. If you run q on Windows, you can type either / or \ but q displays / in its response.

Tip

To make life easier when generating symbolic paths dynamically, hsym is idempotent, meaning that it does nothing if a leading : is already present.

q)hsym hsym `$"/data/file name.csv"
`:/data/file name.csv
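As a small sketch of building handles dynamically, the following joins handle parts with sv and converts a string path with hsym. The directory and file names are hypothetical and chosen only for illustration.

```q
q)` sv `:data`trades`price              / join handle parts with "/"
`:data/trades/price
q)hsym `$"/" sv ("data";"my file.csv")  / build the path as a string, then convert
`:data/my file.csv
```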

11.1.2 hcount and hdel

The first one-and-done operation that works directly on a symbolic file handle is hcount, which returns a long representing the size of the file in bytes as reported by the OS.

q)hcount `:Q4m/trades.q
630

Tip

Advanced: In the case of a compressed file, hcount returns the uncompressed length, not the actual bytes as reported by the OS.

The next one-and-done is hdel, which instructs the OS to remove the file specified by its symbolic handle operand.

q)hdel `:Q4m/trash.txt
`:Q4m/trash.txt

It will complain on a non-empty directory but will remove an empty directory.

Some notes:

  • The return value of the symbolic file handle itself indicates that the deletion was successful. It should not be confused with an error message, which starts with a tick rather than a backtick.
  • You will get an error message if the file does not exist or if the delete cannot be performed.
  • You will not be prompted for confirmation. Back up any files that are important.
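Because hdel signals an error when the delete cannot be performed, you may want to trap the error rather than let it interrupt a script. The following is a minimal sketch using Trap At; the function name trydel and the file name are hypothetical. On failure it writes the error text to stderr and returns the null symbol.

```q
q)trydel:{[f] @[hdel; f; {-2 "hdel failed: ",x; `}]}
q)trydel `:Q4m/no_such_file.txt
```

The exact error text depends on the OS, so no output is shown here.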

11.1.3 Serializing and Deserializing q Entities

Any normal q entity can be serialized and persisted to storage. Unlike traditional languages, where you must instantiate serializers and writers, things are simple and direct in q. This is because q data is self-describing, so that its internal representation can be written out as a sequence of bytes and then read directly back into memory. This is as close to the Star Trek transporter as we are likely to get.

Tip

Advanced: Dynamic loads, foreign keys and locked code can't be serialized.

The serialization magic is done by (an overload of) the binary set, whose left operand is a file handle and right operand is the entity to be written. When successful, the result is the symbolic handle of the written file. The file is automatically closed once the write is complete.

q)`:Q4M/data/a set 42
`:Q4M/data/a
q)`:Q4M/data/L set 10 20 30
`:Q4M/data/L
q)`:Q4M/data/t set ([] c1:`a`b`c; c2:10 20 30)
`:Q4M/data/t

A serialized q data file can be read using (an overload of) the unary get, whose argument is a symbolic file handle and whose result is the q entity contained in the data file.

Tip

The behavior of set is to create the file if it does not exist and overwrite it if it does. It will also create the directory path if it does not exist provided it has the appropriate permissions.

q)get `:Q4M/data/a
42
q)get `:Q4M/data/L
10 20 30
q)get `:Q4M/data/t
c1 c2
-----
a 10
b 20
c 30
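As a quick sanity check on the round trip, you can compare the original entity with what get returns using match (~). This sketch uses a hypothetical dictionary d and file name.

```q
q)d:`a`b`c!10 20 30
q)`:Q4M/data/d set d
`:Q4M/data/d
q)d~get `:Q4M/data/d      / the value read back matches the original
1b
```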

An equivalent way to read a data file is with one of the overloads of value.

q)value `:Q4M/data/t
c1 c2
-----
a 10
...

Alternatively, you can use the command \l to load a data file into memory and assign it to a variable with the same name as the file. Here you do not use a file handle; rather, specify the path to the file without any decoration. In a fresh q session,

q)t
't
q)\l Q4M/data/t
`t
q)t
c1 c2
-----
a 10
...
q)system "l Q4M/data/t"
`t

11.1.4 Memory Mapping Basics (Advanced)

According to Google: "Memory-mapped files map a file's content to a process's virtual memory space, allowing direct access to the file as if it were a memory array, eliminating the need for traditional read/write calls. Benefits include improved performance through zero-copy access, automatic management of data caching by the operating system, and the ability to perform random access on extremely large files without loading the entire file into memory."

Simply put, q uses memory mapping on columns when tables that have been splayed (or partitioned) to storage are "loaded" into memory. As of q v3.6 an arbitrary column list can now be splayed. This represents a major advancement that supports historical tables of arbitrary complexity. The foundation for this lies in the behavior of set and get on lists which we describe here.

TL;DR

The takeaway from the following examples is that serializing a list to storage and then using get to read it maps the list – rather than loading it – into memory. This can result in significant improvements in memory utilization and execution speed.

The q memory mapping functionality is most beneficial for large tables so we use large lists in our examples. As our base case, we demonstrate resource utilization for an ordinary in-memory list. In a fresh q session we create a variable L holding 100,000,000 integers, perform the operation of taking the first 1000 items then serialize it using set. We check memory usage at every step using the custom function mem, defined as

q) mem:{-1 ssr/[;("\n";"| ");(" ";"|")] .Q.s -3_.Q.w[];}

Observe how performing the operation on the lists changes memory usage.

q)mem[]
used|360816 heap|67108864 peak|67108864 wmax|0 mmap|0
q)1000#100000000#til 10
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 ..
q)mem[]
used|360816 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)L:100000000#til 10
q)mem[]
used|1074102640 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)1000#L
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 ..
q)mem[]
used|1074102640 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)`:data/L set L
`:data/L
q)\ls data
,"L"
q)\ts 1000#L
3 8384
q)\ts 1000#L
0 8384

Note that the execution time for 1000# dropped to 0 on the second repetition because the data was cached.

Now in a fresh q session we perform the same 1000# operation on the get of the serialized file instead of creating a list in memory.

q)mem[]
used|360816 heap|67108864 peak|67108864 wmax|0 mmap|0
q)1000#get `:data/L
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 ..
q)mem[]
used|362400 heap|67108864 peak|67108864 wmax|0 mmap|0
q)type get `:data/L
7h

Observe that operating on the result of get on the serialized file does not affect memory usage even though it looks like a normal list to type.

Continuing in another fresh q session, observe that the first execution takes 1 ms, compared with 3 ms for the in-memory list, and that the time again drops to 0 on the second execution.

q)\ts 1000# get `:data/L
1 10048
q)\ts 1000# get `:data/L
0 8464
q)\ts 1000# get `:data/L
0 8464
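The examples above apply get without assigning its result. As a sketch of what to expect when you do keep a reference, the following assigns the result to a hypothetical variable M and inspects the mmap entry of .Q.w[]. It assumes the file data/L written earlier still exists; the values reported will vary by system, so none are shown.

```q
q)M:get `:data/L          / keep a reference so the mapping is retained
q).Q.w[]`mmap             / bytes reported as mapped by this process
q)delete M from `.        / drop the reference
q).Q.w[]`mmap             / expected to fall back once nothing refers to the map
```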

Next we perform the same sequence on a compound list – i.e., a list of simple lists of the same type. While the lists have the same number of rows as before, memory usage is larger due to the nesting. In a fresh q session,

q)mem[]
used|360816 heap|67108864 peak|67108864 wmax|0 mmap|0
q)1000#100000000#(1 2; 3 4)
1 2
3 4
..
q)mem[]
used|360816 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)Lc:1000#100000000#(1 2; 3 4)
q)mem[]
used|369072 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)1000#Lc
1 2
3 4
..
q)mem[]
used|369072 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)`:data/Lc set Lc
`:data/Lc
q)\ls data
"Lc"
"Lc#"

In this case set creates two files; the second, with a # suffix, is due to the nesting.

Continuing in a fresh q session, we again observe that memory usage is not affected by the operation on the get result. Also notice the type has changed to 77h indicating that the file is mappable.

q)mem[]
used|360816 heap|67108864 peak|67108864 wmax|0 mmap|0
q)1000# get `:data/Lc
1 2
3 4
..
q)mem[]
used|362400 heap|67108864 peak|67108864 wmax|0 mmap|0
q)type get `:data/Lc
77h

Next, in a fresh q session we observe the size or complexity of the stored column has had little effect on the timing. Also we see a (significant) timing decline upon repeated reference to the unassigned get result.

q)\ts get `:data/Lc
1 7600
q)
q)\ts get `:data/Lc
0 592

Finally we perform the same sequence on a mixed list having an integer, a dictionary and a table as items. This was strictly off-limits in earlier versions of q.

q)mem[]
used|360816 heap|67108864 peak|67108864 wmax|0 mmap|0
q)1000#100000000#(42; ([a:1; b:2]); ([] c1:10 20; c2:11 2.2))
42
`a`b!1 2
+`c1`c2!(10 20;11 2.2)
..
q)mem[]
used|360816 heap|1140850688 peak|1140850688 wmax|0 mmap|0
q)Lm:100000000#(42; ([a:1; b:2]); ([] c1:10 20; c2:11 2.2))
q)mem[]
used|1074102928 heap|2214592512 peak|2214592512 wmax|0 mmap|0
q)1000#Lm
42
`a`b!1 2
+`c1`c2!(10 20;11 2.2)
..
q)mem[]
used|1074102928 heap|2214592512 peak|2214592512 wmax|0 mmap|0
q)`:data/Lm set Lm
`:data/Lm
q)\ls data
"Lm"
"Lm#"
"Lm##"

In this case set created three files: the second has a # suffix and the third has a ## suffix, which is due to nested symbols.

Continuing in a fresh q session, we again observe that memory usage is not affected by the operation on the get result and the type is 77h indicating that the file is mappable.

q)mem[]
used|360816 heap|67108864 peak|67108864 wmax|0 mmap|0
q)1000# get `:data/Lm
42
`a`b!1 2
+`c1`c2!(10 20;11 2.2)
42
..
q)mem[]
used|362400 heap|67108864 peak|67108864 wmax|0 mmap|0
q)type get `:data/Lm
77h

In another fresh q session we observe that the increased complexity of the stored column again has no noticeable effect on the timing. And again we see a drastic timing decline upon repeated reference to the unassigned get result.

q)\ts get `:data/Lm
2 4196128
q)\ts get `:data/Lm
0 4194544

11.1.5 Binary Data Files

As with traditional languages, for continuing operations on a q data file, you open the file, perform the operation(s) and then close it. Unlike traditional languages, opening a symbolic handle returns a function, called an open handle, that is applied to perform operations.

As mentioned previously, q files come in two flavors, binary and text. Serialized q data persisted with set is written in binary form with a header at the beginning of the file. It is instructive to read it as raw binary data to inspect its internals.

Open a data file handle with hopen, whose result is a function called the open handle. This function should be stored in a variable, traditionally h, which is applied to data to write it to the file. We begin with a file containing serialized q data and show how to append to it. We will explain the return value of applying the open handle shortly.

q)`:data/L set 10 20 30
`:data/L
q)h:hopen `:data/L
q)h[42]
7i
q)h 100 200
7i

Tip

Always apply hclose to the open handle to close it and flush any data that might be buffered. Failure to do so may cause your program to run out of file handles unnecessarily.

We verify that the appends have been made.

q)hclose h
q)get `:data/L
10 20 30 42 100 200
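The open, write, close sequence can be wrapped in a small helper so that the handle is always closed after the write. This is a sketch only; the name appendTo is hypothetical, and it does no error handling, so a failed write would leave the handle open.

```q
q)appendTo:{[f;x] h:hopen f; h x; hclose h; f}
q)appendTo[`:data/L; 300 400]
`:data/L
```

The helper returns the symbolic handle, mirroring the convention used by set.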

We can also create a new file and write raw binary data to it. We'll see how to read this file in a bit but here is a sneak peek.

q)h:hopen `:data/raw
q)h[42]
6i
q)h 10 20 30
6i
q)hclose h
q)read1 `:data/raw
0x2a000000000000000a0000000000000014000000000000001e00000000000000

Now, what is the deal with the 8i return value of applying the open handle, whose values may vary for each invocation?

q)h:hopen `:data/raw
q)h 43
8i

In fact, the return value is the value of the open handle itself.

q)h
8i

Surely, you say, we can’t use an int as a function to write data. But you would be wrong…and I'm still not Shirley.

q)hopen `:data/new
8i
q)8i[1 2 3]
8i
q)hclose 8i
q)read1 `:data/new
0x010000000000000002000000000000000300000000000000

Tip

Apparently q assigns an int to each open file and keeps track of which int values are valid handles. This accounts for the cryptic error message when you attempt to use variables with simple list notation.

q)a:42
q)b:43
q)a b
'Cannot write to handle 42. OS reports: Bad file descriptor

11.1.6 Writing and Reading Binary

Apply read1 on a file handle to read any file into q as a list of bytes. For example, we can read the previously serialized value L as bytes.

q)`:data/L set 10 20 30
q)read1 `:data/L
0xfe2007000000000003000000000000000a0000000000000014000000000000001e..

This shows the internal representation of the serialized q entity with the header followed by data. How cool is that?

If you want to write raw binary data, as opposed to the internal representation of a q entity containing the data, use the infelicitously named 1:. It takes a symbolic file handle as its left argument and a simple byte list as its right argument. Bytes in the right operand are streamed to the file.

q)`:data/answer.bin 1: 0x06072a
`:data/answer.bin
q)read1 `:data/answer.bin
0x06072a
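To connect raw bytes with q values, here is a sketch that encodes a long as bytes with vs, writes them with 1:, and decodes them again with sv. The file name is hypothetical. Note that 0x0 vs produces the most significant byte first, which differs from the little-endian layout seen when a handle writes a long directly.

```q
q)`:data/fortytwo.bin 1: 0x0 vs 42     / 42 encoded as 8 bytes
`:data/fortytwo.bin
q)read1 `:data/fortytwo.bin
0x000000000000002a
q)0x0 sv read1 `:data/fortytwo.bin     / decode the bytes back to a long
42
```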

11.1.7 Using Apply Amend

Fundamentalists can use Apply Amend in place of set to serialize q entities to files. To write the file, or overwrite an existing file, use assign (:).

q).[`:data/raw; (); :; 1001 1002 1003]
`:data/raw
q)get `:data/raw
1001 1002 1003

To append to an existing file use (,).

q).[`:data/raw; (); ,; 42]
`:data/raw
q)get `:data/raw
1001 1002 1003 42

11.2 Save and Load on Tables

We have already seen that it is easy to write and read tables to/from persistent storage.

q)`:data/t set ([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3)
`:data/t
q)get `:data/t

c1 c2 c3
---------
a 10 1.1
b 20 2.2
c 30 3.3

The save and load functions make this even easier. In its simplest form, save serializes a table in a global variable to a binary file having the same name as the variable. It overwrites an existing file.

q)t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3)
q)save `:data/t
`:data/t
q)get `:data/t
c1 c2 c3
---------
a 10 1.1
b 20 2.2
c 30 3.3

This is equivalent to using set above with the table name as file name.

As you might expect, load is the inverse of save meaning that it reads a serialized table from a file into a variable with the same name as the file. It creates the variable in the workspace or overwrites it if it already exists.

In a fresh q session after t has been saved as above,

q)t / t doesn't exist
't
q)load `:data/t / now it does
`t
q)t

c1 c2 c3
---------
a 10 1.1
...

You can also use save to write a table to a text file. You determine the format of the text with the file extension in the file handle.

Tip

The following examples of save are special versions of the more general (0:) – see 11.5.

Save a table with .txt extension to obtain tab-delimited records. There is no corresponding load but you can parse the text file – see 11.5.2. Here we display the actual file records with read0.

q)save `:data/t.txt
`:data/t.txt
q)read0 `:data/t.txt
"c1\tc2\tc3"
"a\t10\t1.1"
"b\t20\t2.2"
"c\t30\t3.3"

Save the table with .csv extension to obtain comma-separated values. There is no corresponding load but you can parse the csv file – see 11.5.2.

q)save `:data/t.csv
`:data/t.csv
q)read0 `:data/t.csv
"c1,c2,c3"
"a,10,1.1"
"b,20,2.2"
"c,30,3.3"

Save the table with .xml extension to obtain XML records. There is no direct way to read XML into q although libraries have been contributed – see code.kx.com.

q)save `:data/t.xml
`:data/t.xml
q)read0 `:data/t.xml
"<R>"
"<r><c1>a</c1><c2>10</c2><c3>1.1</c3></r>"
"<r><c1>b</c1><c2>20</c2><c3>2.2</c3></r>"
"<r><c1>c</c1><c2>30</c2><c3>3.3</c3></r>"
"</R>"

Save the table with .xls extension to obtain an Excel spreadsheet. This file can be loaded by Excel work-alikes.

q)save `:data/t.xls
`:data/t.xls

11.3 Splayed Tables

We have already seen how to persist a table by serializing it to a file using set. There are no restrictions on the types of columns in the table or the file name in this scenario.

q)`:data/t set ([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3)
`:data/t
q)get `:data/t
c1 c2 c3
---------
a 10 1.1
b 20 2.2
c 30 3.3

This creates a single file, as the OS verifies.

q)\ls -l data/t
"-rw-r--r-- 1 jaborror staff 98 Jul 1 10:41 Q4M/data/t"

For larger tables that may not fit into memory on all machines, you can ask q to serialize each column of the table to its own file in a specified directory. A table persisted in this form is called a splayed table. The advantage is that when a splayed table is loaded it is actually memory mapped. Querying a splayed table results in only the columns referred to in the query being loaded into memory. This can be a substantial memory win for a table having many columns and queries using only a few columns.

Aside: It is worthwhile looking up the origin of the English word "splay". Also, please don't spay your tables.

To splay a table, use set and specify a directory as the target location indicated by a trailing slash / in the left operand. We start with a numeric table.

q)`:data/tsplay/ set ([] c1:10 20 30; c2:1.1 2.2 3.3)
`:data/tsplay/

List the directory in the OS and you will see a directory tsplay that contains two files, one for each column in the original table, as well as a hidden .d file.

q)\ls -l -d data/tsplay
"drwxr-xr-x 5 jaborror staff 160 Jul 1 10:46 Q4M/data/tsplay"

q)\ls -l -a data/tsplay
"total 24"
"drwxr-xr-x 5 jaborror staff 160 Jul 1 10:46 ."
"drwxr-xr-x 13 jaborror staff 416 Jul 1 10:46 .."
"-rw-r--r-- 1 jaborror staff 14 Jul 1 10:46 .d"
"-rw-r--r-- 1 jaborror staff 40 Jul 1 10:46 c1"
"-rw-r--r-- 1 jaborror staff 40 Jul 1 10:46 c2"

Nearly all the metadata regarding the splayed table can be read from the file system – i.e., the name of the table from the directory and the names of the columns from the files. The one missing bit is the order of the columns, which is stored as a serialized list in the hidden .d file.

q)get hsym `$"data/tsplay/.d"
`c1`c2
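Since each column file is just a serialized list, you can read a single column directly with get. The sketch below also applies get to the splay directory itself to recover the table, on the assumption that get accepts the directory handle with a trailing slash; this table has no symbol columns, so no enumeration domain is needed.

```q
q)get `:data/tsplay/c1        / one column file is an ordinary serialized list
10 20 30
q)cols get `:data/tsplay/     / column order comes from the .d file
`c1`c2
```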

Important

There are restrictions on tables that can be splayed.

  • In versions before q v3.6, columns were required to be simple or compound lists. Now general list columns can also be splayed.
  • All symbol columns must be enumerated.

Thus the following succeed.

q)`:data/tok/ set ([] c1:2000.01.01+til 3; c2:1 2 3)
`:data/tok/
q)`:data/tok/ set ([] c1:1 2 3; c2:(1.1 2.2; enlist 3.3; 4.4 5.5))
`:data/tok/
q)`:data/toops/ set ([] c1:1 2 3; c2:(1;`1;"a"))
`:data/toops/

But the following fails.

q)`:data/toops/ set ([] c1:`a`b`c; c2:10 20 30)
'type

The convention for enumerating symbols in splayed tables is to enumerate all symbol columns in all tables over the domain sym and store the resulting sym list in the root directory – i.e., one level above the directory holding the splayed table. You can do this manually but no one does.

q)`:db/tsplay/ set ([] `sym?c1:`a`b`c; c2:10 20 30)
`:db/tsplay/
q)sym
`a`b`c
q)`:db/sym set sym
`:db/sym

Normally folks use one of the .Q utilities, which used to be reserved for KX use but are now officially blessed, except for those with no documentation on the KX Documentation site. For example, we use .Q.en.

q)`:db/tsplay/ set .Q.en[`:Q4M/db; ([] c1:`a`b`c; c2:10 20 30)]
`:db/tsplay/

Here .Q.en prepares a table for splaying by enumerating all its symbol columns. The first argument is the symbolic file handle of the root directory for the persistent residence of the enumeration domain sym (no choice in the name). The second argument is a table. See 14.5.2 for more detail on its behavior.

For more flexibility you can use .Q.ens, which allows the name of the symbol domain to be specified. In a fresh db,

q)`:db/tsplayx/ set .Q.ens[`:Q4M/db; ([] c1:`a`b`c; c2:10 20 30); `symx]
`:db/tsplayx/
q)system "ls db"
"symx"
"tsplayx"

11.4 Text Data

We have seen that q views each record in a binary data file as a list of bytes. Similarly, a record in a text file is viewed as a list of char – i.e., a string. Thus a text file corresponds to a list of strings when read, and you pass a list of strings to write records to a text file.

11.4.1 Reading and Writing Text Files

Read a text file with the unary read0, which takes a symbolic file handle argument. The result is a list of strings, one for each line in the file. Create the file data/life.txt with content,

Life

The Universe

And Everything

using a text editor. Then we find,

q)read0 `:data/life.txt
"Life"
"The Universe"
"And Everything"

You can see the underlying binary values of the text by using read1 or casting the result of read0 to byte.

q)read1 `:data/life.txt
0x4c6966650a54686520556e6976657273650a416e642045766572797468696e670a

q)"x"$read0 `:data/life.txt
0x4c696665
0x54686520556e697665727365
0x416e642045766572797468696e67

You can read the data as binary and cast the result to char. Observe that the data is a simple list of char, so the newline characters are displayed as the escape \n rather than causing line breaks in the console display.

q)"c"$read1 `:data/life.txt
"Life\nThe Universe\nAnd Everything\n"

To write a string as text use the (infelicitously named) binary (0:), which takes a file handle in the left operand and a list of strings in the right operand. It creates the directory path if necessary and overwrites the file if it already exists.

q)`:data/life.txt 0: ("Life"; "The Universe"; "And Everything")
`:data/life.txt
q)read0 `:data/life.txt
"Life"
"The Universe"
"And Everything"

11.4.2 Using hopen and hclose

Just as with a binary data file, a symbolic text file handle can be opened with hopen. The result is again an int that is conventionally stored in the variable h and is applied as a function to write data. The difference is that instead of using plain h to write binary data, you use neg[h] to write strings as text. Seriously.

More specifically, when its argument x is a string, neg[h] appends x,"\n" to the file. When x is a list of strings, it appends x,'"\n" to the file. Aren't you happy you asked?

Observe that we pass lists of strings.

q)h:hopen `:data/this.txt
q)neg[h] enlist "This"
-3i
q)neg[h] ("and"; "that")
-3i
q)hclose h
q)read0 `:data/this.txt
"This"
"and"
"that"

Tip

Apply hclose to h, not to neg[h].
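The pattern of opening, writing with neg[h], and closing with hclose on h can be bundled into a helper for occasional writes such as logging. This is a sketch with hypothetical names (logline, data/app.log); it assumes the log file does not exist beforehand, and opening a file for every line would be inefficient for heavy logging.

```q
q)logline:{[f;s] h:hopen f; neg[h] s; hclose h;}
q)logline[`:data/app.log; "service started"]
q)logline[`:data/app.log; "service stopped"]
q)read0 `:data/app.log
"service started"
"service stopped"
```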

If the file already exists, opening with hopen and applying the open handle will append rather than overwrite.

q)h:hopen `:data/this.txt
q)neg[h] ("and"; "more")
-6i
q)hclose h
q)read0 `:data/this.txt
"This"
"and"
"that"
"and"
"more"

11.4.3 Preparing Text

We saw the built-in functions for saving tables as text files in 11.2. When you need to control the filename, you can write the table yourself with 0:, but then you must prepare the table columns as formatted text. A separate overload of 0: is available for this purpose. A confusing naming convention, to say the least.

In this overload 0: has left operand a char delimiter and right operand a table or list of columns. Observe the use of the pre-defined constant csv, which is simply ",".

q)t:([] c1:`a`b`c; c2:1 2 3)
q)"\t" 0: t
"c1\tc2"
"a\t1"
"b\t2"
"c\t3"
q)"|" 0: t
"c1|c2"
"a|1"
"b|2"
"c|3"
q)csv
"," 
q)csv 0: t
"c1,c2"
"a,1"
"b,2"
"c,3"
q)`:data/t.csv 0: csv 0: t
`:data/t.csv
q)read0 `:data/t.csv
"c1,c2"
"a,1"
"b,2"
"c,3"

In the last snippet we applied 0: with two different meanings: to prepare and then write text. We hope you've grown fond of this name, since 11.5 will introduce yet another version of (0:) for parsing text records.
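Because the prepared text is just a list of strings, you can manipulate it before writing. As a small sketch, the following drops the header row to produce a file containing only data records. It reuses the table t defined above; the file name is hypothetical.

```q
q)`:data/t_nohdr.csv 0: 1_ csv 0: t    / drop the header line, then write
`:data/t_nohdr.csv
q)read0 `:data/t_nohdr.csv
"a,1"
"b,2"
"c,3"
```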

11.5 Parsing Records

Binary forms of 0: and 1: parse individual fields from text or binary records. Field parsing is based on the following field types.

0      1  Type       Width(1)  Format(0)
B      b  boolean    1         [1tTyY]
X      x  byte       1
G      g  guid       16        XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
H      h  short      2         [0-9a-fA-F][0-9a-fA-F]
I      i  int        4
J      j  long       8
E      e  real       4
F      f  float      8
C      c  char       1
S      s  symbol     n
P      p  timestamp  8         date?timespan
M      m  month      4         [yy]yy[?]mm
D      d  date       4         [yy]yy[?]mm[?]dd or [m]m/[d]d/[yy]yy
Z      z  datetime   8         date?time
N      n  timespan   8         hh[:]mm[:]ss[[.]ddddddddd]
U      u  minute     4         hh[:]mm
V      v  second     4         hh[:]mm[:]ss
T      t  time       4         hh[:]mm[:]ss[[.]ddd]
blank     skip
*         literal chars

The column labeled 0 contains the (upper case) field type char for text data. The (lower case) char in column 1 is for binary data. The column labeled Width(1) contains the number of bytes that will be parsed for a binary read. The column labeled Format(0) displays the format(s) that are accepted in a text read.

Tip

The parsed fields from multiple records are returned in column form rather than row form to make it easy to associate a list of symbolic names with (!) and then flip into a table.

11.5.1 Fixed Width Records

The binary form of 0: and 1: for reading fixed length files is,

(Lt;Lw) 0:f

(Lt;Lw) 1:f

The left operand is a nested list containing two items: Lt is a simple list of type chars, one letter per field; Lw is a simple list of int containing one integer width per field. The sum of the field widths in Lw should equal the width of the record. The result of the function is a list of lists, one list arising from each field.

We demonstrate 0: here since it is more commonly used; 1: works analogously. The simplest form of the right operand f is a symbolic file handle. For example, suppose we have a file with records of the form,

1001  98.000ABCDEF1234Garbage2025.01.01
1002  42.001GHUJKL0123Garbage2025.01.02
1003  44.123nopqrs9876Garbage2025.01.03

We could parse the records of the file with,

q)("JFS D";4 8 10 7 10) 0: `:data/fixed.txt
1001 1002 1003
98 42.001 44.123
ABCDEF1234 GHUJKL0123 nopqrs9876
2025.01.01 2025.01.02 2025.01.03

This reads a text file containing fixed length records of total width 39. The first field is a long occupying 4 positions; the second field is a float occupying 8 positions; the third field consists of a symbol occupying 10 positions; the fourth slot of 7 positions is ignored; the fifth field is a date occupying 10 positions.

You might think that the widths are superfluous, but they are not. The actual data width can be narrower than the normal size due to small values, as in our case of the long field. Or you may need to specify a width larger than that required by the corresponding data type due to whitespace in the fields, as in the case of our float field.

Observe how easy it is to make a table from the result.

q)flip `c1`c2`c3`c4!("JFS D";4 8 10 7 10) 0: `:data/fixed.txt
c1 c2 c3 c4
---------------------------------
1001 98 ABCDEF1234 2025.01.01
1002 42.001 GHUJKL0123 2025.01.02
1003 44.123 nopqrs9876 2025.01.03

Also note that it is possible to parse a list of strings using the same format, since they represent text records in memory.

q)("JFS D";4 8 10 7 10) 0: read0 `:data/fixed.txt
1001 1002 1003
98 42.001 44.123
...

The more general form for the right operand f is,

(hfile;i;n)

where hfile is a symbolic file handle, i is the offset into the file to begin reading and n is the number of bytes to read. This is useful for sampling a file or for chunking large files that cannot be read into memory in a single gulp.

Tip

A read operation should begin and end on record boundaries or you may get meaningless results.

In our trivial example, the following reads just the second and third records,

q)("JFS D";4 8 10 7 10) 0: (`:data/fixed.txt; 40; 80)
1002 1003
42.001 44.123
GHUJKL0123 nopqrs9876
2025.01.02 2025.01.03

11.5.2 Variable Length Records

The binary form of 0: for reading variable length, delimited files is,

(Lt;D) 0:f

The left operand is a list comprising two items. Lt is a simple list comprising one type letter per corresponding field. The second item D is either a char representing the delimiting character or an enlisted char.

Specify D as an atomic delimiter char when the first record of the file does not contain column names. In this case, the result of the parse is a list of column lists, each of which contains items of type specified by Lt. The simplest form of the right operand f is a symbolic file handle.

For example, say we have a comma-separated file data/simple.csv having records,

1001,DBT12345678,98.6
1002,EQT98765432,24.75
1004,CCR00000001,121.23

Parsing with a delimiter char "," results in a list of column lists. As with parsing fixed format records, it is easy to make the result into a table.

q)("JSF"; ",") 0: read0 `:data/simple.csv
1001 1002 1004
DBT12345678 EQT98765432 CCR00000001
98.6 24.75 121.23

q)flip `c1`c2`c3!("JSF"; ",") 0: read0 `:data/simple.csv
c1 c2 c3
-----------------------
1001 DBT12345678 98.6
...

Observe that it is possible to retrieve the second field as a string instead of a symbol using "*" as the data type specifier,

q)("J*F"; ",") 0: read0 `:data/simple.csv
1001 1002 1004
"DBT12345678" "EQT98765432" "CCR00000001"
98.6 24.75 121.23

Specify D as an enlisted char when the first record contains a separated list of names. Subsequent records are read as data specified by the types in Lt. The result is a table in which the column names are taken from the first record.

Say we have a comma-separated file data/titles.csv having records,

id,ticker,price
1001,DBT12345678,98.6
1002,EQT98765432,24.7
1004,CCR00000001,121.23

Reading with an enlisted "," delimiter results in a table.

q)("JSF"; enlist ",") 0: `:data/titles.csv
id ticker price
-----------------------
1001 DBT12345678 98.6
1002 EQT98765432 24.7
1004 CCR00000001 121.23
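For delimited files too large to parse in a single gulp, one common approach is the utility .Q.fs, which feeds a file to a function in chunks of complete records. The following is a sketch under stated assumptions: the table name trade and its schema are hypothetical, the file is the header-less data/simple.csv shown earlier, and the file is assumed to end with a newline.

```q
q)trade:([] id:`long$(); ticker:`symbol$(); price:`float$())
q).Q.fs[{`trade insert ("JSF"; ",") 0: x}; `:data/simple.csv]
q)count trade             / number of rows loaded
```

Each chunk arrives as a list of strings, so the same parse expression used on read0 output applies unchanged.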

11.5.3 Key-Value Records

The operator 0: can also be used to process text representing key-value pairs. In this situation, the left operand is a three-character string Pf that specifies the pair format. The first char of Pf can be "S" to indicate the key is a symbol or "I" to indicate the key is an int or "J" to indicate it is a long. The second char indicates the key-value separator. The third char indicates the pair delimiter.

The following examples illustrate various combinations in Pf.

q)"S=;" 0: "one=1;two=2;three=3"
one two three
,"1" ,"2" ,"3"
q)"S:/" 0: "one:1/two:2/three:3"
one two three
,"1" ,"2" ,"3"
q)"J=;" 0: "1=one;2=two;3=three"
1 2 3
"one" "two" "three"

Again it is easy to make the result into a table.

q)flip `k`v!"J=;" 0: "1=one;2=two;3=three"
k v
---------
1 "one"
2 "two"
3 "three"
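The parsed values are always strings, so a typical next step is to cast them and build a dictionary for lookup. A minimal sketch, with hypothetical variable names kv and d:

```q
q)kv:"S=;" 0: "one=1;two=2;three=3"
q)d:kv[0]!"J"$kv 1        / keys are symbols; values parsed to longs
q)d`two
2
```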

11.6 Compression

The topic of data compression is complex and is beyond the scope of this tutorial. We refer the reader to the topic File Compression in the KX Documentation for full details. We present some of the highlights here.

11.6.1 Writing compressed files

The basic way to write q entities as compressed data is to use the overload of set that accepts compression parameters specified by three integers: the logical block size (given as a power of 2, so 17 means 128 KB blocks), the algorithm, and the compression level.

q)`:a set 1000#enlist asc 1000?10
`:a
q)(`:za;17;2;9)set get`:a
`:za
q)get[`:a]~get`:za
1b

Algorithm and compression level are as follows.

alg algorithm level since
-------------------------
0   none      0
1   q IPC     0
2   gzip      0-9
3   snappy    0     V3.4
4   lz4hc     0-16  V3.6
5   zstd      7-22  V4.1

Since q operators read both compressed and uncompressed files, files that do not compress well, or have an access pattern that does not perform well with compression, can be left uncompressed.
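
To judge whether a file compresses well, you can inspect it with the internal function -21!, which returns compression statistics as a dictionary. The following is a sketch only: the file name zv and the test vector are our own choices, and gzip assumes that zlib is available on your system.

```q
/ a highly compressible vector, written with the same parameters as above
v:asc 1000000?10
(`:zv; 17; 2; 9) set v
/ -21! reports the statistics of a compressed file
s:-21!`:zv
s`algorithm`logicalBlockSize`zipLevel
s[`uncompressedLength] % s`compressedLength     / compression ratio
```

A ratio close to 1 suggests the file is a candidate for being left uncompressed.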

You can set q to write compressed files by default by setting the zip defaults .z.zd. Set this as an integer triple.

.z.zd:17 2 6

Then set will write files (with no extension) compressed in this way unless given explicit parameters. To disable compression by default, set .z.zd to 3#0, or expunge it (see 12.5).

11.6.2 Reading compressed files

Column files are mapped or unmapped on demand during a query. Only the areas of the file that are touched are decompressed – i.e. kdb+ uses random access. Decompressed data is cached while a file is mapped. Columns are mapped for the duration of the select.

For example, say you are querying by date and sym over a daily-partitioned table, with each partition parted by sym. The query decompresses only the parts of the column data for the syms passing the query predicate.

kdb+ allocates enough memory to decompress the whole vector, regardless of how much it finally uses. This reservation is required as there is no backing store for the decompressed data, unlike with mapped files of uncompressed data, which can always read the pages from file again should they have been dropped.

This is reservation only, and can be accommodated by increasing the swap space available: even though the swap should never actually be written to, the OS has to be assured that in the worst-case scenario of decompressing the data in full, it could swap it out if needed.

See File Compression on the KX Documentation website for full details on:

  • Compression parameters
  • Selective compression
  • Compression statistics
  • Compression by default
  • Decompression
  • Concurrently open files
  • Memory allocation
  • Performance
  • Benchmarking
  • Kernel settings
  • Multithreading
  • Requirements
  • Gzip, Snappy
  • LZ4
  • Zstd

11.7 Data At Rest Encryption (DARE)

The topic of data security and encryption is complex and is beyond the scope of this tutorial. We refer the reader to the topic Data At Rest Encryption (DARE) on the KX Documentation website for full details. We provide their brief summary here.

As of q4.0, support is provided for Transparent Disk Encryption (TDE), which is generally superior to Full Disk Encryption (FDE) for meeting compliance requirements such as PCI-DSS. This provides the following:

  • As TDE decrypts the data inside the kdb+ process, rather than at the OS/storage level, data remains encrypted when it comes across the wire from remote storage.
  • Encryption is selective – encrypt only the files that need encrypting.
  • Files can be archived, or copied, across environments without going through a decryption and encryption cycle.
  • kdb+ is multi-platform, and as the file format is platform-agnostic, the same encrypted files can be accessed from multiple platforms.
  • Maintain key and process ownership and separation of responsibilities: the DBA holds TDE keys, the server admin holds FDE keys.

11.8 Interprocess Communication

The ease with which a q process can communicate with another q process residing on the network is one of the most impressive features of q. We shall cover the basics of interprocess communication (IPC) so that you can reproduce the section on callbacks in Chapter 1.

IPC limits: The number of concurrent IPC/WebSocket connections is no longer limited to 1022; it is limited only by system and protocol. Also, the IPC message size limit has increased from 2GB to 1TB.

We shall use the following terminology. The process that initiates the communication is called the client, while the process receiving and processing requests is the server. The server process can be on the same machine, the same network, a different network or on the Internet, so long as it is accessible. The communication can be synchronous (wait for a result to be returned) or asynchronous (don't wait and no result returned).

The only way to learn IPC is to do it, and the easiest way to do this is to set up two processes on your machine. We recommend you use the machine running your q sessions for this tutorial, provided it will allow a port to be opened. In what follows, we shall assume that a server q process has been started on a machine with an open port.

>q -p 5042 / server process
q)

The client process is a separate q process running on the same machine.

>q / client process
q)

11.8.1 Communication Handle

Symbolic communication handles look similar to file handles but they specify resources on the network. A communication handle has the form,

`:[*server*]:*port*

Here the bracketed expression represents an optional server machine identifier and port is a port number. An omitted server specification, or one of the form localhost, refers to the machine on which the originating q session lives. The following both refer to port 5042 on the same machine as the q session in which they are entered.

`::5042

`:localhost:5042

You can refer to a machine on the network by name. For example, on the author's laptop the following is equivalent to the two previous network handles.

`:unalome:5042

You can use the IP address of a machine.

`:198.162.0.2:5042

Finally, you can also use a url.

`:http://www.myurl.com:5042

11.8.2 Opening a Connection Handle

As with a file handle, apply hopen to a communication handle to obtain an open connection handle that is used as a function. As before, the value is an int that is traditionally stored in the variable h. Also as with file I/O, the behavior of this function differs between using the original positive handle or its negation.

Let's see how this works with your two sessions. (You did start them, didn't you?) Remember, the session that opened port 5042 is the server; the other session is the client. In the client session, open a handle to the server and store it in h, then apply h to the string as shown. Finally close the connection handle.

q)h:hopen `::5042
q)h "a:6*7"
q)h "a"
42
q)hclose h

Tip

Whitespace between h and the quoted string is optional, as this is simply function juxtaposition. We include it for readability.

As you have no doubt realized, the application of h sent the string to the server to be evaluated. On the server, we see,

q)a 
42 
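
In a script, as opposed to an interactive session, the open can fail, for instance when the server is not running. A minimal sketch of a more defensive open follows. It passes hopen a list containing the handle and a timeout in milliseconds, and traps any error with protected evaluation. The 2000 ms timeout and the error message are arbitrary choices of ours.

```q
/ try to connect to port 5042 on this machine, giving up after 2000 milliseconds
/ on failure the handler writes the error text to stderr and returns a null int
h:@[hopen; (`::5042; 2000); {-2 "connect failed: ",x; 0Ni}]
/ use and close the handle only when the open succeeded
if[not null h; show h "6*7"; hclose h]
```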

11.8.3 Remote Execution

We have seen that when you open a connection to a q process, you have the full capability of that process available remotely. Apply the connection handle to any q expression in a string and it will be evaluated on the server. As you contemplate the IPC Zen, a dark cloud passes over your tranquility. You realize that, by default, the server is wide open.

Important

Allowing quoted q strings to be executed on a server makes the server susceptible to all manner of breaches. Good practice does not permit this on a production server. You can mitigate this by having your server process accept only requests whose first item is a symbol (see below), which you should verify is the name of a function you have vetted for exposure.

An alternative format for remote execution is to apply the connection handle to a list of the form,

(f;arg1;arg2;...)

Here f is a client-side expression that evaluates to a map that will be applied on the server. It can be:

  • The value of, or variable associated to, a map on the client
  • The symbolic name of a map on the server or a string that evaluates to a map on the server.

We use the term map here to be any q expression that can be evaluated as function application – e.g., a list on an index, a dictionary on a key or a function on an argument. Commonly f is a function.

The remaining items arg1, arg2, ... are optional values sent along to the server for the evaluation. These are arguments when f is a function, indices when it is a list, or keys when it is a dictionary.

Application of the connection handle to such a list sends the list to the server where it is evaluated. Any result is sent back to the client, where it is presented as the result of the connection handle application. Simply applying the naked handle makes this sequence of steps synchronous, meaning that execution of the q session on the client blocks until the result of the server evaluation is returned.

Our examples will cover the case when f is of function type since that is most common. We first consider the case when f is a map on the client side. In this situation the function (or list, dictionary, etc.) is actually transported to the server along with the supplied arguments, where it is applied.

On the client in our two-session setup:

q)h:hopen`::5042 / client
q)h ({x*y}; 6; 7)
42
q)f:{x*y}
q)h (f; 6; 7)
42
q)hclose h

Before you get too enamored of this form, we point out the limitations that disqualify it from production use. First, global variables referred to in the transported function will need to be present remotely in the exact contexts in effect when the function was defined. This can be avoided by restricting f to be a pure function that does not refer to any global entities. This is easier than you might think since you can put all the referenced variables in a dictionary and send it along with f. This is a poor person's closure.
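
Here is a sketch of the dictionary approach just described. The names ctx, factor and offset are purely illustrative. So that it runs without a server, we evaluate the message list locally with value, which applies the first item to the rest in much the same way; with an open connection handle h you would send the same list with h msg.

```q
/ values the function needs, gathered into a dictionary
ctx:`factor`offset!(7; 0)
/ a pure function: everything it uses arrives through its parameters
f:{[d;x] d[`offset]+x*d`factor}
msg:(f; ctx; 6)
/ on the client with an open handle you would write:  h msg
value msg        / 42
```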

Important

Allowing a function to be sent to the server for remote execution is as dangerous as sending quoted q strings. The function can access resources on the server and instigate an attack. Good practice does not permit this in production environments.

The remaining format for remote execution can be made safe for production environments. The function to be executed remotely must already be defined on the server and you pass its name and arguments via the connection handle.

On the server,

q)g:{x*y} / server

Continuing the session on the client,

q)h (`g; 6; 7) / client
42

Now consider the case when the remote function performs an operation on a table and returns the result. This is the q analogue of a remote stored procedure. For example, suppose t and f are defined on the server as,

q)t:([] c1:`a`b`c; c2:1 2 3) / server
q)f:{[x] select c2 from t where c1=x}

Now "call" the function f remotely from the client.

q)h (`f; `b)
c2
--
2

The difference from SQL stored procedures is that the remote procedure call can be any q function on the server, making the full power of q available remotely.

11.8.4 Synchronous and Asynchronous Messages

The IPC in the previous sections was synchronous, meaning that upon application of the connection handle, the client process blocks, waiting for a result from the server before proceeding. The value returned from the server becomes the return value of the open handle application.

Under the covers, IPC is implemented as messages passed over an open connection between q processes. When the positive open handle is applied to an argument, the message passing is synchronous, meaning that the following steps occur in sequence.

  1. The client sends a message containing the argument(s) of the handle application to the server and waits for a return message.
  2. The server receives the message, interprets it as the appropriate function application and obtains the result.
  3. The server sends a message containing the result back to the client.
  4. The client receives the result and resumes execution from the point it left off.

When a client sends multiple messages to a server in synchronous message passing, the next message is not sent until the result of the previous message is received. Consequently the messages always arrive at the server in the order in which they are sent. Also, the results from the server arrive back at the client in the order in which the original messages were sent.

It is also possible to perform asynchronous IPC in q. In this case the message is sent to the server and execution on the client continues immediately. In particular, there is no return value from the server. This is useful to initiate a task on the server when you don't care about the result. For example, you could initiate a long running operation, or you could send a message that the server will route to other processes.

Use the negation of the open connection handle to send an asynchronous message to the server. Let's define an instrumented function on the server to demonstrate what is happening.

q)sq:{0N!x*x} / server

Now invoke sq asynchronously from the client.

q)neg[h] (`sq; 5) / client
q)

You will observe 25 displayed on the server console. Also, the client session returns immediately with no return value. The expression on the console actually has a nil value (::) that is suppressed by the console display.

Important

When sending asynchronous messages, always send an empty sync "chaser" message immediately before applying hclose to the open handle. If you do not do this, buffered messages may not be sent when the connection is closed.
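
The chaser pattern can be sketched as follows. We assume the server of this section is listening on port 5042 and has sq defined. To leave the handle h of the running examples untouched, the sketch uses a separate handle that we have named hc.

```q
/ assumes a server on port 5042 that defines sq
hc:hopen `::5042
neg[hc] (`sq; 5)     / async: returns immediately
hc ""                / empty sync chaser: returns once the server has handled what was sent before it
hclose hc
```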

In order to convince ourselves that the client actually does return immediately without waiting for a return from the server, we wrap the client expression in a function. Observe that the client continues with the next statement.

q){neg[h] (`sq; 5); 42}[] / client
42

Because a q session is single threaded by default, the server will process messages in the order in which they are received. However, in asynchronous messaging there is no guarantee that the messages arrive at the server in the order in which they are sent. It can be difficult to observe indeterminacy in simple examples, but you must assume that it will occur in practice.

11.8.5 Processing Messages

Assuming that you have passed the server either a function from the client side or the name of a function on the server side, the appropriate function is evaluated on the server. During evaluation, the communication handle of the remote process is available in the system variable .z.w ("who" called). For an asynchronous call, this can be used to send messages back to the client during the function application on the server.

Tip

Both the client and the server have connection handles when a connection between them is opened. These handles are assigned independently and their int values are not equal in general.

Here is a simple example showing how to use .z.w to send a message back to the client. On the server, we define a function that displays its received parameter and then asynchronously calls mycallback with the passed argument incremented.

q)f:{show "Received ",string x; neg[.z.w] (`mycallback; x+1)}

On the client we define mycallback to display its parameter on the console. Then we make an asynchronous call to the function f on the server with an argument of 42.

q)mycallback:{show "Returned ",string x;}

q)neg[h] (`f; 42)
q)"Returned 43"

The result is that "Received 42" is displayed on the server console and "Returned 43" is displayed on the client console. Congratulations! We have just implemented callbacks in q.

Tip

When performing asynchronous messaging, always use neg[.z.w] to ensure that all messages are asynchronous. Otherwise you will get a deadlock as each process waits for the other.

You can override the default behavior of message processing in q by assigning your own handler(s) to the appropriate system variables. Assign your function to the variable

.z.pg to trap and process synchronous messages
.z.ps to trap and process asynchronous messages.

The names end in g and s because synchronous processing has "get" semantics and asynchronous processing has "set" semantics.

In the following we set the asynchronous handler to a trivial function, essentially ignoring asynchronous calls.

On the server,

q).z.ps:{show "ignore remote call"} / server

On the client send an asynchronous message.

q)neg[h] "6*7" / client

This results in “ignore remote call” being displayed on the server console.

Now we set the synchronous handler to a function that only accepts "safe" remote calls by function name. It then performs a protected evaluation on the function with the arguments passed, thus ensuring that a failed application does not disrupt the server.

On the server,

q).z.pg:{$[-11h=type first x; .[value first x; 1_x; ::]; `unsupported]}

Now send synchronous messages from the client.

q)h (`sq; 5)
25
q)h (`sq; `5)
"type"
q)h "6*7"
`unsupported
q)h ({x*y};6;7)
`unsupported
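
The handler above accepts any symbol that names something on the server. A stricter variant checks the name against a list of vetted functions before evaluating. The following is a sketch to be loaded as a script; the names allowed and safePg are hypothetical, and sq is redefined without display so that the example stands alone. Because a handler is an ordinary function, it can be tried locally before being assigned to .z.pg.

```q
sq:{x*x}
allowed:enlist `sq                / hypothetical list of vetted function names
safePg:{[msg]
  fn:first msg;
  if[not -11h=type fn; :`unsupported];
  if[not fn in allowed; :`forbidden];
  .[value fn; 1_msg; {`error,`$x}] }
.z.pg:safePg
safePg (`sq; 5)        / 25
safePg (`exit; 0)      / `forbidden
safePg "6*7"           / `unsupported
```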

You can also specify handlers for

.z.po to be called upon connection open 
.z.pc to be called upon connection close.   

The connection handle of the sending process is passed as the lone argument to the functions assigned to .z.po and to .z.pc.

Note

The receiver of a synchronous call can respond asynchronously. For details look up "deferred response" or (-30!) on the KX site.

Here is a simple example that tracks connections and allows client processes to register callbacks with the server. Start a fresh q session on the server and open port 5042. Create a keyed table called Registry and define a function that can be invoked remotely to register a callback. Attach a handler to .z.po that initializes a dummy entry in Registry for the connection being opened and attach a handler to .z.pc to remove the record when a connection is closed.

q)Registry:([zw:`int$()] callback:`symbol$())

q)register:{[cb] `Registry upsert (.z.w; cb);} 

q).z.po:{`Registry upsert (x; `unregistered);} 
q).z.pc:{delete from `Registry where zw=x;}

Start a fresh q session on the client and connect to the server.

q)h:hopen`::5042 / client

We check that an item has been entered into Registry on the server.

q)Registry / server
zw| callback
--| ------------
6 | unregistered

Next we register the name of a callback function from the client. Note the asynchronous message.

q)neg[h] (`register; `mycallback) / client

Again we check Registry on the server and observe that our callback name has indeed been registered.

q)Registry / server
zw| callback
--| ----------
6 | mycallback

Finally, we close the connection on the client.

q)hclose h / client

And observe that the client has been automatically unregistered.

q)Registry
zw| callback
--| --------

11.8.6 Remote Queries

In this section, we demonstrate how to execute q-sql queries against a remote server. First, we splay a table to stand for a time series database. We use the mktrades script that we created in §9.3.1 to create a trades table with 1,000,000 rows and then splay it to disk.

q)trade:mktrades[`aapl`goog`ibm; 1000000]
q)(`:db/trade/) set .Q.en[`:db;] trade
`:db/trade/

Now start a fresh server process (the server), open a port, say 5042, and map the splayed trade table into memory. Check that the mapping succeeded by running a query.

q)\p 5042 / server
q)\l db

q)select from trade where dt=2025.01.01,sym=`ibm
dt         tm           sym qty  px
----------------------------------------
2025.01.01 00:00:05.754 ibm 6650 255.609
2025.01.01 00:00:20.148 ibm 2910 252.558
..

Leave the server process running and start another fresh process (the client), open a connection to the server and send the same query to the server for remote execution.

q)h:hopen`::5042 / client
q)h "select from trade where dt=2025.01.01,sym=`ibm"
dt         tm           sym qty  px
----------------------------------------
2025.01.01 00:00:05.754 ibm 6650 255.609
2025.01.01 00:00:20.148 ibm 2910 252.558
..

We have already pointed out that allowing remote execution of arbitrary strings is bad practice because it exposes the server to injection attacks. So here is a simplistic example of a "safe" function that can be used as a stored procedure. It takes a symbolic table name, a list of symbolic column names for the select phrase, and a date range for the where phrase. Enter on the server:

q)extract:{[tn;cnms;dtrng] ?[tn;enlist (within;`dt; dtrng);0b;cnms!cnms]}

Now on the client we (synchronously) call the stored procedure by name with appropriate arguments.

q)h (`extract;`trade;`dt`tm`sym`qty`px;2025.01.01 2025.01.02)
dt         tm           sym  qty  px
------------------------------------------
2025.01.01 00:00:01.530 goog 1530 159.5825
2025.01.01 00:00:05.754 ibm  6650 255.609
..

Tip

In an actual application you would validate the input parameters and wrap the core evaluation in protected evaluation to trap unanticipated errors. You would also want to implement an entitlements system – e.g., LDAP, Kerberos, keycloak, Azure AD, etc.
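
A minimal sketch of such validation follows, to be loaded as a script. It uses a three-row stand-in for the trade table so that it runs on its own. The names exposed and safeExtract, and the symbols returned on failure, are our own inventions rather than any q convention.

```q
/ stand-in for the splayed trade table
trade:([] dt:2025.01.01 2025.01.02 2025.01.03; sym:`ibm`aapl`goog; px:255.6 190.1 159.5)
extract:{[tn;cnms;dtrng] ?[tn;enlist (within;`dt; dtrng);0b;cnms!cnms]}
exposed:enlist `trade          / hypothetical list of tables clients may query
safeExtract:{[tn;cnms;dtrng]
  if[not -11h=type tn; :`bad_table_name];
  if[not tn in exposed; :`unknown_table];
  if[not all cnms in cols tn; :`unknown_column];
  if[not (14h=type dtrng) and 2=count dtrng; :`bad_date_range];
  .[extract; (tn;cnms;dtrng); {`error,`$x}] }
safeExtract[`trade; `dt`px; 2025.01.01 2025.01.02]     / two rows, columns dt and px
safeExtract[`secrets; `dt`px; 2025.01.01 2025.01.02]   / `unknown_table
```

Clients would then call `safeExtract rather than `extract, and the server would expose only the former.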

11.9 Special Handles

We have seen that q creates numeric handles that are applicable as functions when performing file I/O and IPC. Here we will describe other I/O channels that are available.

11.9.1 Permanent handles

There are three permanent system handles:

0 console
1 stdout
2 stderr

The basic routines to read from and write to the interactive console are exposed as handle 0. Applying 0 to a q expression evaluates it in the main thread of execution and displays the result to the console, resetting the user q) prompt.

q)0 "6*7"
42
q)
q)0 (*;6;7)
42
q)

Here is how to write a prompt to stdout without resetting the q) user prompt of the console, and then capture the entered response.

q)ans:{1 x;read0 0}"The answer to life, the universe and everything: "
The answer to life, the universe and everything: 42
q)ans
"42"

Here we write a message to stderr which is being captured separately using the \2 command.

q)\2 err.txt
2+`3 / err msg captured in stderr
2 "Vogons!" / write to stderr
2
\cat err.txt
"q)'type"
" [0] 2+`3"
"     ^"
"q)Vogons!q)"

You can also use handles -1 and -2 to write strings. See 11.4.2 for the specific behavior in this case.

11.9.2 Unix domain sockets

Unix domain sockets (UDS) are a mechanism for inter-process communication (IPC) on a single machine, enabling processes to exchange data. Unlike TCP/IP sockets, Unix domain sockets do not rely on network protocols, making them more efficient for local communication.

Note

Domain sockets are for communication between processes on the same server, not across a network. They can be much faster for low latency applications.

We shall provide a brief overview of the kdb+ capability, summarized from the release notes. A detailed example is beyond the scope of this tutorial. See the KX Documentation website for implementation details.

Setting the listening port with cmd line -p port or via the q command prompt \p port also creates a UDS on /tmp/kx.port. Local clients may connect via

hopen `:unix://*port*[:*user*:*password*]

Proper Unix user permissions are necessary to access the socket file. To specifically enable UDS and configure its path (if needed), you can use the QUDSPATH environment variable before starting kdb+ in bash.

QUDSPATH=/tmp/my_kdb_socket q -p *port*

Linux can use a virtual path (represented by an initial @) instead of an actual file system path. See the KX Documentation website for implementation details.

11.9.3 Named pipes

Unix named pipes are another mechanism for processes to exchange data without the overhead of network communication protocols. A named pipe creates a file handle which allows both the sender and receiver processes to access the same pipe simultaneously. Unlike domain sockets, named pipes can be used by processes on the same or different machines.

Because of their first-in, first-out behavior, named pipes are also known as FIFOs. We shall provide a brief overview of the kdb+ capability, summarized from the release notes. See the KX Documentation website for implementation details.

Here we create a simple file of bytes and then demonstrate how to read it through a named pipe. First we create a file of 100,000 bytes and verify its contents in the traditional manner.

q)h:hopen `:data/bytes.dat
q)h 100000#0x06072a
6i
q)hclose h
q)read1 `:data/bytes.dat
0x06072a06072a06072a06072a06072a06072a06072a06072a06072a06072a06072a06..
q)count read1 `:data/bytes.dat
100000

Now we open the file as a named pipe and "stream" it into a buffer using the default chunk size of 64K. The first read fills the entire buffer. The second read partially fills the buffer. The third read is empty.

q)h:hopen`:fifo://data/bytes.dat
q)count 0N!read1 h
0x06072a06072a06072a06072a06072a06072a06072a06072a06072a06072a06072a..
65536
q)count 0N!read1 h
0x072a06072a06072a06072a06072a06072a06072a06072a06072a06072a06072a06..
34464
q)count 0N!read1 h
`byte$()
0
q)hclose h

This is useful, for example, if we want to read a large compressed file without decompressing it all into memory. In the next example we create a csv file, compress it and then stream it into memory through a named pipe.

First we created a csv with pseudo trade records.

q)5#read0 `:data/t.csv
"msft,12:00:00.000,b,o,50,50"
"appl,12:00:00.000,a,o,60,60"
"ibm,12:00:00.000,b,o,70,70"
"msft,12:00:00.001,b,o,50,50"
"appl,12:00:00.001,a,o,60,60"

And then we had the OS zip it to data/t.zip. Now we can "stream" the compressed file through a named pipe as follows without forcing the entire file to be decompressed into memory or on disk.

q)t:([]sym:();ti:();ba:();ex:();qty:();px:())
q)system"rm -f fifo && mkfifo fifo"
q)system"unzip -p data/t.zip > fifo &"
q).Q.fps[{`t insert ("STCCJF";",")0:x}]`:fifo
q)5#t
sym  ti           ba ex qty px
------------------------------
msft 12:00:00.000 b  o  50  50
appl 12:00:00.000 a  o  60  60
ibm  12:00:00.000 b  o  70  70
msft 12:00:00.001 b  o  50  50
appl 12:00:00.001 a  o  60  60

See the KX site for more information on using named pipes.

11.10 Basic HTTP and WebSockets

We present the basic capabilities. See "WebSockets" on the KX Documentation website for more details.

11.10.1 HTTP Connections

When you open a port in a q session, by default that session serves http requests. To demonstrate this, start a q session and open a port, say 5042. Then bring up a relatively recent browser on the same machine (the author uses Chrome) and enter the following url,

http://localhost:5042/?6*7

You should see 42 in the browser page display.

You can trap HTTP GET and POST traffic by assigning functions to the system variables .z.ph and .z.pp respectively. The default handler for .z.ph is to evaluate the content of the first item of the passed argument.

Tip

There is no default handler for .z.pp.

Here is a simple example that duplicates the default GET processing and shows the two items of its list argument. Define the following handler on the server process. It stashes the default handler and replaces it with one that displays the two items of the input list and then passes the input message to the original handler.

q)zph:.z.ph / server
q).z.ph:{show x 0; show x 1; zph x}

Now enter the following from a browser on the same machine.

http://localhost:5042/?6*7

On the server you should see something like the following and the client will display 42.

q)"?6*7"
Host           | "localhost:5042"
Connection     | "keep-alive"
Cache-Control  | "max-age=0"
Accept         | "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
User-Agent     | "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36"
Accept-Encoding| "gzip, deflate, sdch"
Accept-Language| "en-US,en;q=0.8"

Rather than trap .z.ph directly, since q3.6 we can attach our handler to .h.val, which is invoked to process the actual content of the received HTTP message. In this way we can let q handle the wrapping and unwrapping of the content and response. We restore the default handler to .z.ph and then attach our processing to .h.val. Now enter the same url in the browser; you will see 6*7 on the server console and 42 in the browser.

q).z.ph:zph / server
q).h.val:{value 0N!x}
q)"6*7"

11.10.2 Basic Web Sockets

WebSockets is a network protocol that upgrades an initial HTTP handshake into a TCP/IP socket connection. It was initially used to enhance communication capability between browsers and web servers but it can be used for general client-server applications.

Once the WebSocket connection is established, the client or server can each message the other. In particular, this provides the capability for the server to push data to the client.

Important

As of this writing (July 2025) q implements only asynchronous messaging in Web Sockets.

In this section we show the basic mechanism for establishing a WebSockets connection between a browser and a q process acting as the server. We have used Chrome while developing the examples but other versions of major browsers should work as well.

Note

In the examples of this section we assume basic familiarity with HTML5 and Javascript.

We begin with an extremely simple html page with a button that, when clicked, displays the answer to life, the universe and everything. Save the following as a text file sample0.html in a location accessible to your browser.

<!doctype html>
<html>
<head>
<script>
          function sayN(n) {
                   document.getElementById('answer').textContent = n;
          }
</script>
</head>
<body>
 <h1 style='font-size:200px' id='answer'></h1>
 <button onclick='sayN(42)'>get the answer</button>
</body>
</html>

In our case we saved the file to

/users/jaborror/Q4M/pages/sample0.html

on the local drive, so we enter the following url in the browser:

file:///users/jaborror/Q4M/pages/sample0.html

You should see a page with a single button labeled "get the answer". Click the button and you will see the answer in a very large font.

Now we enhance this basic page to connect to a q process via WebSockets and retrieve the answer from q. Save the following script as sample1.html. We explain it below.

Note

For simplicity in the example, we have placed a copy of c.js in the pages directory. The current version can be downloaded from the KX github. You should modify this to reflect its location in your installation.

<!doctype html>
<html>
<head>
  <script src="c.js"></script>
  <script>
    var serverurl = "//localhost:5042/",
        c = connect(),
        ws;

    function connect() {
      if ("WebSocket" in window) {
        ws = new WebSocket("ws:" + serverurl);
        ws.binaryType = "arraybuffer";
        ws.onopen = function(e) {
          ws.send(serialize({ payload: "What is the meaning of life?" }));
        };
        ws.onclose = function(e) {};
        ws.onmessage = function(e) {
          sayN(deserialize(e.data));
        };
        ws.onerror = function(e) {
          window.alert("WS Error");
        };
      } else {
        alert("WebSockets not supported on your browser.");
      }
    }

    function sayN(n) {
      document.getElementById('answer').textContent = n;
    }
  </script>
</head>
<body>
  <h1 style='font-size:200px' id='answer'></h1>
</body>
</html>

This page first loads the script c.js, which supplies the serialize and deserialize functions used below. It is no longer required for using q WebSockets provided you are happy with a text-only format, e.g., JSON.

The script then defines Javascript variables

  • serverurl to hold the url of our q service
  • c to hold the connection object in case you need it (we don't)
  • ws to hold a WebSockets object.

The function connect() is where the WebSockets action happens.

  • It first tests to see if "WebSocket" is in the window, meaning that the browser supports WebSockets. If so, it makes the connection to the server; otherwise it displays an error alert.
  • The first step in the connection is to create a WebSocket object by connecting to the specified server url, and storing the result in ws.
  • Then set the binaryType field in ws to the value needed by the q sockets code.

Now we assign handlers for the main WebSockets events.

  • The open handler serializes (into q form) a Javascript object with a payload field and then sends it to the server. Consequently when a connection is opened, we immediately ask the server the meaning of life.
  • The close handler is empty.
  • The message handler deserializes the data field of the parameter e and applies the sayN function to display the result on the page.
  • The error handler displays an alert with the fixed text "WS Error".

The sayN function locates the answer element on the page and places the text of its argument there. Finally, the page body defines a simple HTML heading element with the id answer.
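The handler flow described above can be exercised without a browser or a q server. The sketch below is only an illustration under stated assumptions: attachHandlers and fakeSocket are hypothetical names, the stand-in socket merely records what is sent, and JSON.stringify and JSON.parse stand in for the serialize and deserialize functions of c.js.

```javascript
// attachHandlers is a hypothetical helper that mirrors the wiring in connect().
// serialize and deserialize are passed in so that stand-ins can replace c.js.
function attachHandlers(ws, serialize, deserialize, display) {
  ws.onopen = function () {
    ws.send(serialize({ payload: "What is the meaning of life?" }));
  };
  ws.onmessage = function (e) {
    display(deserialize(e.data));
  };
}

// Stand-in socket that records what is sent instead of using the network.
var sent = [];
var shown = [];
var fakeSocket = { send: function (msg) { sent.push(msg); } };

attachHandlers(fakeSocket, JSON.stringify, JSON.parse, function (n) { shown.push(n); });

fakeSocket.onopen({});                 // the browser would fire this on connect
fakeSocket.onmessage({ data: "42" });  // and this when the server replies

console.log(sent, shown);
```

Passing the conversion functions as arguments keeps the wiring independent of the wire format, which makes it easy to switch between the binary and text-only approaches.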

In contrast, the server side q code is blissfully short. Start a fresh q session, open port 5042 and set the WebSockets handler .z.ws to a function that will be invoked to handle WebSockets messages.

q)\p 5042
q).z.ws:{0N!-9!x; neg[.z.w] -8!42}

Now point the browser to,

file:///users/jaborror/Q4M/pages/sample1.html

and you will see the answer displayed on the page.

11.10.3 Pushing Data to the Browser

In ordinary web applications, the browser initiates interaction with the server. It sends an HTTP request to a specific url on the server and the server replies with the requested page or data. Each such interaction is self-contained and is synchronous in that the browser waits for the server response.

With WebSockets the client (browser in our case) initiates the connection, but once the WebSockets request for protocol upgrade is successful, the client and the server are on equal footing. Either side can send messages. Moreover, in the current q implementation of WebSockets all interaction is asynchronous. Given that most current browsers and the default q session are both single-threaded, you don't have to worry about races and deadlocks but you do have to set up callbacks.

In this section we demonstrate how the q server can push data to the browser, beginning with the browser script. The key point is that the onmessage handler will be called every time data is received, resulting in that data being displayed on the screen. We use the same sample1.html file as in the previous example to open the WebSockets connection and display the message(s) from the server.

Now we revise the server side code to push messages. Enter the following in the console of a fresh q session.

q)\p 5042
q)answer:42
q).z.ws:{`requestor set .z.w; system "t 1000"; neg[requestor] -8!answer;}
q).z.ts:{neg[requestor] -8!answer+:1;}

Here is what's happening in the q code.

  • First we open the port and initialize the answer variable.
  • In the WebSockets handler we stash the client .z.w value into the global requestor and start the system timer firing every 1000 milliseconds (see 13.1.10). Note that this only happens once the browser has connected and sent its first message. We then send the initial answer to the client.
  • Finally, we set the timer handler to send an asynchronous message containing the serialized value of answer after incrementing it.

Now point the browser once again to

file:///users/jaborror/Q4M/pages/sample1.html

and you will see the answer ticking every second on the page.
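To make the order of events concrete, here is a small model of the two q handlers written in JavaScript. It is a sketch only: makePusher, onMessage and onTick are hypothetical names, serialization and the real timer are ignored, and each tick is triggered by hand.

```javascript
// JavaScript model of the q handlers; makePusher and its fields are
// hypothetical names used only to trace the order of events.
function makePusher(send) {
  var answer = 42;
  return {
    // plays the role of .z.ws: reply with the current answer
    onMessage: function () { send(answer); },
    // plays the role of .z.ts: increment first, then send
    onTick: function () { answer += 1; send(answer); }
  };
}

var received = [];
var pusher = makePusher(function (n) { received.push(n); });

pusher.onMessage(); // the browser's question arrives
pusher.onTick();    // first timer tick
pusher.onTick();    // second timer tick

console.log(received);
```

The browser's onmessage handler would be called once for each value in received, in that order.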

11.11 HTTP and JSON

We have only scratched the surface of what can be done with web apps in q. You could, for example, extend our simple push demo to push realtime trades/quotes to the browser. There is an example that does this on the KX Documentation website. We recommend that you review it or better yet implement it yourself.

11.11.1 .h

There is a rich ecosystem around HTML and WebSockets on the KX Documentation website. If you are so inclined, the namespace .h contains many helper routines for building or modifying web pages in q. Even if you are not convinced, we urge you to browse there if you are doing web front ends for q apps.

The .h namespace contains routines for

  • marking up strings as HTML
  • converting data between various formats
  • composing HTTP responses
  • web-console display

A summary of these routines appears in the .h reference on the KX Documentation website.

11.11.2 .j

The .j namespace contains routines to facilitate the exchange of data between q and HTTP by interconverting q and JSON dictionaries. The JSON parsing in q is robust, meets standards and, of course, is fast.

The function .j.j converts a q entity to a JSON string. Here are some examples. The escaping of quotes makes things a bit hard to read.

q).j.j 42
"42"
q).j.j 1 2 3
"[1,2,3]"
q).j.j 1 2 3!`a`b`c
"{\"1\":\"a\",\"2\":\"b\",\"3\":\"c\"}"
q).j.j ([]c1:`a`b`c; c2:10 20 30)
"[{\"c1\":\"a\",\"c2\":10},{\"c1\":\"b\",\"c2\":20},{\"c1\":\"c\",\"c2\":30}]"

Conversely, the function .j.k converts a JSON string to a q dictionary.

Note

If the JSON string has multiple lines apply raze first to flatten it.

Here we have the simple JSON object

% cat life.json
{
    "code" : 42,
    "message" : "Life, the universe and everything"
}

We read it as text, flatten and convert to a q dictionary.

q)read0 `:scratch/life.json
,"{"
"    \"code\" : 42,"
"    \"message\" : \"Life, the universe and everything\""
,"}"
q).j.k raze read0 `:scratch/life.json
code   | 42f
message| "Life, the universe and everything"

Note

Converting from q to JSON and back will often not preserve q data types. In the first example, the literals 0 and 1 come back as floats after the round trip. In the second, symbols reincarnate as strings.

q).j.k 0N!.j.j `a`b!(0 1;("hello";"world"))
"{\"a\":[0,1],\"b\":[\"hello\",\"world\"]}"
a| 0 1
b| "hello" "world"
q)d:0N!.j.k 0N!.j.j `a`b!(0 1;("hello";"world"))
"{\"a\":[0,1],\"b\":[\"hello\",\"world\"]}"
`a`b!(0 1f;("hello";"world"))
q)type d[`a] 
9h 

q).j.k 0N!.j.j ([]a:42 43;b:`Life`After)  
"[{\"a\":42,\"b\":\"Life\"},{\"a\":43,\"b\":\"After\"}]" 
a  b       
---------- 
42 "Life"  
43 "After" 
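The loss of type information is a property of JSON itself rather than of q. As a rough illustration from the browser side (the variable names are ours), JavaScript produces the same text from its own values and, on parsing, sees only generic numbers and strings.

```javascript
// The JSON text that .j.j produced for the table example above.
var text = '[{"a":42,"b":"Life"},{"a":43,"b":"After"}]';

// JavaScript produces exactly the same text from its own values.
var fromJs = JSON.stringify([{ a: 42, b: "Life" }, { a: 43, b: "After" }]);
console.log(fromJs === text);

// Parsing yields plain numbers and strings; nothing in the text marks
// 42 as a long or Life as a symbol.
var parsed = JSON.parse(text);
console.log(typeof parsed[0].a, typeof parsed[0].b);
```

If the q types matter to the receiver, they have to be restored explicitly after parsing, since the JSON text cannot carry them.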

Since JSON doesn't know about q infinities, you need to take care when sending them to JSON. The non-float infinities are rendered as their underlying integer values by .j.j.

q).j.j -0W 0 0W
"[-9223372036854775807,0,9223372036854775807]"

The float infinities are rendered as "inf" by default.

q).j.j -0w 0 0w
"[-inf,0,inf]"

You can override this by using .j.jd, which takes a pair consisting of the data and an options dictionary. Passing the singleton dictionary ([null0w:1b]) causes float infinities to be rendered as null.

q).j.jd(-0w 0 0w; ([null0w:1b]))
"[null,0,null]"
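It is worth seeing how a JavaScript consumer treats each of these three renderings. The following sketch assumes the strings arrive exactly as shown above; tryParse is a hypothetical helper that returns either the parsed value or the name of the error raised.

```javascript
// Strings as produced by the .j.j and .j.jd calls above.
var floatInf = "[-inf,0,inf]";
var nullInf = "[null,0,null]";
var longInf = "[-9223372036854775807,0,9223372036854775807]";

// Returns the parsed value, or the error name if the text is not valid JSON.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return err.name;
  }
}

// inf is not a JSON token, so the default float rendering fails to parse.
console.log(tryParse(floatInf));

// The null rendering parses to an array containing nulls.
console.log(tryParse(nullInf));

// The integer rendering parses, but the value lies outside the range of
// integers that a JavaScript number can represent exactly.
console.log(Number.isSafeInteger(tryParse(longInf)[2]));
```

In other words, only the null form survives a standard JSON parser unchanged in meaning, which is a reason to prefer .j.jd when infinities may reach the browser.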

11.12 TLS and SSL

The q/kdb+ environment provides support for SSL and its replacement TLS to encrypt connections using the OpenSSL libraries. This subject is beyond the scope of this tutorial and we recommend that you refer to the topic "SSL/TLS" on the KX Documentation website for full details of their implementation. We post the following summary from that entry.

Suitability and restrictions

Currently we would recommend TLS be considered only for long-standing latency-insensitive, low-throughput connections. The overhead of hopen on localhost appears to be 40-50× that of a plain connection, and once handshaking is complete, the overhead is ~1.5× assuming your OpenSSL library can utilize AES-NI.