Skip to content

Datatypes in kdb+

Different types of data have different representations in q, corresponding to different internal representations in kdb+. This is of particular importance in the representation of vectors: Lists of atoms of the same type are called vectors (sometimes simple or homogenous lists) and have representations that vary by type.

Every object in q has a datatype, reported by the type keyword.

q)type 42 43 44     / vector of longs
7h
q)type (+)          / even an operator has a type
102h

Datatypes for a complete table.

Numbers

type       atom    vector       null   inf
------------------------------------------
short      42h     42 43 44h    0Nh    0Wh
int        42i     42 43 44i    0Ni    0Wi
long       42j     42 43 44j    0Nj    0Wj
           42      42 43 44     0N     0W
real       42e     42 43 44e    0Ne    0We
float      42f     42 43 44f    0n     0w
           42.     42 43 44.

The default integer type is long, so the j suffix can be omitted. A decimal point in a number is sufficient to denote a float, so is an alternative to the f suffix.

Nulls and infinities are typed as shown.

Text

Text data is represented either as char vectors or as symbols.

"a"                         / char atom
"quick brown fox"           / char vector
("quick";"brown";"fox")     / list of char vectors

Char vectors can be indexed and are mutable, but are known in q as strings.

q)s:"quick"                 / string
q)s[2]                      / indexing
"i"
q)s[2]:"a"                  / mutable
q)s
"quack"

Symbols are atomic and immutable. They are suitable for representing recurring values.

`screw                      / symbol atom
`screw`nail`screw           / symbol vector
q)count `screw`nail`screw   / symbols are atomic
3

The null string is " " and the null symbol is a single backtick `.

Dates and times

type       atom         vector                      null  inf
-------------------------------------------------------------
month      2020.01m     2020.01 2019.08m            0Nm
date       2020.01.01   2020.01.01 2020.01.02       0Nd   0Wd

minute     12:34         12:34 12:46                0Nu   0Wu
second     12:34:56      12:34:56 12:46:30          0Nv   0Wv
time       12:34:56.789  12:34:56.789 12:46:30.500  0Nt   0Wt
type       atom                                     null  inf
-------------------------------------------------------------
timestamp  2020.02.29D12:11:42.381000000            0Np   0Wp
datetime   2020.02.29T12:14:42.718                  0Nz   0Wz
timespan   0D00:05:14.659000000                     0Nn   0Wn

Datetime is deprecated. Prefer the nanosecond precision of timestamps.

Booleans

Booleans have the most compact vector representation in q.

q)"Many hands make light work."="a"
010000100000100000000000000b

GUIDs

In general, there should be no need for char vectors for IDs. IDs should be int, sym or guid. Guids are faster (much faster for =) than the 16-byte char vectors and take 2.5 times less storage (16 per instead of 40 per).

Use Deal to generate unique guids.

q)-2?0Ng
cf74afa1-6c49-8e11-d599-736eba641207 6080b044-aa79-2d30-62a4-34390a4c81d1

Casting, Datatypes
Cast, Tok, null, type
Q for Mortals §2.4 Basic Data Types – Atoms