Syntax¶
It is a privilege to learn a language,
a journey into the immediate
– Marilyn Hacker, “Learning Distances”
Elements¶
The elements of q are
 functions: operators, keywords, lambdas, and extensions
 data structures: atoms, lists, dictionaries, tables, expression lists, and parse trees
 attributes of data structures
 control words
 scripts
 environment variables
Applicable values
Lists, dictionaries, file and process handles, and functions of all kinds are all applicable values. An applicable value is a mapping.
A function maps its domains to its range. A list maps its indexes to its items. A dictionary maps its keys to its values.
Tokens¶
All the ASCII symbols have syntactic significance in q. Some denote functions, that is, actions to be taken; some denote nouns, which are acted on by functions; some denote iterators, which modify nouns and functions to produce new functions; some are grouped to form names and constants; and others are punctuation that bound and separate expressions and expression groups.
The term token is used to mean one or more characters that form a syntactic unit. For instance, the tokens in the expression 10.86 +/ LIST
are the constant 10.86
, the name LIST
, and the symbols +
and /
. The only tokens that can have more than one character are constants and names and the following.
<= / lessthanorequal
>= / greaterthanorequal
<> / notequal
:: / null, view, set global
/: / eachright
\: / eachleft
': / eachprior, eachparallel
When it is necessary to refer to the token to the left or right of another unit, terms like “immediately to the left” and “followed immediately by” mean that there are no spaces between the two tokens.
Nouns¶
All data are syntactically nouns. Data include
 atomic values
 collections of atomic values in lists
 lists of lists, and so on
 Atomic values

include character, integer, floatingpoint, and temporal values, as well as symbols, functions, dictionaries, and a special atom
::
, called null. All functions are atomic data.  List constants

include several forms for the empty list denoting the empty integer list, empty symbol list, and so on. (Oneitem lists are displayed using the comma to distinguish them from atoms, as in
,2
the oneitem list consisting of the single integer item 2.)  Numerical constants

(integer and floatingpoint) are denoted in the usual ways, with both decimal and exponential notation for floatingpoint numbers. A negative numerical constant is denoted by a minus sign immediately to the left of a positive numerical constant. Special atoms for numerical and temporal datatypes (e.g.
0W
and0N
) refer to infinities and “notanumber” (or “null” in database parlance) concepts.  Temporal constants

include timestamps, months, dates, datetimes, timespans, minutes, and seconds.
2017.01 / month
2017.01.18 / date
00:00:00.000000000 / timespan 00:00 / minute
00:00:00 / second
00:00:00.000 / time
 Character constants

An atomic character constant is denoted by a single character between double quote marks, as in
"a"
; more than one such character, or none, between double quotes denotes a list of characters.  Symbol constants

A symbol constant is denoted by a backquote to the left of a string of characters that form a valid name, as in
`a.b_2
. The string of characters can be empty; that is, backquote alone is a valid symbol constant. A symbol constant can also be formed for a string of characters that does not form a valid name by including the string in doublequotes with a backquote immediately to the left, as in`"ab!"
.  Dictionaries

are created from lists of a special form.
 Tables

A table is a list of dictionaries, all of which have the same keys. These keys comprise the names of the table columns.
 Functions

can be denoted in several ways; see below. Any notation for a function without its arguments denotes a constant function atom, such as
+
for the Add operator.
List notation¶
A sequence of expressions separated by semicolons and surrounded by left and right parentheses denotes a noun called a list. The expression for the list is called a list expression, and this manner of denoting a list is called list notation. For example:
(3 + 4; a _ b; 20.45)
denotes a list. The empty list is denoted by ()
, but otherwise at least one semicolon is required. When parentheses enclose only one expression they have the common mathematical meaning of bounding a subexpression within another expression.
For example, in
(a * b) + c
the product a * b
is formed first and its result is added to c
; the expression (a * b)
is not list notation.
An atom is not a oneitem list.
Oneitem lists are formed with the enlist
function, as in enlist"a"
and enlist 3.1416
.
q)3 /atom
3
q)enlist 3 / 1item list
,3
Vector notation¶
Lists in which all the items have the same datatype play an important role in kdb+. Q gives vector constants a special notation, which varies by datatype.
01110001b / boolean
"abcdefg" / character
`ibm`aapl`msft / symbol
Numeric and temporal vectors separate items with spaces and if necessary declare their type with a suffixed lowercase character.
2018.05 2018.07 2019.01m / month
2 3 4 5 6 / integer
2 3 4 5 6h / short integer
2 3 4 5 6i / xxxxx integer
2 3 4 5 6j / xxxxx integer
2 3 4 5.6 / float
2 3 4 5 6f / float
type  example 

numeric  42 43 44 
date  2012.09.15 2012.07.05 
char  "abc" 
boolean  0101b 
symbol  `ibm`att`ora 
Strings¶
Char vectors are also known as strings.
When \
is used inside character or string displays, it serves as an escape character.
\" 
double quote 
\NNN 
character with octal value NNN (3 digits) 
\\ 
backslash 
\n 
new line 
\r 
carriage return 
\t 
horizontal tab 
Table notation¶
A table can be written as a list: an expression list followed by one or more expressions.
An empty expression list indicates a simple table.
q)([]sym:`aapl`msft`goog;price:100 200 300)
sym price

aapl 100
msft 200
goog 300
The names assigned become the column names. The values assigned must conform: be lists of the same count, or atoms. The empty brackets indicate that the table is simple: it has no key.
The initial expression list can declare one or more columns as a key. The values of the key column/s of a table should be unique.
q)([names:`bob`carol`bob`alice;city:`NYC`CHI`SFO`SFO]; ages:42 39 51 44)
names city ages
 
bob NYC  42
carol CHI  39
bob SFO  51
alice SFO  44
Attributes¶
Attributes are metadata that apply to lists of special form. They are often used on a dictionary domain or a table column to reduce storage requirements or to speed retrieval.
Bracket notation¶
A sequence of expressions separated by semicolons and surrounded by left and right brackets ([
and ]
) denotes either the indexes of a list or the arguments of a function. The expression for the set of indexes or arguments is called an index expression or argument expression, and this manner of denoting a set of indexes or arguments is called bracket notation.
For example, m[0;0]
selects the element in the upper left corner of a matrix m
, and f[a;b;c]
evaluates the function f
with the three arguments a
, b
, and c
.
Unlike list notation, bracket notation does not require at least one semicolon; one expression between brackets – or none – will do.
Operators can also be evaluated with bracket notation. For example, +[a;b]
means the same as a + b
. All operators can be used infix.
Bracket pairs with nothing between them also have meaning; m[]
selects all items of a list m
and f[]
evaluates the noargument function f
.
Tip
The similarity of index and argument notation is not accidental.
Conditional evaluation and control statements¶
A sequence of expressions separated by semicolons and surrounded by left and right brackets ([
and ]
), where the left bracket is preceded immediately by a $
, denotes conditional evaluation.
If the word do
, if
, or while
appears instead of the $
then that word together with the sequence of expressions denotes a control statement.
The first line below shows conditional evaluation; the next three show control statements:
$[a;b;c]
do[a;b;c]
if[a;b;c]
while[a;b;c]
Control words are not functions and do not return results.
Function notation¶
A sequence of expressions separated by semicolons and surrounded by left and right braces ({
and }
) denotes a function. The expression for the function definition is called a function expression or lambda, and this manner of defining a function is called function or lambda notation.
The first expression in a function expression can be a signature: an argument expression of the form [name1;name2;…;nameN]
naming the arguments of the function. Like bracket notation, function notation does not require at least one semicolon; one expression (or none) between braces will do.
Within a script, a function may be defined across multiple lines.
Prefix, infix, postfix¶
There are various ways to apply a function to its argument/s.
f[x] / bracket notation
f x / prefix
x + y / infix
f\ / postfix
In the last example above, the iterator \
is applied postfix to the function f
, which appears immediately to the left of the iterator.
Iterators are the only functions that can be applied postfix.
Bracket and prefix notation are also used to apply a list to its indexes.
q)"abcdef" 1 0 3
"bad"
Postfix yields infix¶
An iterator applied to an applicable value derives a function. For example, Scan applied to Add derives the function Add Scan: +\
.
If the iterator is applied postfix, as it almost always is, the derived function has infix syntax.
Holds for all ranks
This rule holds regardless of the rank of the derived function.
For example, count'
is unary but has infix syntax.
A common consequence is that many derived functions must be parenthesized to be applied postfix. (See below.)
Prefix and vector notation¶
Index and argument notation (i.e. bracket notation) are similar.
Prefix expressions evaluate unary functions as in til 3
. This form of evaluation is permitted for any unary.
q){x  2} 5 3
3 1
This form can also be used for item selection.
q)(1; "a"; 3.5; `xyz) 2
3.5
Juxtaposition is also used in vector notation.
3.4 57 1.2e20
The items in vector notation bind more tightly than the tokens in function call and item selection. For example, {x  2} 5 6
is the function {x  2}
applied to the vector 5 6
, not the function {x  2}
applied to 5, followed by 6.
Parentheses around a function with infix syntax¶
Parentheses around a function with infix syntax capture it as a value and prevent it being parsed as an infix.
Add Scan +\
is variadic and has infix syntax.
q)+\[1 2 3 4 5] / unary
1 2 6 24 120
q)+\[1000;1 2 3 4 5] / binary
10001 10002 10006 10024 10120
q)1000+\1 2 3 4 5 / binary, applied infix
10001 10002 10006 10024 10120
Captured as a value by parentheses, it remains variadic, but can be applied postfix as a unary.
q)(+\)[1000;1 2 3 4 5] / binary
10001 10002 10006 10024 10120
q)(+\)1 2 3 4 5 / unary, applied postfix
1 2 6 24 120
Captured as a value, a function with infix syntax can be passed as an argument to another function.
q)(+) scan 1 2 3 4 5 / + is binary and infix
1 2 6 24 120
q)n:("the ";("quick ";"brown ";("fox ";"jumps ";"over ");"the ");("lazy ";"dog."))
q)(,/) over n / ,/ is variadic and infix
"the quick brown fox jumps over the lazy dog."
The parentheses above are not necessary for functions without infix syntax.
q)raze over n
"the quick brown fox jumps over the lazy dog."
q){,/[x]}over n
"the quick brown fox jumps over the lazy dog."
Compound expressions¶
Function expressions, index expressions, argument expressions and list expressions are collectively referred to as compound expressions.
Empty expressions¶
An empty expression occurs in a compound expression wherever the place of an individual expression is either empty or all blanks. For example, the second and fourth expressions in the list expression (a+b;;cd;)
are empty expressions. Empty expressions in both list expressions and function expressions actually represent a special atomic value called null.
Colon¶
The colon has several uses. The principal use is denoting assignment. It can appear with a name to its left and a noun to its right, or a name followed by an index expression to its left and a noun to its right, as in x:y
and x[i]:y
. The former assigns the value of y
to x
; the latter to x
at indexes i
.
The colon can also have a primitive operator immediately to its left, with a name (or name and index expression) to the left of that, as in x+:y
and x[i],:y
. This is known as assignment through the operator. For operator f
, the expressions x f:y
and x:x f y
are equivalent.
A pair of colons with a name to its left and an expression on the right
 within a function expression, denotes global assignment, that is, assignment to a global name (
{… ; x::3 ; …}
)  outside a function expression, defines a view
The functions associated with I/O and interprocess communication are denoted by a colon following a digit, as in 0:
and 1:
.
A colon used as a unary in a function expression, as in :r
, means return from the function with the result r
.
The q operators are all binary functions.
They inherit unary forms from k, denoted by a colon suffix, e.g. (#:
).
Use of these forms in q programs is deprecated.
Iterators¶
Iterators are higherorder operators. Their arguments are applicable values (functions, process handles, lists, and dictionaries) and their results are derived functions that iterate the application of the value.
Three symbols, and three symbol pairs, denote iterators:
token  semantics 

' 
Case and Each 
': 
Each Prior, Each Parallel 
/: and \: 
Each Right and Each Left 
/ and \ 
Converge, Do, While, Reduce 
Any of these in combination with the value immediately to its left, derives a new function.
The derived function is a variant of the value modified by the iterator.
For example, +
is Add and +/
is sum.
q)(+/)1 2 3 4 / sum the list 1 2 3 4
10
q)16 + 1 2 3 4 / sum the list with starting value 16
26
Any notation for a derived function without its arguments (e.g. +/
) denotes a constant function atom.
Application for how to apply iterators
Names and namespaces¶
Names consist of the upper and lowercase alphabetic characters, the numeric characters, dot (.
) and underscore (_
). The first character in a name cannot be numeric or the underscore.
Underscores in names
While q permits the use of underscores in names, this usage is strongly deprecated because it is easily confused with Drop.
q)foo_bar:42
q)foo:3
q)bar:til 6
Is foo_bar
now 42
or 3 4 5
?
A name is unique in its namespace. A kdb+ session has a default namespace, and child namespaces, nested arbitrarily deep. This hierarchy is known as the Ktree. Namespaces are identified by a leading dot in their names.
Kdb+ includes namespaces .h
, .j
, .q
, .Q
, and .z
.
(All namespaces with onecharacter names are reserved for use by Kx.)
Names with dots are compound names, and the segments between dots are simple names. All simple names in a compound name have meaning relative to the Ktree, and the dots denote the Ktree relationships among them. Two dots cannot occur together in a name. Compound names beginning with a dot are called absolute names, and all others are relative names.
Iterator composition¶
A derived function is composed by any string of iterators with an applicable value to the left and no spaces between any of the iterator glyphs or between the value and the leftmost iterator glyph. For example, +\/:\:
composes a wellformed function. The meaning of such a sequence of symbols is understood from left to right. The leftmost iterator (\
) modifies the operator (+
) to create a new function. The next iterator to the right of that one (/:
) modifies the new function to create another new function, and so on, all the way to the iterator at the right end.
Projecting the left argument of an operator¶
If the left argument of an operator is present but the right argument is not, the argument and operator symbol together denote a projection. For example, 3 +
denotes the unary function “3 plus”, which in the expression (3 +) 4
is applied to 4 to give 7.
Precedence and order of evaluation¶
All functions in expressions have the same precedence, and with the exception of certain compound expressions the order of evaluation is strictly right to left.
a * b +c
is a*(b+c)
, not (a*b)+c
.
This rule applies to each expression within a compound expression and, other than the exceptions noted below, to the set of expressions as well. That is, the rightmost expression is evaluated first, then the one to its left, and so on to the leftmost one.
For example, in the following pair of expressions, the first one assigns the value 10 to x
. In the second one, the rightmost expression uses the value of x
assigned above; the center expression assigns the value 20 to x
, and that value is used in the leftmost expression:
q)x: 10
q)(x + 5; x: 20; x  5)
25 20 5
The sets of expressions in index expressions and argument expressions are also evaluated from right to left. However, in function expressions, conditional evaluations, and control statements the sets of expressions are evaluated left to right.
q)f:{a : 10; : x + a; a : 20}
q)f[5]
15
The reason for this order of evaluation is that the function f
written on one line above is identical to:
f:{
a : 10;
:x+ a;
a : 20 }
It would be neither intuitive nor suitable behavior to have functions executed from the bottom up. (Note that in the context of function expressions, unary colon is Return.)
Multiline expressions¶
Individual expressions can occupy more than one line in a script. Expressions can be broken after the semicolons that separate the individual expressions within compound expressions; it is necessary only to indent the continuation with one or more spaces. For example:
(a + b;
;
c  d)
is the 3item list (a+b;;cd)
.
Note that whenever a set of expressions is evaluated left to right, such as those in a function expression, if those expressions occupy more than one line then the lines are evaluated from top to bottom.
Spaces¶
Any number of spaces are usually permitted between tokens in expressions, and usually the spaces are not required. The exceptions are:
 No spaces are permitted between the symbols
'
and:
when denoting the iterator':
\
and:
when denoting the iterator\:
/
and:
when denoting the iterator/:
 a digit and
:
when denoting a function such as0:
:
and:
for assignments of the formname :: value
 No spaces are permitted between an iterator glyph and the value or iterator symbol to its left.
 No spaces are permitted between an operator glyph and a colon to its right whose purpose is to denote assignment.
 If a
/
is meant to denote the left end of a comment then it must be preceded by a blank (or newline); otherwise it will be taken to be part of an iterator.  Both the underscore character (
_
) and dot character (.
) denote operators and can also be part of a name. The default choice is part of a name. A space is therefore required between an underscore or dot and a name to its left or right when denoting a function.  At least one space is required between neighboring numeric constants in vector notation.
 A minus sign (

) denotes both an operator and part of the format of negative constants. A minus sign is part of a negative constant if it is next to a positive constant and there are no spaces between, except that a minus sign is always considered to be the function if the token to the left is a name, a constant, a right parenthesis or a right bracket, and there is no space between that token and the minus sign. The following examples illustrate the various cases:
x1 / x minus 1
x 1 / x applied to 1
3.51 / 3.5 minus 1
3.5 1 / numeric list with two elements
x[1]1 / x[1] minus 1
(a+b) 1 / (a+b) minus 1
Comments¶
Line, trailing and multiline comments are ignored by the interpreter
q)/Oh what a lovely day
q)2+2 /I know this one
4
unless embedded within a string or preceded by a system command.
q)count"2/3"
3
q)\l /data/files
As first and only nonwhitespace char on a line:
/
starts a multiline comment\
terminates a multiline comment or, if not in a comment, comments to end of script
/
Oh what a beautiful morning
Oh what a wonderful day
\
a:42
\
ignore this and what follows
the restroom at the end of the universe
Special constructs¶
Backslash, colon and singlequote (/ \ : '
) all have special meanings outside ordinary expressions, denoting system commands and debugging controls.