10. Execution Control

10.0 Overview

Basic function application in q provides sequential evaluation of a series of expressions. In this chapter we demonstrate how to achieve non-sequential execution in q.

10.1 Control Flow

When writing vector operations in q, the cleanest code and best performance are obtained by avoiding loops and conditional execution. For those times when you simply must write iffy or loopy code, q has versions of the usual constructs.

Warning

Control flow constructs in this section involve branching in the byte code generated by the q interpreter. The offset of the branch destination is limited (currently to 255 byte codes), which means that the sequence of q expressions that can be contained in any part of $, ?, if, do, or while must be short. At some point, insertion of one additional statement will break the camel’s back, resulting in a 'branch error. This is q's way of rejecting bloated code. In this situation, factor code blocks into separate functions. Better yet, restructure your code.

10.1.1 Basic Conditional Evaluation

Languages of C heritage have a form of in-line if, called conditional evaluation, of the form,

expr_cond ? expr_true : expr_false

where expr_cond is an expression that evaluates to a Boolean (or int in C and C++). The result of the expression is expr_true when expr_cond is true (or non-zero) and expr_false otherwise.

The same effect can be achieved in q using the ternary overload of $.

$[expr_cond; expr_true; expr_false]

Here expr_cond is an expression that evaluates to a boolean atom or, analogous to C, to an atom of any type whose underlying value is an integer. The result of the conditional is the evaluation of expr_true when expr_cond is not zero and of expr_false when it is zero.

q)$[1b;42;9*6]
42
q)$[0b;42;9*6]
_
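As a small illustration of the point that expr_cond need not be a boolean, the following sketch uses integer-valued conditions. The variable n is a name assumed only for this example.

```q
/ n is a name assumed for this illustration
n:3
$[n; `nonzero; `zero]          / -> `nonzero
$[n-3; `nonzero; `zero]        / -> `zero
$[count "abc"; `some; `none]   / -> `some
```

Any non-zero integer selects the second argument; only zero selects the third.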

Tip

The conditional $ is a function that always returns a value. It is good practice for expr_true and expr_false to have the same type. This is not enforced in q as it is in statically-typed functional languages.

The brackets in any q conditional do not create lexical scope. This means that variables created within the body exist in the same scope as the conditional. For example, in a fresh q session the variable a in the following is a global that persists outside the conditional.

q)a
'a
q)$[1b;a:42;a:43]
42
q)a
42

Although evaluation of function arguments in q is eager, evaluation of the expressions in the conditional is short circuited, meaning that only the one selected for return is evaluated. Again in a fresh q session,

q)a
'a
q)b
'b
q)$[1b;a:42;b:43]
42
q)a
42
q)b
'b

Observe that a test for zero in expr_cond is redundant: remove that test and reverse the order of the second and third arguments.

q)z:0
q)$[z=0;1.1;-1.1]
1.1
q)$[z;-1.1;1.1] / equivalent to previous
1.1

In contrast with earlier versions of q, some null values are now acceptable for expr_cond. Using such a null as the condition gives the same result as testing it with the keyword null.

q)v:0N
q)$[v;`isnull;`notnull]
`isnull
q)$[null v;`isnull;`notnull]
_

Tip

Float nulls do not work so the above is probably an accident and you should not count on it.

10.1.2 Extended Conditional Evaluation

In languages of C heritage, the if-else statement has the form,

if (expr_cond) {
    statement_true1;
    .
    .
    .
}

else {
    statement_false1;
    .
    .
    .
}

where expr_cond is an expression that evaluates to a boolean (or int in C and C++). If the expression expr_cond is true (i.e., non-zero) the first sequence of statements in braces is executed; otherwise, the second sequence of statements in braces is executed.

A similar effect can be achieved in q using an extended form of conditional evaluation with $.

$[expr_cond; [expr_true1; …]; [expr_false1; …]]

where expr_cond is an expression as in the basic conditional. When expr_cond evaluates to non-zero, the first bracketed sequence of expressions is evaluated in left-to-right order; otherwise, the second bracketed sequence of expressions is evaluated.

q)v:42
q)$[v=42; [a:6;b:7;`Everything]; [a:`Life;b:`the;c:`Universe;a,b,c]]
`Everything
q)$[v=43; [a:6;b:7;`everything]; [a:`Life;b:`the;c:`Universe;a,b,c]]
`Life`the`Universe

The extended forms of the conditional are still functions with return values. If you are using them strictly for side effects, you are writing imperative, non-vector code and should consider VBA as an alternative.

Languages of C heritage have a cascading form of if-else in which multiple tests can be made,

if (expr_cond1) {
    statement_true11;
    .
    .
    .
}
else if (expr_condn) {
    statement_truen1;
    .
    .
    .
}
.
.
.
else {
    statement_false;
    .
    .
    .
}

In this construction, the expr_condn are evaluated consecutively until one is true (non-zero), at which point the associated block of statements is executed and the statement is complete. If none of the expressions passes, the final block of statements, called the default case, is executed.

A similar effect can be achieved in q with another extended form of conditional execution.

$[expr_cond1;expr_true1; …;expr_condn;expr_truen;expr_false]

In this form, the conditional expressions are evaluated consecutively until one is non-zero, at which point the associated expr_true is evaluated and its result is returned. If none of the conditional expressions evaluates to non-zero, expr_false is evaluated and its result is returned. Observe that expr_false is the only expression that is not part of a pair, as it has no guarding conditional expression.

Tip

Any condition other than the first is only evaluated if all those prior to it have evaluated to zero. Otherwise put, a condition evaluating to non-zero short-circuits the evaluation of subsequent ones.


q)a:0
q)$[a=0;`zero; a>0;`pos; `neg]
`zero
q)a:42
q)$[a=0;`zero; a>0;`pos; `neg]
_
q)a:-42
q)$[a=0;`zero; a>0;`pos; `neg]
_

Finally, the previous extended form of conditional execution can be further extended by substituting a bracketed sequence of expressions for any expr_true or expr_false.

$[expr_cond1;[expr_true11; …]; …;
    expr_condn;[expr_truen1; …];
    [expr_false1; …]]

If you use this, have the decency to align your code properly so that q coders can identify it as bogus q at a glance.

After a brief Zen meditation, you realize that you can implement “switch” with a dictionary.
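One way the dictionary idea might look is sketched below: the keys play the role of the case labels and the values are the functions to run. The names ops and calc are hypothetical, chosen only for this illustration, and the explicit membership test stands in for the default case.

```q
/ ops and calc are hypothetical names used only for this sketch
ops:`add`sub`mul!({x+y};{x-y};{x*y})
/ look up the function by key; signal an error for an unknown key
calc:{[op;a;b] $[op in key ops; ops[op][a;b]; '"unknown op"]}
calc[`add;6;7]   / -> 13
calc[`mul;6;7]   / -> 42
```

Adding a case is then a matter of adding a dictionary entry rather than another condition in a cascade.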

10.1.3 Vector Conditional Evaluation

Ternary vector-conditional evaluation ? has the form,

?[v_b;expr_true;expr_false]

where v_b is a simple boolean list and expr_true and expr_false are of the same type and are either atoms or vectors that conform to v_b. The result conforms to v_b and selects from expr_true in positions where v_b is 1b and expr_false in positions where v_b has 0b. All arguments of vector-conditional are fully executed. In other words, there is no short circuiting of evaluation.

The following example replaces the items of a list that are multiples of 3 with 42.

q)L:til 10
q)?[0<>L mod 3; L; 42]
42 1 2 42 4 5 42 7 8 42
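Two further variations, offered as a sketch: the first uses an atom of matching type (the long null) for the rejected positions, and the second uses vectors that conform to the boolean list in both branches.

```q
L:til 10
/ keep items greater than 5 and blank out the rest with the long null
?[L>5; L; 0N]            / -> 0N 0N 0N 0N 0N 0N 6 7 8 9
/ both branches may be vectors that conform to the boolean list
?[0=L mod 2; L; neg L]   / -> 0 -1 2 -3 4 -5 6 -7 8 -9
```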

Vector Conditional is especially useful with table columns.

q)t:([] c1:1.1 2.2 3.3; c2:10 20 30; c3:100 200 300)
q)update mix:?[c1>2.0; c3; c2] from t
_

There are no extended forms of Vector Conditional. You can get a cascading effect by nesting vector conditional expressions.

With t as above,

q)update band:?[c2 within 5 15; 1; ?[c2 within 16 25; 2; 3]] from t
_

10.1.4 `if`

The imperative if statement conditionally evaluates a sequence of expressions. It is not a function and does not return a value. It has the form,

if[expr_cond;expr₁; …;expr_n]

The expr_cond is evaluated and if it is non-zero the expressions expr₁ thru expr_n are evaluated in left-to-right order. As with other conditionals, the brackets do not create lexical scope, so variables defined in the body exist in the same scope as the if.

There is no “else” to go with if. Should you find that this cramps your coding style, please see the previous recommendation about VBA.

Here is an example that creates two global variables and modifies one.

q)a:42
q)b:98.6
q)if[a=42;x:6;y:7;b:a*b]
q)x
6
q)y
_
q)b
_

Well-written q code rarely needs if. One example of legitimate use is pre-checking function arguments to abort execution for bad values.

10.1.5 `do`

The imperative do statement allows repeated execution of a block of statements. It has the form,

do[expr_count;expr₁; …;expr_n]

where expr_count must evaluate to a non-negative integer. The expressions expr₁ thru expr_n are evaluated expr_count times in left-to-right order. Note that do is a statement, not a function, and does not have an explicit result.

The following expression is a loopy computation of n factorial. It iterates n - 1 times, decrementing the factor f on each pass.

q)n:5
q)do[-1+f:r:n; r*:f-:1] / do not do this!
q)r
120

The best recommendation about usage of do is: Don’t!
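For comparison, here is a loop-free sketch of the same factorial using the built-in prd applied to the list 1 through n.

```q
n:5
prd 1+til n    / -> 120
/ running products give every factorial from 1! to n!
prds 1+til n   / -> 1 2 6 24 120
```

No counter and no accumulator variable are needed, which is the point of the recommendation.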

The only legitimate use of do that the author has encountered is to time the execution of a q expression that runs too quickly for the timer to get an accurate reading, but this has been obviated by the enhanced \t command.

q)\t v*v:til 1000000
15
q)\t do[100; v*v:til 1000000]
677
q)\t:100 v*v:til 1000000
_

10.1.6 `while`

The imperative while statement is an iterator of the form,

while[expr_cond;expr₁; …;expr_n]

where expr_cond is evaluated and the expressions expr₁ thru expr_n are evaluated repeatedly in left-to-right order as long as expr_cond is non-zero. The while statement is not a function, does not have an explicit result and does not introduce lexical scope.

The author has never used while in actual code.

Here is loopy factorial redone with while.

q)f:r:n:5
q)while[f-:1;r*:f] / do not do this either!
q)r
120
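When the number of iterations is not known in advance, the While form of the iterator Over is one functional alternative to the while statement. The sketch below is an illustration only: it keeps doubling a value as long as a test function returns true.

```q
/ left argument in the brackets is the test, right argument is the start value
{2*x}/[{x<1000}; 2]   / -> 1024
```

The iteration stops the first time the test fails, so the result is the first power of 2 that is not less than 1000.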

10.1.7 Return and Signal

Normal function application evaluates each expression in the function body in sequence and terminates after the last one. There are two mechanisms for ending the execution early: one indicates successful completion and the other signals abrupt termination.

To terminate function application immediately and return a normal value, use an empty assignment – that is, : with the return value to its right and no variable to its left. For example, in the following instrumented function, application is terminated and the result is returned in the fourth expression. The final expression is never evaluated.

q)f:{0N!"Begin"; a:x; b:y; :a*b; "End"}
q)f[6;7]
"Begin"
42

To abort function execution immediately with an exception, use Signal, which is single-quote ', with an error message to its right. The error message can be provided as a symbol or string.

You too can return pithy error messages. For example, in the following function, execution will be aborted in the fourth expression. The final expression that assigns c is never evaluated.

q)g:{0N!"Begin"; a:x; b:y; '"End"; c:b}
q)g[6;7]
"Begin"
'End

A function issuing a signal causes the calling routine to fail and this will ripple all the way up the call chain unless protected evaluation is used to trap the exception. See the next section for details on protected evaluation.

A legitimate use of the if statement is to terminate execution with an exception. The following snippet would typically reside inside a function body.

{
...
if[a<50; '"Bad a"];
...
}
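A complete, runnable version of this pattern might look like the following sketch. The function safediv is hypothetical, written only for this illustration; the last line uses protected evaluation, described in the next section, to show the message that is signaled.

```q
/ safediv is a hypothetical function written for this sketch
safediv:{[n;d]
  if[d=0; '"zero divisor"];   / abort early on a bad argument
  n%d}
safediv[6;3]                        / -> 2f
@[safediv[6;]; 0; {"caught: ",x}]   / -> "caught: zero divisor"
```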

10.1.8 Protected Evaluation

Languages of C++ heritage have the concept of protected execution using try-catch. The idea is that an unexpected condition arising from any statement enclosed in the try portion does not abort the program. Instead, control transfers to the catch block, where the exception can be handled or passed up to the caller. This mechanism allows the call stack to be unwound gracefully.

Q provides a similar capability using ternary forms of Apply and Apply At. Ternary @ is used for unary functions and ternary . is used for multivalent functions. The syntax is the same for both.

@[f_mon;a;expr_fail]

.[f_mul;L_args;expr_fail]

Here f_mon is a unary function, a is a single argument, f_mul is a multivalent function, L_args is a list of arguments, and expr_fail is an expression or function. In both forms, the function is applied to its argument(s). Upon successful application, protected evaluation returns the result of the application. Should an exception arise, expr_fail is applied to the resulting error string.

You can use protected evaluation to log error messages from exceptions that would otherwise crash your program.

If the application of expr_fail results in an exception, the protected call itself will fail.

Here is a simple example of using protected evaluation. Suppose a user wishes to enter dynamic q expressions. You could place the expression in a string and pass it to value, which is essentially the q interpreter. This is a huge security exposure and you should never do this in such a naïve fashion in a production system. Nonetheless, we could do it in a learning environment. The problem is that if the user types an invalid q expression, the generated exception will cause your application to halt. To avoid this, apply value with protected evaluation.

q)s:"6*7"
q)@[value; s; show]
42
q)s:"6*`7"
q)@[value; s; show]
"type"

Ternary . provides similar protected evaluation for multivalent functions.

q)prod:{x*y}
q).[prod; (6;7); show]
42
q).[prod; (6;`7); show]
"type"
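The third argument need not be show. The sketch below, which uses a hypothetical function inc, contrasts a constant fallback value with a handler function that logs the error string and then returns a default.

```q
/ inc is a hypothetical unary function that fails on non-numeric input
inc:{1+x}
/ a constant as the third argument is returned as-is on failure
@[inc; `a; 0N]                       / -> 0N
/ a function as the third argument receives the error message as a string
@[inc; `a; {-1 "trapped: ",x; 0N}]   / prints trapped: type and returns 0N
.[{x%y}; (1;`a); {"failed: ",x}]     / -> "failed: type"
```

Returning a value of the same type as a successful result keeps the caller's code simple.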

10.2 Debugging

Debugging in q harkens back to the bad old days, before the advent of debuggers and integrated development environments, when “real men” debugged by inserting println in their code. The q gods don’t give debugging much consideration because their code always runs correctly the first time. There is no debugger, nor any notion of break points or tracing execution. Things aren’t quite as bad as inserting print statements, but we mortals are certainly left to our own devices.

When expression evaluation fails, the console displays a single quote (') followed by a short and often cryptic error message. This is accompanied by a dump of the failed operation and the offending values. Many errors manifest as either 'type or 'length, indicating an incompatibility in function arguments somewhere in the bowels of q. The challenge is to discover the root cause of the error.

Update

Debugging has become a bit easier since V3.5. Ed.

Since q is interpreted, at any point in the execution of a program the entire runtime environment is accessible. If you think about it, an integrated debugger for a compiled language essentially simulates the interpreter environment. The debugger must go to great lengths to create the environment that is readily available in an interpreter.

Let’s get real. Say you want to set a breakpoint in your q program. Easy: just insert a line that you know will fail – use an undefined name. For example, to pause execution before the last expression in the function f below, insert any undefined name there – “break” is commonly used.

q)f:{a:x*x; b:y*y; a+b}
q)f:{a:x*x; b:y*y; break; a+b}
q)f[3;4]
{a:x*x; b:y*y; break; a+b}
'break
q))

Make sure the name you choose is not defined in local or global scope.

In a q session, you can tell that execution has been suspended by the extra parenthesis at the q prompt. At this point, you have the full power of the q console available to inspect the current state of your program.

q))x
3
q))y
4
q))a
_
q))b
_
q))a+b
_

Tip

Once you have finished your inspection and debugging, you should either return from the function with a value or abort execution using \. In either case, the extra ) at the q prompt will disappear.


q)):a+b
25

A slightly more sophisticated technique allows you to continue execution after the break. Here we cause the break one level lower. A forced return entered at the console completes the breakpoint execution and continues execution of f.

q)breakpoint:{break}
q)f:{a:x*x; b:y*y; breakpoint[]; a+b}
q)f[3;4]
{break}
'break
q)):0 / arbitrary value is not used
25

You can accomplish single-step tracing after suspended execution by copy/paste of one line at a time into the console. Admittedly this is pretty primitive but if you write well-factored q code there shouldn’t be too many lines to copy. This works well enough in practice that it is not a hindrance to finding and correcting bugs.

You will spend much more time trying to figure out why your q code is not doing what you want than the time spent doing manual debugging.

In a technique passed on by Simon Garland, you can get a useful display of relevant information when a function is suspended. Define a function, say zs, as follows,

q)zs:{`d`P`L`G`D!(system"d"),v[1 2 3],enlist last v:value x}

This function takes another function as its argument and returns a dictionary with entries for the current directory, function parameters, local variables referenced, global variables referenced and the function definition. We demonstrate with a trivial example.

q)b:7
q)f:{a:6; x+a*b}
q)f[`100] / this is an error
{a:6; x+a*b}
'type
+
`100
42
q))show zs f
d| `.
P| ,`x
L| ,`a
G| ``b
D| "{a:6; x+a*b}"

This error dump is actually easy to read. The first line is the definition of the function that has failed. Following that is the error message 'type. The operation that generated the error is + and the actual arguments follow. Then our zs gives a nice tabulation of the current context (the root namespace in this case), the parameters of f, the local variables of f, the global variables of f and finally its definition.

A good place to start with zs when you have suspended execution is with the system variable .z.s that holds the suspended function itself.

…
q))zs .z.s
_

10.3 Scripts

A script is a q program stored in a text file with an extension of .q (or .k if you are writing k code). A script can contain any q expressions or commands. The contents of the script are parsed and evaluated sequentially from top to bottom. Global entities created during execution of the script exist in the workspace after the script is loaded and executed.

10.3.1 Creating and Loading a Script

You can create a script in any text editor and save it with a .q extension. For example, enter the following script that creates the trades table from Chapter 9.

mktrades:{[tickers; sz]
  dt:2015.01.01+sz?31;
  tm:sz?24:00:00.000;
  sym:sz?tickers;
  qty:10*1+sz?1000;
  px:90.0+(sz?2001)%100;
  t:([] dt; tm; sym; qty; px);
  t:`dt`tm xasc t;
  t:update px:6*px from t where sym=`goog;
  t:update px:2*px from t where sym=`ibm;
  t}

trades:mktrades[`aapl`goog`ibm; 1000000]

Now issue the following command to load and execute the script.

q)\l /q4m/trades.q

Verify that the trades table has been created and the records have been inserted.

q)count trades
_

A script can be loaded at any time during a session using the \l command, called load. The load command can be executed programmatically using system. See Chapter 12 for more on commands.
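As a sketch of loading programmatically, the following writes a one-line script and then loads it with system. The path /tmp/q4m_demo.q and the variable answer are assumptions made for this illustration; substitute a location that is writable on your system.

```q
/ the path /tmp/q4m_demo.q is an assumption for this sketch; adjust for your system
`:/tmp/q4m_demo.q 0: enlist "answer:6*7"   / write a one-line script
system "l /tmp/q4m_demo.q"                 / same effect as \l /tmp/q4m_demo.q
answer                                     / -> 42
```

Because the command is an ordinary string, the script name can be computed at run time.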

You can have q load a script on startup by placing its name after the call to the q executable on the operating system command line.

$q /q4m/trades.q
KDB+ 3.2 …
q)
q)count trades
1000000

10.3.2 Blocks

You can comment out a block of code (i.e., multiple lines) in a script by surrounding it with matching / and \ with each at the beginning of its own line. An unmatched \ at the beginning of a line exits the script.

Here is a script that demonstrates block comments.

a:42
b:0
/
this is a block of
comment text
b:42
and b will not be changed
\
a:43 / this line will be executed
\
nothing from here on will be executed
b:44

Immediately after this script is loaded, a will be 43 and b will be 0.

Multi-line expressions are permitted in a script but they have a special form.

The first line must not be indented – i.e., it begins at the left of the line with no initial whitespace.
Any continuation lines must be indented, meaning that there is at least one whitespace character at the beginning of the line.
In particular, if you put the closing brace to a function definition on its own line, it must be indented. Do not use the common C style of aligning the closing brace with the function name.
Empty lines and comment lines (beginning with /) are permitted anywhere.

Table definition and function definition provide nice opportunities for splitting across multiple lines:

A table can have line breaks after a closing square bracket ] or after a semicolon separator ;
A function can have line breaks after a closing square bracket ] or after a semicolon separator ;.
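The layout rules above can be sketched as the contents of a script file. The names t and sumsq are hypothetical, and the text is meant to be loaded with \l rather than pasted line by line at the console.

```q
/ contents of a script file; load it with \l
/ table: line breaks after ] and after the ; separator
t:([]
  c1:1 2 3;
  c2:`a`b`c)

/ function: continuation lines, including the closing brace, are indented
sumsq:{[x;y]
  a:x*x;
  b:y*y;
  a+b
  }
```

Each definition starts at the left margin, and every continuation line begins with whitespace.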

10.3.3 Passing Parameters

Parameters are passed to a q script at q startup similarly to argv command line parameters in C. Specifically, the system variable .z.x comprises a list of strings, each containing the character representation of an argument present on the command line that invoked the script. For example, let’s modify our trades.q script to pass the number of records to be created as a command line parameter. Note that we parse the passed string to an integer.

mktrades:{[tickers; sz]
  dt:2015.01.01+sz?31;
  tm:sz?24:00:00.000;
  sym:sz?tickers;
  qty:10*1+sz?1000;
  px:90.0+(sz?2001)%100;
  t:([] dt; tm; sym; qty; px);
  t:`dt`tm xasc t;
  t:update px:6*px from t where sym=`goog;
  t:update px:2*px from t where sym=`ibm;
  t}

size:"I"$.z.x 0

trades:mktrades[`aapl`goog`ibm; size]

Now we invoke the script with the parameter 2000000.

>q /q4m/trades.q 2000000
KDB+ 3.2 …
q)count trades
2000000
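If the script may also be started without an argument, a guard along the following lines is one option. The helper parsesize and the default of 1000 are assumptions for this sketch, and it parses to a long with "J" rather than to an int with "I" as in the script above.

```q
/ parsesize and the default of 1000 are assumptions for this sketch
parsesize:{[args] $[0<count args; "J"$first args; 1000]}
size:parsesize .z.x          / use the first command line argument if present
parsesize enlist "2000000"   / -> 2000000
parsesize ()                 / -> 1000
```

Passing the argument list to a function, rather than reading .z.x inline, also makes the parsing easy to try from the console.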

As of this writing (Sep 2015), parameters can be passed when a script is loaded at q startup but not when a script is loaded with \l or system "l".

If you put any extra space immediately after \l you will get an error.