White paper

# Iterators

Iterators (formerly known as *adverbs*) are the primary means of iteration in q, and in almost all cases the most efficient way to iterate. Loops are rare in q programs and are almost always candidates for optimization. Mastery of iterators is a core q skill.

The first part of this paper introduces iterators informally. This provides ready access to the two principal forms of iteration: *maps* and *accumulators*.

The second part of the paper reviews iterators more formally and with greater attention to syntax. We see how iterators apply not only to functions but also to lists, dictionaries and tables. From their syntax we see when parentheses are required, and why.

We discuss examples of their use.

## Basics

Iterators are higher-order unary operators: they take a single argument and return a derived function. The single argument is an *applicable value*: a list, dictionary, table, process handle, or function. The derived function iterates its normal application.

Iterators are the only operators that can be applied postfix. They almost always are.

For example, the iterator Scan, written `\`, applied to the Add operator `+` derives the function Add Scan, written `+\`, which extends Add to return cumulative sums.

```
q)+\[2 3 4]
2 5 9
```

Applied to the Multiply operator `*` it derives the function Multiply Scan, written `*\`, which returns cumulative products.

```
q)*\[2 3 4]
2 6 24
```

(Programmers in some other languages might recognize these uses of Scan as a *scan* or *prefix sum*: a *fold* that keeps every intermediate result.)

Another example. The iterator Each, written `'`, applied to the Join operator `,`, derives the function Join Each, written `,'`.

```
q)show a:2 3#"abcdef"
"abc"
"def"
q)show b:2 3#"uvwxyz"
"uvw"
"xyz"
q)a,b
"abc"
"def"
"uvw"
"xyz"
q)a,'b
"abcuvw"
"defxyz"
```

Above, `a` and `b` are both 2×3 character matrixes. That is to say, they are both 2-lists, and their items are character 3-lists. While `a,b` joins the two lists to make a 4-list, the derived function Join Each in `a,'b` joins their corresponding items to make two character 6-lists.

Scan and Each are the cores of the accumulator and map iterators. The other iterators are variants of them.

## Three kinds of iteration

### Atomic iteration

Many native q operators have iteration built into them. They are atomic. They apply to conforming arguments.

```
q)2+2 / two atoms
4
q)2 3 4+5 6 7 / two lists
7 9 11
q)2+5 6 7 / atom and list
7 8 9
```

Two arguments conform if they are lists of the same length, or if one or both is an atom. In atomic iteration this definition recurses *to any depth of nesting*.

```
q)(1;2;3 4)+( (10 10 10;20 20); 30; ((40 40; (50 50 50; 60)); 70) )
(11 11 11;21 21)
32
((43 43;(53 53 53;63));74)
```

Because atomic iteration burrows deep into nested structure it is not easy to parallelize. A simpler form of it is.

### Mapping

The map iterators apply a function across items of a list or lists. They do not burrow into a nested structure, but simply iterate across its top level.
That is just what **Each** does.

```
q)x:("the";"quick";"brown";"fox")
q)count[x] / count x
4
q)count'[x] / count each item of x
3 5 5 3
```

The Each iterator has four variants. A function derived by **Each Right** `/:` applies its entire left argument to each item of its right argument. Correspondingly, a function derived by **Each Left** `\:` applies its entire right argument to each item of its left argument.

```
q)"abc",/:"xyz" / Join Each Right
"abcx"
"abcy"
"abcz"
q)"abc",\:"xyz" / Join Each Left
"axyz"
"bxyz"
"cxyz"
```

**Each Left and Each Right** Remember which is which by the direction in which the slash leans.
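The same distinction can be seen with an arithmetic operator. The following is a small sketch, not taken from the original examples, with arbitrarily chosen operands: Add Each Right yields one result row per item of the right argument, and Add Each Left one row per item of the left argument.

```q
q)1 2 3+/:10 20        / Add Each Right: all of x with each item of y
11 12 13
21 22 23
q)1 2 3+\:10 20        / Add Each Left: each item of x with all of y
11 21
12 22
13 23
```

The count of the result follows the argument that is being iterated: two rows for Each Right, three for Each Left.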

**Each Prior** takes a binary value as its argument. The derived function is unary: it applies the binary between each item of a list (or dictionary) and its preceding item.
The differences between items in a numeric or temporal vector:

```
q)-':[1 1 2 3 5 8 13 21 34] / Subtract Each Prior
1 0 1 1 2 3 5 8 13
```

**Each Parallel** takes a unary argument and applies it, as Each does, to each item in the derived function’s argument. Unlike Each, it partitions its work between any available secondary processes. Suppose `analyze` is CPU-intensive and takes a single symbol atom as argument.

``q)analyze':[`ibm`msoft`googl`aapl]``

With a unary function, the mnemonic keyword `each` is generally preferred as a cover for the iterator Each. Similarly, `prior` is preferred for Each Prior and `peach` for Each Parallel.

```
q)count each ("the";"quick";"brown";"fox")
3 5 5 3
q)(-) prior 1 1 2 3 5 8 13 21 34
1 0 1 1 2 3 5 8 13
q)analyze peach `ibm`msoft`googl`aapl
..
```

With map iterators the number of evaluations is the number of top-level items in the derived function’s argument/s. These functions are right-uniform.

The map iterators:

glyph | name | mnemonic keyword |
---|---|---|
`'` | Each | `each` |
`\:` | Each Left | |
`/:` | Each Right | |
`':` | Each Prior | `prior` |
`':` | Each Parallel | `peach` |
`'` | Case | |
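The table lists Case, which the examples above do not illustrate. As a minimal sketch (the strings are arbitrary): when the value of `'` is an integer list, each integer selects the argument from which the corresponding result item is taken.

```q
q)0 1 0'["abc";"xyz"]      / items 0 and 2 from "abc", item 1 from "xyz"
"ayc"
```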

### Accumulation

In accumulator iterations the value is applied repeatedly, first to the entire (first) argument of the derived function, next to the result of that evaluation, and so on.

The number of evaluations is determined according to the value’s rank.

For a **unary** value, there are three forms:

- Converge: iterate until a result matches either the previous result or the original argument
- Do: iterate a specified number of times
- While: iterate until the result fails a test

```
q){x*x}\[0.1] / Converge
0.1 0.01 0.0001 1e-08 1e-16 1e-32 1e-64 1e-128 1e-256 0
q)5{x*x}\0.1 / Do
0.1 0.01 0.0001 1e-08 1e-16 1e-32
q)(1e-6<){x*x}\0.1 / While
0.1 0.01 0.0001 1e-08
```

For values of **higher rank** the number of evaluations is the count of the right argument/s. For example, the result `r` of applying a ternary derived function `f\` to arguments `x`, `y`, and `z`:

```
r[0]:f[x; y 0; z 0]
r[1]:f[r 0; y 1; z 1]
r[2]:f[r 1; y 2; z 2]
..
```

From this we see that the right arguments `y` and `z` must conform and that `count r` – the number of evaluations – is `count[y]|count[z]`.
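The evaluation scheme above can be traced with a concrete ternary. In this sketch, `f` is a hypothetical function chosen for illustration: it adds the product of its second and third arguments to the accumulator, so the final result is the dot product of `y` and `z`.

```q
q)f:{x+y*z}                 / hypothetical ternary: accumulator plus a product
q)f\[0;1 2 3;4 5 6]         / r[0]:f[0;1;4]  r[1]:f[4;2;5]  r[2]:f[14;3;6]
4 14 32
q)f/[0;1 2 3;4 5 6]         / Over keeps only the last result
32
```

Three right-argument items give three evaluations, matching `count[y]|count[z]`.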

There are two accumulator iterators.

- Functions derived by **Scan** `\` return as a list the results of each evaluation. They are thus right-uniform functions: their results conform to their right arguments. They resemble *scan* (a *fold* that keeps every intermediate result) in some other programming languages.
- Functions derived by **Over** `/` perform the same computation as those from Scan, but return only the last result. They resemble *fold* or *reduce* in some other programming languages.

```
q)+\[2 3 4] / Add Scan
2 5 9
q)+/[2 3 4] / Add Over
9
```

With Scan and Over and binary values, the mnemonic keywords `scan` and `over` are generally preferred.

```
q)(+) scan 2 3 4
2 5 9
q)(+) over 2 3 4
9
```

The accumulators:

glyph | name | mnemonic keyword |
---|---|---|
`\` | Scan | `scan` |
`/` | Over | `over` |

## Brackets and parentheses

The result of applying an iterator to a value is a derived function. Like any other function, a derived function can be applied with brackets.

```
q)+\[3 4 5]
3 7 12
q)+\[1000;3 4 5]
1003 1007 1012
```

Notice that the derived function here is *variadic*: it can be applied as a unary or as a binary.

**Postfix yields infix** An iterator applied postfix derives a function with infix syntax.
This is true regardless of the derived function’s rank.
For example, `count'` is a unary function but has infix syntax.

We can also apply `+\` as a binary using infix syntax.

```
q)1000+\3 4 5
1003 1007 1012
```

The syntax of q allows clear expression of a sequence of operations.
To apply a function `primus` to `x`, then `secundus` to the result, then `tertius` to the result of that, we could write:

`q)tertius[secundus[primus[x]]]`

but better style would be to apply the unaries prefix and write:

`q)tertius secundus primus x`

Good q style minimizes use of both brackets and parentheses. (Where they must be used, the less that appears between an opening bracket or parenthesis and its close, the better.) So prefix application is usually better.

An infix function can be applied prefix as a unary by parenthesizing it.

```
q)(+\)3 4 5 6 7
3 7 12 18 25
```

Some common derived functions are covered by keywords for readability. Good q style prefers them.

```
q)sums 3 4 5 6 7
3 7 12 18 25
```
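`sums` is not the only such cover. As a short sketch with arbitrary data, `maxs` covers Maximum Scan and `prd` covers Multiply Over; each pair of expressions below returns the same result.

```q
q)(|\)3 1 4 1 5            / Maximum Scan
3 3 4 4 5
q)maxs 3 1 4 1 5           / the keyword that covers it
3 3 4 4 5
q)(*/)2 3 4                / Multiply Over
24
q)prd 2 3 4                / the keyword that covers it
24
```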

The iterator Each is covered by the keyword `each` for unary values.
Good q style prefers it.

```
q)(count') ("the";"quick";"brown";"fox")
3 5 5 3
q)count each("the";"quick";"brown";"fox") / better q style
3 5 5 3
```

Parenthesize an infix function to use it as the left argument of another function.

```
q)(+) scan 3 4 5 6 7
3 7 12 18 25
```

## Map iterators

### Each, Each Parallel

The Each iterator applies a unary to each item of an argument.

```
q)x:("the";"quick";"brown";"fox")
q)reverse x
"fox"
"brown"
"quick"
"the"
q)reverse each x
"eht"
"kciuq"
"nworb"
"xof"
```

*Each Both*

With a binary value, the iterator is sometimes known as *each both*.
You can think of it as a zip fastener, applying the value between pairs of items from its arguments.

```
q)2 0 1 -2 rotate' x
"eth"
"quick"
"rownb"
"oxf"
q)show m:4 5#"the  quickbrownfox  "
"the  "
"quick"
"brown"
"fox  "
q)1 2 1 1 1 0 0 m' 3 1 2 3 4 2 0 / scattered indexing
"cricket"
```

The Each Parallel iterator takes unary values. It derives functions that perform exactly the same computation as functions derived by Each, but delegates computation to secondary tasks, if any are available.

Good q style prefers use of the `peach` keyword.

```
q)sum peach 3?'5000#10
13 12 13 22 3 14 17 14 7 12 13 17 19 15 8 16 17 18 19 10 16 10 9 13 15 8 25 8..
```

The Each iterator has three variants that take binary values as arguments: Each Left, Each Right and Each Prior.

### Each Left, Each Right

With a function derived from Each and a binary, if one of the two arguments is an atom, it is paired with each item of the other argument.

```
q)3,'til 4
3 0
3 1
3 2
3 3
q)"o" in' ("the";"quick";"brown";"fox")
0011b
q)2 m' 0 2 1 2 4
"boron"
```

*Each Right*

To extend this behavior to non-atom arguments, use Each Left or Each Right.

```
q)x:("the";"quick";"brown";"fox")
q)y:("brown";"windsor";"soup")
q)z:("red";"riding";"hood")
q)"brown" in/: (x;y;z) / Each Right
110b
```

For example, find the file handle of each column of a table.

```
q){x,/:key[x]except `.d} `:/mydb/2013.05.01/trade
`:/mydb/2013.05.01/trade`sym
`:/mydb/2013.05.01/trade`time
`:/mydb/2013.05.01/trade`price
`:/mydb/2013.05.01/trade`size
`:/mydb/2013.05.01/trade`ex
```

The file handle of the table is joined to each element in the list of columns, creating five 2-lists. The Each Right iterator can then apply the `sv` keyword to create the file handles of each column.

```
q)` sv/: {x,/:key[x]except `.d} `:/mydb/2013.05.01/trade
`:/mydb/2013.05.01/trade/sym`:/mydb/2013.05.01/trade/time`:/mydb/2013.05.01/t..
```

### Each Prior

The Each Prior iterator applies its binary to each item of a list `x` and to the previous item; i.e. to each adjacent pair in the list. The result is a list of the same length as `x`: the derived function is a uniform function.

```
q)(-':) 4 8 3 2 2
4 4 -5 -1 0
```

Good q style prefers use of the `prior` keyword.

```
q)(-) prior 4 8 3 2 2
4 4 -5 -1 0
```

One use of `-':` is so common it is built in as the `deltas` keyword.

```
q)deltas 4 8 3 2 2
4 4 -5 -1 0
```

As a uniform function `-':` returns a list as long as its argument.
The first item of the result corresponds to the first item of the argument.
But, by definition, the first item of the argument has no previous item.
So, in the expression \(4-y\), what is \(y\)? Above, it is zero. Zero is the identity element for Subtract: when \(y\) is zero, \(x-y\) is always \(x\). (See Each Prior in the Reference for more on this and what happens with values which do not have a known identity element.)

We can use `{x,y}` to display the pairs that Each Prior finds.

```
q){x,y}prior til 5
0
1 0
2 1
3 2
4 3
```

Here we see that the first item, `0`, is paired with `0N`. The Join operator has no identity element, so it uses the argument `til 5` as a prototype.

```
q)(0,1#0#x) ~ first {x,y}':[x]
1b
```

A table in the Reference tells us that with a binary value Each Prior derives a variadic function. So Subtract Each Prior can also be applied as a binary.

```
q)0 -': 4 8 3 2 2
4 4 -5 -1 0
```

The zero left argument is the ‘seed’ – the \(y\) subtracted from the first item, 4. We can use another ‘seed’ value.

```
q)1 -': 4 8 3 2 2
3 4 -5 -1 0
```

**Watch out** Using a float as the seed shifts the type of the first item of the result.
But *only* the first item: the result is no longer a vector, but a mixed list.

```
q)0.5 -': 4 8 3 2 2
3.5
4
-5
-1
0
q)type each 0.5 -': 4 8 3 2 2
-9 -7 -7 -7 -7h
```

Each Prior can be useful in tracking down errors within lists which should be identical, e.g. the `.d` files for a table in a partitioned database.
Let’s use the `differ` keyword to check for inconsistencies in `.d` files.
It uses the Each Prior iterator and is equivalent to `{not(~':)x}`.

```
q)1_ date where differ {get hsym `$"/mydb/",x,"/trade/.d"} each string date
2013.05.03 2013.05.04
```

The values of the `.d` files are extracted from each partition.
The `differ` keyword then compares each item with the one before it.
If a `.d` file differs from the previous `.d` file in the list, then that date will be returned.
The first item of the result is dropped, because the first item of list `x` will be compared to `x[-1]`, which is always null and so will never match.
In the example above, the `.d` files for the `2013.05.03` and `2013.05.04` partitions differ, and should be investigated further.
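The behavior of `differ` is easier to see on a small vector. In this sketch the integer list is an arbitrary stand-in for the list of `.d` contents: the first item is always flagged, which is why the first item of the result is dropped.

```q
q)differ 1 1 2 2 3           / first item is always flagged
10101b
q)where differ 1 1 2 2 3     / indexes at which the value changes
0 2 4
q)1_ where differ 1 1 2 2 3  / drop the first, which has no predecessor
2 4
```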

### Higher ranks

Each Parallel, `peach`, and `each` apply unary values.
Each Left, Each Right, Each Prior, and `prior` apply binary values.

The Each iterator applies values of any rank.

```
q)1 2 3 in' (1 2 3;3 4 5;5 6 7)
100b
q)ssr'[("mad";"bud";"muy");"aby";"umd"]
"mud"
"mud"
"mud"
```

As with atomic iteration, the arguments must conform: if lists, their lengths match; if atoms, they are replicated to the length of the list/s.

```
q)ssr'[("mad";"bud";"muy");"d";"pnx"]
"map"
"bun"
"muy"
```
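The same holds for a user-defined ternary. In the sketch below the lambda and its parameter names `s`, `n`, and `c` are hypothetical: it pads a string `s` to length `n` with the character `c`. The atoms `5` and `"."` are paired with each string in turn.

```q
q){[s;n;c] n#s,n#c}'[("ab";"cde");5;"."]   / pad each string to 5 characters
"ab..."
"cde.."
```

Because the strings differ in length, the function has to be applied item by item; Each does that without an explicit loop.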

## Accumulating iterators

There are two accumulating iterators (or *accumulators*) and they are really the same. The Scan iterator is the core; the Over iterator is a slight variation of it.

Here is Scan at work with `ssr`.

```
q)ssr\["hello word."; ("h";".";"rd"); ("H";"!";"rld")]
"Hello word."
"Hello word!"
"Hello world!"
```

Where we want only the final result, we use the Over iterator.

```
q)ssr/["hello word."; ("h";".";"rd"); ("H";"!";"rld")]
"Hello world!"
```

It is generally true that for a value `v`, `v/[x]` is the same as `last v\[x]`. Otherwise, what is true for Scan is true for Over. (Using Over rather than Scan allows kdb+ to use a little less memory by dropping interim results.)

**Debugging** If puzzled by the result of using Over, replace it with Scan and examine the intermediate results. They are usually illuminating.
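A small sketch of that relationship, using a hypothetical binary `v` that appends a decimal digit to an accumulated number: Over returns only the final value, while Scan exposes each step that led to it.

```q
q)v:{y+10*x}                 / hypothetical binary: append a decimal digit
q)v/[1 2 3]                  / Over: final result only
123
q)v\[1 2 3]                  / Scan: every intermediate result
1 12 123
q)(v/[1 2 3])~last v\[1 2 3]
1b
```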

Consider the application of the functions Scan derives from values of ranks 1, 2, 3…

```
v1\[x]       (v1\)x     Converge
v1\[i;x]     i v1\ x    Do
v1\[t;x]     t v1\ x    While
v2\[x;y]     x v2\ y
v3\[x;y;z]
…
```

And so on, up to `v8\`. In each form, `x` is the (first) argument of the first evaluation. The result of the first evaluation becomes the (first) argument for the next evaluation, if any. And so on.

For how many evaluations? It depends first on the rank of the value.

value rank | number of evaluations |
---|---|
2-8 | length of the right argument/s |
1 | depends on the results, and the left argument (if any) of the derived function |

If the value `v` is rank 2 or higher, the number of evaluations is determined by the length of the right argument/s. In the `ssr` example above, the right arguments have length 3 and `ssr` is evaluated three times.

There are three ways to determine the number of evaluations of a unary value `v1` performed by `v1\`.

A table in the Reference tells us `v1\` is variadic and can be applied as a unary (Converge) or as a binary (Do or While).

### Converge

Applying `v1\` as a unary is known as Converge because `v1` is evaluated until the result matches either the previous result or the original argument.

```
q)({x*x}\)0.1 / converge on result
0.1 0.01 0.0001 1e-08 1e-16 1e-32 1e-64 1e-128 1e-256 0
q)(not\)0b / return to first argument
01b
```
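Converge is a natural fit for fixed-point methods. As a sketch, the hypothetical function `newton` below performs one Newton step towards the square root of 2; with Over, iteration stops once successive results match, and only the final value is returned. The output shown assumes the default display precision.

```q
q)newton:{0.5*x+2%x}         / hypothetical step: average x and 2%x
q)(newton/)1f                / iterate until successive results match
1.414214
q)sqrt 2
1.414214
```

Replacing `/` with `\` shows the successive approximations.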

**Watch out** Not all sequences converge in this way. You can use the accumulators to throw kdb+ into an infinite loop.

**Set a timeout** When developing with Converge, it is wise to set the timeout in your session via the `\T` command. This will cause any functions to terminate after a set number of seconds. An infinite loop will not lock your session.

### Recursion with `.z.s`

The ‘self’ function `.z.s` can also be used for recursion, and is more flexible than Scan.

```
q)list:(`a`n;(1 2;"efd");3;("a";("fes";3.4)))
q){$[0h=type x;.z.s'[x];10h=abs type x;upper x;x]}list
`a`n
(1 2;"EFD")
3
("A";("FES";3.4))
```

The above function operates on a list of any structure and data types, changing strings and characters to upper case and leaving all other elements unaltered.

Note that when using `.z.s` the function will signal a stack error after 2000 loops.

`{.z.s[0N!x+1]}0`

No such restriction exists on Scan and Over.

Use `.z.s` only where it is not possible to use iterators.
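Many textbook recursions have a direct iterator form. The factorial below is an illustrative sketch, not an example from the original paper: the first expression recurses with `.z.s`, the others obtain the same value from Multiply Over and Multiply Scan through the keywords `prd` and `prds`.

```q
q){$[x<2;1;x*.z.s x-1]}5     / factorial by explicit recursion
120
q)prd 1+til 5                / the same result from Multiply Over
120
q)prds 1+til 5               / the Scan form shows the intermediate products
1 2 6 24 120
```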

### Do and While

Applying `v1\` as a binary allows us to set an exit condition as the left argument:

- an integer: exit after this number of evaluations (Do)
- a truth function: apply this to each result and exit when it evaluates to zero (While)

Consider the function defined and illustrated below.

```
q)30 {3.2*x*1-x}\ 0.4
0.4 0.768 0.5701632 0.7842468 0.541452 0.7945015 0.5224603 0.7983857 0.515091..
```

It is evident the function results in a loop with period 2 (at least within floating point tolerance). Without the exit condition (above, 30 evaluations) it will not terminate.

Fibonacci numbers are calculated by joining an integer vector to the sum of its last two items.

```
q)fib:{x,sum -2#x}
q)5 fib\1 1 / exit after 5 evaluations
1 1
1 1 2
1 1 2 3
1 1 2 3 5
1 1 2 3 5 8
1 1 2 3 5 8 13
q){last[x]<200} fib/1 1 / exit when result exceeds 200
1 1 2 3 5 8 13 21 34 55 89 144 233
```
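While suits iterations whose length is not known in advance. In this sketch the hypothetical function `collatz` performs one step of the Collatz sequence, and the projection `1<` is the truth function: iteration continues for as long as the latest result exceeds 1.

```q
q)collatz:{$[0=x mod 2;x div 2;1+3*x]}   / hypothetical step function
q)(1<)collatz\6                          / While: continue while result exceeds 1
6 3 10 5 16 8 4 2 1
```

As with Do, the first item of the result is the original argument, so nine items correspond to eight evaluations.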

## Lists and dictionaries

The arguments of iterators are applicable values: functions, file- and process handles, lists and dictionaries. Functions are the most familiar as iterator arguments, but lists and dictionaries reward study.

```
q)yrp                                / a European tour
from   to     wp
----------------
London Paris  0
Paris  Genoa  1
Genoa  Milan  1
Milan  Vienna 1
Vienna Berlin 1
Berlin London 0
q)show route:yrp[`from]!yrp[`to]
London| Paris
Paris | Genoa
Genoa | Milan
Milan | Vienna
Vienna| Berlin
Berlin| London
```

The dictionary `route` is a finite-state machine: its values are also valid keys.

```
q)(route\)`Genoa / a circular tour
`Genoa`Milan`Vienna`Berlin`London`Paris
q)3 route\`London / 3 legs of the tour
`London`Paris`Genoa`Milan
q)(`Berlin<>)route\`Paris / Paris to Berlin
`Paris`Genoa`Milan`Vienna`Berlin
q)waypoints:(!/)yrp`from`wp
q)waypoints route\`Paris / Paris to the end
`Paris`Genoa`Milan`Vienna`Berlin
```

In the last expression, both the value and the truth value are dictionaries. No functions are involved.
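A list can play the same role when its items are valid indexes into itself. The sketch below uses a hypothetical parent vector for a small tree, in which the root (node 0) is its own parent, so that Converge stops there.

```q
q)parent:0 0 1 1 2          / hypothetical tree: parent[i] is the parent of node i
q)(parent\)4                / Converge: climb from node 4 to the root
4 2 1 0
q)2 parent\4                / Do: climb two levels
4 2 1
```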

## Combining iterators

We can calculate Pascal’s Triangle using Scan and Each Prior.

We already have a sufficient grasp of the accumulators to see the Triangle immediately as successive results from some use of Scan. We need only a function to define one row of the Triangle in terms of the row above it. That could hardly be simpler. Each row is derived from its parent by summing adjacent pairs of items.

```
q)(+) prior 1 3 3 1 / nearly...
1 4 6 4
q)(+) prior 1 3 3 1,0 / ...there!
1 4 6 4 1
q)7 {(+)prior x,0}\ 1
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
```

Notice that the last expression gave us *eight* rows of the Triangle, not seven.
The first item of the result was the original argument, followed by the results of seven successive evaluations.

If the left argument of the derived function is zero, there will be no evaluations. The original argument will still be returned as the first (and only) item of the result. It doesn’t even have to be in the domain of the value.

```
q)0 {(+)prior x,0}\ 1
,1
q)0 {(+)prior x,0}\ "foo"
"foo"
```

You can use Do in this way as a conditional.

The argument of an iterator is a function, list, or dictionary. Its result is a derived function – so that too can be the argument to an iterator.

Generate all possible pairs from two lists:

```
q)raze 1 2 3,/:\:4 5 6
1 4
1 5
1 6
2 4
2 5
2 6
3 4
3 5
3 6
```

Here the derived function `,/:` (Join Each Right) is the argument to the iterator `\:` (Each Left).
The resulting binary derived function `,/:\:` (Join Each Right Each Left) has infix syntax – remember *Postfix yields infix*. In the example above it returns a 3×3 matrix of pairs, here razed into a list of pairs because the default display does not reveal the matrix structure.

Switch the iterators to change the order of the result.

```
q)raze 1 2 3,\:/:4 5 6
1 4
2 4
3 4
1 5
2 5
3 5
1 6
2 6
3 6
```
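With an atomic operator in place of Join the matrix structure is visible directly. This sketch, with arbitrary operands, builds a multiplication table; switching the two iterators transposes it.

```q
q)1 2 3*/:\:1 2 3 4         / Multiply Each Right Each Left: 3 rows of 4
1 2 3 4
2 4 6 8
3 6 9 12
q)1 2 3*\:/:1 2 3 4         / iterators switched: 4 rows of 3
1 2 3
2 4 6
3 6 9
4 8 12
```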

The `raze` keyword is no more than `,/` (Join Over). It is often combined with Converge to flatten deeply nested structures. Good q style prefers the keywords, but `,//` illustrates again how a derived function can be the argument to another iterator.

```
q)raze over ("the ";("quick ";("brown ";"fox.")))
"the quick brown fox."
q)(,//)("the ";("quick ";("brown ";"fox.")))
"the quick brown fox."
```

You can use the Each iterator to apply a function at a specific depth in a nested list.

```
q)lst:(3 2 8;(3.2;6h);("AS";4))
q)type lst
0h
q)type'[lst]
7 0 0h
q)type''[lst]
-7 -7 -7h
-9 -5h
10 -7h
q)type'''[lst]
-7 -7 -7h
-9 -5h
(-10 -10h;-7h)
```

Good q style avoids brackets and parentheses, and prefers keywords where available. These principles conflict when composing multiple iterators.

```
q)((type each)each)each lst / NOT recommended
-7 -7 -7h
-9 -5h
(-10 -10h;-7h)
```

When composing iterators, prefer concise forms. They are easier to analyze.

## Iterators versus loops

The control words `do` and `while` allow q programmers to write explicit loops. Sometimes this is the only way to write an iteration. However, most common forms of iteration can be defined using the iterators, yielding code that is shorter, faster and less prone to error.

Often the implementation is relatively easy, using Each, Each Left and Each Right to cycle through a list and amend its items.

Suppose we wanted to check whether either of the integers 2 or 3 is present in each of several lists. This can be achieved with a `while` loop.

```
q)m:(1 2 3;3 4 5;4 5 6) / three lists
q){i:0;a:();while[i<count x;a,:enlist any 2 3 in x[i];i+:1];a} m
110b
q)\t:100000 {i:0;a:();while[i<count x;a,:enlist any 2 3 in x[i];i+:1];a} m
475
```

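However, iterators allow neater, more efficient code that is easier to read and cheaper to maintain. Note that the timing below runs one-tenth as many iterations as the timing above, so compare the cost per evaluation (roughly 3 µs against 4.75 µs) rather than the raw totals.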

```
q)any each 2 3 in/: m
110b
q)\t:10000 any each 2 3 in/: m
30
```

Similarly we can use the Over iterator to deal easily with situations which would be handled by loops in C-like languages.

Suppose you wanted to join several tables.

```
//Create a list of tables, of random length
q)tt:{1!flip(`sym;`$"pr",x;`$"vol",x)!(`a`b`c;3?50.0;3?100)}each string til 2+rand 10
//Join the tables using a while loop
q){a:([]sym:`a`b`c);i:0;while[i<count[x];a:a lj x[i];i+:1];a}tt
sym pr0 vol0 pr1 vol1 pr2 vol2 pr3 vol3 pr4 vol4 pr..
-----------------------------------------------------------------------------..
a 25.41992 86 3.315151 58 22.37118 81 7.696889 56 14.84522 80 27..
b 2.124254 50 19.3025 82 34.57479 69 47.95351 85 0.5641467 45 33..
c 37.36208 19 41.11905 31 33.52813 90 30.03506 78 1.949377 8 28..
q)\t:10000 {a:([]sym:`a`b`c);i:0;while[i<count[x];a:a lj x[i];i+:1];a}tt
271
//Join the tables using Over
q)0!(lj/)tt
sym pr0 vol0 pr1 vol1 pr2 vol2 pr3 vol3 pr4 vol4 pr..
-----------------------------------------------------------------------------..
a 25.41992 86 3.315151 58 22.37118 81 7.696889 56 14.84522 80 27..
b 2.124254 50 19.3025 82 34.57479 69 47.95351 85 0.5641467 45 33..
c 37.36208 19 41.11905 31 33.52813 90 30.03506 78 1.949377 8 28..
q)\t:10000 0!(lj/)tt
257
```

Write loops only when you can find no solution using iterators.
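The `do` control word has an equally direct replacement in the Do form of Over. The sketch below uses made-up figures (a balance of 100 growing at 5% for three periods) and assumes the default display precision; the loop needs a mutable variable, the iterator does not.

```q
q)r:100f; do[3;r*:1.05]; r   / explicit loop with a mutable variable
115.7625
q)3{x*1.05}/100f             / Do form of Over: same computation, no variable
115.7625
q)3{x*1.05}\100f             / Scan keeps the balance after each period
100 105 110.25 115.7625
```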

## Nested columns

Best practice avoids nested columns wherever possible.
However in some situations operating on nested data is necessary or may lower execution time for certain queries.
The most common occasion for this is that the keyword `ungroup`, which flattens a table containing nested columns, is computationally expensive, especially when you are dealing only with a subset of the entire table.

There are also situations in which storing the data in a nested structure makes more sense. For example, you may want to use strings instead of symbols, to avoid a bloated sym file.

So we will now take a look at using iterators to apply functions to a table as a whole, and to apply functions within a select statement.

Iterators can be used to examine and modify tables. To do this we need to understand how tables are structured. In kdb+, a table is a list of dictionaries, which may have non-integer values. This means we can apply functions to individual values, just like any other nested list or dictionary structure. For example:

```
q)a:([]a:`a`b`c`d;b:1 2 3 4;c:(1 2;2 3;3 4;4 5))
q)type[a]
98h
q)type'[a]
99 99 99 99h
q)type''[a]
a   b  c
--------
-11 -7 7
-11 -7 7
-11 -7 7
-11 -7 7
```

We see here that `type[a]` returns `98h`, so `a` is a table.
Then `type'[a]` returns the type of each item of `a` – they are dictionaries, with type `99h`.
Next, `type''[a]` finds the type of each value of each dictionary. The result is a list of dictionaries, which collapses back to a table showing the type of each field in the table.

```
q)distinct type''[a]
a   b  c
--------
-11 -7 7
```

This statement can be used to ensure all rows of the table are the same type. This is useful if your table contains nested columns, as the `meta` keyword looks only at the first row of nested columns. And if the table is keyed then `meta` will be applied only to the non-key columns.

```
q)a:([]a:`a`b`c`d;b:1 2 3 4;c:(1 2;2 3;3 4.;4 5))
q)meta a
c| t f a
-| -----
a| s
b| j
c| J
q)distinct type''[a]
a   b  c
--------
-11 -7 7
-11 -7 9
```

Looking only at the results of `meta`, we might suppose column `c` contains only integer lists.
However `distinct type''[a]` clearly shows column `c` contains lists of different types, and thus is not mappable. This is a common cause of error when writing to a splayed table.
Dealing with nested data in a table via a select/update statement often requires the use of iterators.
To illustrate this, let us define a table with three columns, two of which are nested.

```
q)show tab:([]sym:`AA`BB`CC;time:3#enlist 09:30+til 30;price:{30?100.0}each til 3)
sym time ..
-----------------------------------------------------------------------------..
AA 09:30 09:31 09:32 09:33 09:34 09:35 09:36 09:37 09:38 09:39 09:40 09:41 0..
BB 09:30 09:31 09:32 09:33 09:34 09:35 09:36 09:37 09:38 09:39 09:40 09:41 0..
CC 09:30 09:31 09:32 09:33 09:34 09:35 09:36 09:37 09:38 09:39 09:40 09:41 0..
```

Suppose we wanted the range of each row.

`q)rng:{max[x]-min[x]}`

We can use `rng` with Each within a `select` statement to apply the function to each row of the table.

```
q)select sym, rng'[price] from tab
sym price
------------
AA  96.3872
BB  95.79704
CC  98.31252
```

Suppose instead we wanted the range of a subset of the data in the table.
One way would be to `ungroup` the table and find the range.

```
q)select rng price by sym from ungroup tab where time within 09:40 09:49
sym| price
---| --------
AA | 77.67457
BB | 80.14611
CC | 67.48254
```

However, it is faster to index into the nested list as this avoids the costly `ungroup` function.
First find the index of the prices within our time range.

`q)inx:where (exec first time from tab) within 09:40 09:49`

Then use this to index into each price list and apply `rng` to the resulting prices.

```
q)select sym, rng'[price@\:inx] from tab
sym inx
------------
AA  77.67457
BB  80.14611
CC  67.48254
```

This offers a significant improvement in latency over using `ungroup`.

```
q)\t:10000 select rng price by sym from ungroup tab where time within 09:40 09:49
175
q)\t:10000 inx:where (exec first time from tab) within 09:40 09:49;select sym, rng'[price@\:inx] from tab
83
```

If the nested lists are not uniform, change the code:

```
q)inx:where each (exec time from tab) within 09:40 09:49
q)select sym, rng'[price@'inx] from tab
sym inx
------------
AA  77.67457
BB  80.14611
CC  67.48254
```

## Conclusion

This white paper summarizes the q iterators, showing how they derive new functions from values – functions, file- and process handles, lists and dictionaries.

It showed with examples how the effect of the iterator is determined sometimes by the rank of the applicable value, and sometimes also by the rank at which the derived function is applied. Even complicated examples, composing multiple iterators, can be analyzed to understand the behavior.

Certain uses of iterators, such as the creation of recursive functions and applying iterators to functions within select statements, were examined in more detail, as these are often poorly understood, but useful in many situations. Some common uses were examined to demonstrate the ability of iterators to reduce execution times.

This white paper illustrates how iterators can be used easily to extend the application of inbuilt and user-defined functions, allowing code to take full advantage of kdb+’s ability to process large volumes of data quickly.

Using iterators correctly minimizes the amount of data manipulation and produces code that is concise and easier to maintain.

All tests were run using kdb+ 3.6 (2018.10.23).

## Authors

**Conor Slattery** is a financial engineer who has designed kdb+ applications for a range of asset classes. Conor is currently working with a New York-based investment firm, developing kdb+ trading platforms for the US equity markets.

An earlier version of this paper by Slattery was published in 2013 as “Efficient use of adverbs”.

**Stephen Taylor** FRSA has followed the evolution of the Iversonian languages through APL, J, k, and q, and is a former editor of *Vector*, the journal of the British APL Association.
