QforMortals/functions
Contents |
Functions
Overview
In this chapter, we cover functions in depth. Before starting, you may wish to review the mathematical functions refresher if it has been a while since your last encounter with mathematical functions.
Function Specification
The notion of a function in q corresponds to a (mathematical) map that is specified by an algorithm. A function is a sequence of expressions to be evaluated, having optional input parameters and a return value. Application of a function is the process of evaluating the expressions in sequence, substituting actual arguments for any formal parameters. If a return value is specified, the function evaluates to its return value.
Because a q function can access global variables, the corresponding mathematical mapping actually includes the workspace as an implicit parameter.
Function Definition
The distinguishing characteristic of function definition is a matching pair of braces { and } enclosing a sequence of expressions separated by semi-colons. In contrast to verbose languages, a function's input parameters and the return value are not typed. In fact, they don’t even need to be declared explicitly. Even the function name is optional.
Here is a full specification of a squaring function that returns the square of its input,
f{[x] x*x}
You call f by enclosing its actual parameter in square brackets,
f[3] 9
Here is the most compact form of an equivalent function evaluation in which optional aspects are omitted,
{x*x}[5] 25
Function Notation and Terminology
The notation for function definition is,
- {[p_{1};...;p_{n}] e_{1};...;e_{n}}
where the optional p_{1};...;p_{n} are formal parameters and e_{1};...;e_{n} is a sequence of expressions to be evaluated.
The number of formal input parameters, either explicitly defined or used implicitly, is the function's valence. Most common are monadic (valence 1) and dyadic (valence 2). You can specify a function with no parameters (niladic) with an empty argument list
- {[]...}
The maximum valence currently permitted is 8, so specifying more than eight arguments will cause an error. You can circumvent this restriction by encapsulating multiple parameters in a list and using the list as an argument.
Variables that are defined within the expression(s) of a function are called local variables.
The return value of a function is determined by the following rules. If an empty assignment appears—i.e., a ':' with no variable name to the left —then its assignment value is the return value. Otherwise, if any local variables are assigned, the assigned value of the last one is the return value. Otherwise, the return value is the result of the last expression evaluation. For example, the following function specifications result in the same input-output mapping,
f1{[x] :x*x} / explicit return f2{[x] r:x*x} / local variable is returned f3{[x] x*x} / last expression is result
And so does this one, even though it includes useless and unexecuted evaluations,
f4:{[x] a:1;:x*x;3}
Implicit Parameters
If you omit the formal parameters and their brackets, three implicit positional parameters x, y and z are automatically available to the function's expressions. Thus, the following two specifications are equivalent,
f:{[x] x*x} g:{x*x}
And so are,
f:{[x;y] x+y} g:{x+y}
When using implicit parameters, x is always the first actual argument, y second and z third. The following function g generates an error unless it is called with three parameters,
g:{x+z} / likely meant x+y; requires 3 parms in call g[1;2] / error...needs three parameters {z+z}[1;2] g[1;2;3] / OK...value 2 is required but ignored 4
Anonymous Functions
A function can be defined without being assigned to a variable. Such a function is called anonymous since it cannot be evaluated by name.
{[x;y] x+y}[4;5] 9
An anonymous function can be appropriate when it will be evaluated in only one location. A prevalent use is in-line helper functions inside other functions,
f{[...] ...; {...}[...]; ...}
It is arguably more readable to extract anonymous functions,
g:{...} f:{...; g[...]; ....}
This is a matter of coding style.
Local and Global Variables
Local Variables
A variable that is defined by assignment in an expression in a function is called a local variable. For example, a is a local variable in the following function,
f:{a:42; a+x}
A local variable exists only from the time it is first assigned until the completion of the enclosing function's evaluation; it has no value until it is actually assigned. Provided there is no variable a already assigned in the workspace, evaluation of the function does not create such a variable. Using f as above,
f[6] 48 a `a
Global Variables
Variables that have been assigned outside any function definition are called global variables in this context. Global variables can be accessed inside a function.
b:6 f:{x*b} f[7] 42
To assign a global variable inside a function, use a double colon (::), which tells the interpreter not to create a local variable with the same name.
b:6 f:{b::7: x*b} f[6] 42 b 7
Local and Global Collision
When a local variable is defined with the same name as a global variable, the global variable is obscured.
a:42 f:{a:98; x+a} f[6] 104 a 42
When local and global names collide, the global variable is always obscured. Even double colon assignment affects the local variable. For example,
a:42 f:{a:6;a::98; x*a} f[6] 588 a 42
Amend (:)
Amend in C Language
We have already seen the basic form of assignment using amend
a:42
Programmers from languages with C heritage will be familiar with expressions such as
x += 2; // C expression representing amend
which is a shorthand form of
x = x + 2; // C expression
This is usually read simply "add 2 to x" but more precisely is, "assign to x the result of adding 2 to the current value of x." This motivates the interpretation of such an operation as "amend", in which x is re-assigned the value obtained by applying the operation + to the operands x and 2. This implies that to amend a variable it must have been previously assigned.
Simple Amend
In q, the equivalent to the above C expression uses +: as the operator.
x:42 x+:2 x 44
There is nothing special about + in the above discussion. Amend is available with any binary verb, as long as the operand types are compatible.
a:42 a-:1 a 41
We shall see interesting examples of amend with other operators in later chapters.
Amend with Lists
This capability to amend in one step extends to lists and indexing,
L1:100 200 300 400 L1[1]+:9 L1 100 209 300 400 L1[0 2]+:99 L1 199 209 399 400 L1[0 1 2]+:1 2 3 L1 200 202 402 400 L2:(1 2 3; 10 20 30) L2[;2]+:9 L2 (1 2 3;19 29 39) L2[0;1]+:100 L2 (1 102 3; 19 29 39)
Amend enforces strict type matching with simple lists, since the result must be placed back into the list
L1[0]+:42h 'type
Projection
Function Projection
Sometimes a function of valence two or more is evaluated repeatedly while some of its arguments are held constant. For this situation, a multivalent function can have one or more arguments fixed and the result is a function of lower valence called the projection of the original function onto the fixed arguments. Notationally, a projection appears as a function call with the fixed arguments in place and nothing in the other positions.
For example, the dyadic function which returns the difference of its arguments,
diff:{[x;y] x-y}
can be projected onto the first argument by setting it to 42, written as,
diff[42;]
The projected function is the monadic function "subtract from 42",
diff[42;][6] 36
This projection is equivalent to,
g{[x] 42-x} g[6] 36
We can also project diff onto its second argument to get "subtract 42",
diff[;42] diff[;42][6] -36
which is equivalent to
h{[x] x-42}
When a function is projected onto its first argument, the trailing semi-colons can be omitted. Given diff as above,
diff[42][6] 36
The brackets in functional evaluation are required in a projection, but can be eliminated in the projection's evaluation (as for any regular function),
diff[;42] 6 -36 diff[42] 6 36
Which notation to use is a matter of coding style.
Verb Projection
A binary verb can also be projected onto its left argument, although the notation may take some getting used to. For example, the projection of – onto its left argument is,
(42-)6 36
A verb cannot be projected onto its right argument, since this would lead to notational ambiguity. For example, (-42) is the atom -42 and not a projection,
(-42) -42
If you really want to project onto the right argument of an operator, you can do so by using the dyadic function form and juxtaposition of the argument,
-[;42] 98 56
In fact, the whitespace is not necessary in this example,
-[;42]98 56
We warned you about the notation.
Multiple Projections
When the original function has valence greater than two, it is possible to project onto multiple arguments simultaneously. For example, given,
f:{[x;y;z] x+y+z}
we can project f into its first and third arguments and end up with a monadic function,
f[1;;3][5] 9
We arrive at the same result by taking the projection f[1;;]—now a dyadic function—and projecting onto its second argument to arrive at f[1;;][;3],
f[1;;][;3][5] 9
This is equivalent to projecting in the reverse order,
f[;;3][1;][5] 9
If g is defined as a projection of f and the definition of f is changed, g remains the projection of the original f.
f:{[x;y] x-y} g:f[42;] g {[x;y] x-y}[42;] g[6] 36 f:{x;y] x+y} g[6] 36
This can be seen by displaying g on the console,
g {[x;y] x-y}[42;]
Lists and Functions as Maps
This section explores the deeper relationship between lists and functions; it can be skipped by the mathematically faint of heart.
Similarity of Notation
You have no doubt noticed that the notation for list indexing is identical to that for function evaluation. That is,
L:(0 1 4 9 16 25 36) f:{[x] x*x} L[2] 4 f[2] 4 L 5 25 f 5 25 L 3 6 9 16 f 3 6 9 16
This is not an accident. Recall that in Lists as Maps we saw that a list is a map defined by means of the implicit input-output correspondence given by item indexing. A function is a map defined by a sequence of expressions representing the algorithm used to obtain an output value from the input parameters. For consistency, the two different mechanisms for implementing a map should have the same notation. It may take a little time to get accustomed to the rationality of q.
Item-wise Extension of Atomic Functions
With the interpretation of lists and functions as maps, we can motivate the behavior of list indexing and function application when a simple index or atomic parameter is replaced by a simple list of the same. Specifically, we are referring to
L[2 5] 4 25 f[2 5] 4 25
in the previous examples. The expression enclosed in brackets is a simple list, call it I. Viewing the list I as a map, the two expressions are the composition of L and I, and the composition of f and I,
L[2 5] is (L[2]; L[5])[[BR]] f[2 5] is (f[2]; f[5])
For a general list L, function f and item index list I, the compositions are,
L◦I(i,,j,,) = L(i,,j,,)[[BR]] f◦I(i,,j,,) = f(i,,j,,)
Indexing at Depth and Positional Retrieval
Next, we show the deeper correspondence between list indexing and multivalent function evaluation. Notationally, a nested list is a list of lists, but it can also be viewed functionally as a compact form of the input-output relationship for a multivariate map. This mapping transforms tuples of integers onto the constituent atoms of the list and has valence equal to one plus the level of nesting of the list.
For example, a list with no nesting is a monadic map of integers to its atoms via item indexing.
L1:(1;2h;`three;"4") L1[3] "4"
A list with one level of nesting can be viewed as an irregular (or ragged) array by laying its rows out one above another. For example, the list L2 specified as
L2:((1b;2j;3.0);(4.0e;`five);("6";7;0x08;2000.01.10))
can be thought of an the ragged array. The show function does just this,
show L2 (1b;2j;3f) (4e;`five) ("6";7;0x08;2000.01.10)
This representation of a ragged array is a generalization of the I/O table for monadic maps. From this perspective, indexing at depth is a function whose output value is obtained by indexing into the ragged array via position. In other words, the output value L2[i;j] is the j^{th} element of the i^{th} row,
L2[1;0] 4.0e
This motivates the interpretation of L2 as dyadic map over a sub-domain of the two-dimensional Cartesian product of non-negative integers and with range equal to the atoms of L2. The tuple i,j is mapped positionally, analogous to simple item indexing.
Projection and Index Elision
You have also noticed that the notations of function projection and eliding indices in a list are identical. Revisiting the example of Elided Indices,
L :((1 2 3;4 5 6 7);(`a`b`c`d;`z`y`x`;`0`1`2);("now";"is";"the"))
Define the list L1 by eliding the first and last index, as
L1:L[;1;] L1 (4 5 6 7;`z`y`x`w`v;"is")
Viewing L as a map of valence three whose output value is obtained by indexing at depth, this makes L1 the projection of L onto its second argument. From this perspective, L1 is a dyadic map that retrieves values from a sub-list,
L1[1;2] `x
Out of Bounds Index
The previous discussion also motivates the explanation for the behavior of item indexing in case an "out of bounds" index is presented. In verbose languages, this would either result in some sort of error—the infamous indexing off the end of an array in C—or an exception in Java and C#.
By viewing a list as a function defined on a sub-domain of integers, it is reasonable to extend the domain of the function to all integers by assigning a null output value to any input not in the original domain. In this context, null should be thought of as "missing value." This is exactly what happens.
In the following examples, observe that the type of null returned matches the item type for simple lists and is 0N for a general list
L1:1 2 3 L1[-1] 0N L2:100.1 200.2 300.3 400.4 L2[100] 0n L3:"abcde" L3[-1] " " L4:1001101b L4[7] 0b L5:(1;`two;3.0e) L5[5] 0N
Creating Strings from Data
As mentioned earlier, q strings are simple lists of char, which play a role similar to strings in verbose languages. It is possible to convert data into strings, akin to the toString() method in O-O languages.
The string function can be applied to any q entity to produce a textual representation suitable for display or use in external contexts such as text editors, Excel, etc. In particular, the string result does not contain any q formatting information. Also, note that the result of string is always a list of char. Following are some examples,
string 42 "42" string 6*7 "42" string 42422424242j "42422424242" string `Zaphod "Zaphod"
Adverbs
Syntactically q has nouns, verbs and adverbs. Data entities such as atoms, lists, dictionaries and tables are nouns. Functions are also nouns. Primitive symbol operators and operations expressed in infix notation are verbs. For example, in the expression,
c:a+b
a, b and c are nouns, while : and + are verbs. On the other hand, in
c:+[a;b]
a, b, c and + are nouns, while : is a verb.
An adverb is an entity that modifies a verb or function to produce a new verb or function whose behavior is derived from the original.
The following adverbs in q are quite useful.
Symbol | Name |
' | each |
/: | each right |
\: | each left |
The character that represents each is the single quote (') which is distinct from the back tick (`) used with symbols.
each-both (')
Loosely speaking, the adverb each-both (') modifies a verb or function by applying its behavior item-wise to corresponding list elements. This concept is similar to the manner in which an atomic verb or function is extended to lists.
There cannot be any whitespace between ' and the verb it modifies.
Perhaps the most common example of each is join-each ( ,' ) which concatenates two lists item-wise. In its base form, join takes two lists and returns the result of the second appended to the first.
L1:1 2 3 4 L2:5 6 L1,L2 1 2 3 4 5 6
Two lists of the same count can be joined item-wise to from pairs.
L3:100 200 300 400 L1,'L3 (1 100;2 200;3 300;4 400)
As in the case of item-wise extension of atomic functions, the two arguments must be of the same length, or either can be an atom.
L1,'1000 (1 1000;2 1000;3 1000;4 1000) `One,'L1 ((`One;1);(`One;2);(`One;3);(`One;4)) "a" ,' "z" "az"
When both arguments of a derived function are atoms, the adverb has no effect.
3,'4 3 4
A useful example of join-each arises when both arguments are tables. Since a table is a list of records, it is possible to apply join-each to tables with the same count. The item-wise join of records results in a sideways join of the tables.
show t1 c1 -- 1 2 3 show t2 c2 -- a b c show t1,'t2 c1 c2 ----- 1 a 2 b 3 c
Monadic each
There is a form of each that applies to monadic functions and unary operators. The monadic each can be notated in two equivalent ways.
reverse each (1 2;`a`b`c;"xyz") (2 1;`c`b`a;"zyx") each[reverse] (1 2;`a`b`c;"xyz") (2 1;`c`b`a;"zyx")
The latter form underscores the fact that each transforms a function into a new function. It is arguably be more readable when the base operation is a projection.
(1#) each 1001 1002 1004 1003 (,1001;,1002;,1004;,1003) each[1#] 1001 1002 1004 1003 (,1001;,1002;,1004;,1003)
Observe that the result of the last example can also be obtained with enlist.
enlist each 1001 1002 1004 1003 (,1001;,1002;,1004;,1003) flip enlist 1001 1002 1004 1003 (,1001;,1002;,1004;,1003)
The last expression executes most quickly for long lists.
each-left (\:)
The each-left adverb (\:) modifies the base function so that it applies the entire second argument to each item of the first argument.
There cannot be any whitespace between \: and the verb it modifies.
To append a given string to every string in a list,
("Now";"is";"the";"time") ,\: ", " ("Now, ";"is, ";"the, ";"time, ")
each-right (/:)
The each-right adverb (/:) modifies the base function so that it applies the entire first argument to each item of the second argument.
There cannot be any whitespace between /: and the verb it modifies.
To prepend a given string to every string in a list,
" ,",/:("Now";"is";"the";"time") (" ,Now";" ,is";" ,the";" ,time")
Cartesian Product (,/:\:)
To achieve a Cartesian product of two lists, begin with join-right ,/: and modify it with each-left. The net effect is to join every item of the first argument with every element of the second argument.
L1:1 2 L2:`a`b`c L1,/:\:L2 (((1;`a);(1;`b);(1;`c));((2;`a);(2;`b);(2;`c)))
There is an extra level of nesting that can be eliminated with raze.
raze L1,/:\:L2 ((1;`a);(1;`b);(1;`c);(2;`a);(2;`b);(2;`c))
You can also begin with join-left ,\: and modify it with each-right.
raze L1,\:/:L2 ((1;`a);(2;`a);(1;`b);(2;`b);(1;`c);(2;`c))
Observe that the order of the resulting items for ,/:\: and that for ,\:/: is transposed.
Cartesian product is also encapsulated in the function cross.
L1 cross L2 ((1;`a);(1;`b);(1;`c);(2;`a);(2;`b);(2;`c))
Verb Forms of Indexing and Evaluation
We are familiar with the syntactic forms of indexing and function application using either square brackets or juxtaposition.
L:(1 2;3 4 5; 6) L[0] 1 2 L[0 2] (1 2;6) L 0 2 (1 2;6) L[1;2] 5 f:{x*x} f[0] 0 f[0 2] 0 4 f 0 2 0 4 g:{x+y} g[1;2] 3
There are equivalent verb forms for indexing and function application. The verb forms are called "index" or "apply," depending on the context.
@
The verb @ takes a list or a unary function as its left operand and a list of indices or a list of arguments as its right operand. For a list operand, @ returns the items as specified by the right operand—i.e., indexing at the top level. For a function operand, @ returns the result of applying the function to the arguments.
With L and f as above,
L@0 1 2 L@0 2 (1 2;6) f@0 0 f@0 4 0 16
The handling of a niladic functions with @ requires any scalar argument.
fn:{6*7} fn[] 42 fn@0 42
Dot (.)
The verb . takes a list or a multivalent function as its left operand and a list of indices or a list of arguments as its right operand. For a list left operand, . returns the result of indexing the list at depth as specified by the right operand. For a function left operand, . returns the result of applying the function to the arguments.
The verb . must be separated from both its operands by whitespace.
With L and g as above,
L . 1 2 5 g . 1 2 3
The verb . evaluates functions of any valence. This is useful when the function or arguments are supplied programmatically and the valence cannot be known beforehand.
The right argument of . must be a list.
f . 4 'type f . enlist 4 16f . 4
The handling of a niladic functions with . requires a singleton argument, which is arbitrary.
fn:{6*7} fn[] 42 fn . enlist 0 42
Functional Forms of Amend
Two functions of valence four are the foundation of q as a functional language. Each represents the result of applying a dyadic function to items of a list and a supplied parameter.
Dot (.)
The general form of functional . is
- .[L;I;f;y]
Here L is any list, I is a list of indices into L, f is a dyadic function and y is an atom. The result is the item-wise application, to the items of L indexed at depth by I, of f and the parameter y.
For example, to add 42 to an item in a sublist,
L:(100 200;300 400 500) I:1 2 .[L;I;+;42] (100 200;300 400 542)
To replace the same item,
.[L;I;:;42] (100 200;300 400 42)
Observe that the argument L is not modified,
L (100 200;300 400 500)
In order to change the list argument, it must be referenced by name.
L:(100 200;300 400 500) .[`L;I;:;42] / update L `L L (100 200;300 400 42)
Note that the result of functional amend with a reference by name is the name of the entity affected and not an error message.
The general form of functional . for a monadic function is,
- .[L;I;f]
Here L is any list, I is a list of indices into L, and f is a monadic function. The result is the item-wise application of f to the items of L indexed at the top level by I.
For example,
L:(100 200;300 400 500) I:1 2 .[L;I;neg] (100 200;300 400 -500)
Apply (@)
The general form of functional @ is
@[L;I;f;y]
Here L is any list, I is a list of indices into L, f is a dyadic function and y is an atom. The result is the item-wise application to the items of L, indexed at the top level by I, of f and the parameter y.
For example, to add 42 to items in a list,
L:100 200 300 400 I:1 2 @[L;I;+;42] 100 242 342 400
To replace these items,
@[L;I;:;42] 100 42 42 400
Observe that the argument L is unchanged,
L 100 200 300 400
In order to change the list argument, it must be referenced by name.
@[`L;I;:;42] / update L `L L 100 42 42 400
Note that the result of functional amend with a reference by name is the name of the entity affected and not an error message.
The general form of functional @ for a monadic function is,
@[L;I;f]
Here L is any list, I is a list of indices into L, and f is a monadic function. The result is the item-wise application of f to the items of L indexed at the top level by I.
For example,
L:101 102 103 I:0 2 @[L;I;neg] -101 102 -103
Prev: Primitive Operations, Next: Casting and Enumerations
©2006 Kx Systems, Inc. and Continuux LLC. All rights reserved.