QforMortals/primitive operations

From Kx Wiki

Primitive Operations

Introduction to Functions

Function Notation

Operators and functions are closely related. In fact, operators are just functions used with infix notation. We shall cover functions in depth later, but we provide a brief overview here. Function evaluation in q uses square brackets to enclose the arguments and semicolons to separate them. Thus the output value of a monadic function f for the input x is written,

	f[x]

Similarly, the value of a dyadic function is written,

	f[x;y]

The simplest functions are those whose domain and range are atomic data types. These functions are called (what else?) atomic functions.

Primitives, Verbs and Functional Notation

The normal way of writing addition in mathematics and most programming languages uses an operator with infix notation—that is, a plus symbol between the two operands,

	2+3

We can also think of addition as a dyadic function that takes two numeric arguments and returns a numeric result. You probably wouldn't think twice at seeing,

	sum[a;b]

But you might blink at the following perfectly logical equivalent,

	+[a;b]

A dyadic function that is written with infix notation is called a verb. This terminology arises from thinking of the left operand as the subject which acts on the right operand as object.

The primitive operators are the built-in atomic verbs, including the basic arithmetic, relation and comparison operators. Some are represented by a single ASCII symbol such as '+', '-', '=', and '<'. Others use compound symbols, such as '<=', '>=', and '<>'. Still others have names such as 'not', 'neg'. The extent of operations is not limited to the primitives, since any monadic or dyadic function can be made into a verb.

Any verb, including all the primitive operators, can also use regular function notation. So, in q you can indeed write,

	+[2;3]
5

It is even possible, and sometimes useful, to write a binary verb using a combination of infix and functional notation for the two operands. This may look very strange at first,

	(2+)[3]
5
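
As a further sketch (these examples are illustrative additions, not part of the original session), the same notations can be tried with subtraction and less-than. Because subtraction is not commutative, the operand order is easy to see.

	-[5;2]            / functional form of 5-2
3
	(10-)[4]          / left operand fixed at 10, i.e. 10-4
6
	<[2;3]            / functional form of 2<3
1b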

Item-wise Extension of Atomic Functions

A fundamental feature of an atomic function or operator is that its domain is extended to lists by item-wise application. Thus, a monadic atomic function is applied to a simple list by operating element-wise on the list. A dyadic atomic operator is extended to operate on an atom and a simple list by applying its operation to the atom and the items in each position of the list. Similarly, a dyadic atomic operator is extended to operate on a pair of simple lists by operating pair-wise on elements in corresponding positions.

Symbolically, let m be a unary atomic verb, op a binary atomic verb, a an atom, L, L1 and L2 simple lists, and i an int index. Then

ith element of    is
m[L]              m[L[i]]
a op L            a op L[i]
L op a            L[i] op a
L1 op L2          L1[i] op L2[i]

For example, the result of applying neg to a simple list is obtained by applying neg to each item of the list,

	L:100 200 300 400
	neg L
-100 -200 -300 -400

The result of adding an atom to a simple list is obtained by adding the atom to each item of the list,

	99+L
199 299 399 499

The result of adding two simple lists of the same length is addition of items at corresponding positions,

	L1:100 200 300 400
	L2:9 8 7 6
	L1+L2
109 208 307 406
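
A small additional sketch, with made-up values: the same item-wise rule applies to the relational operators, and operating on two lists of different lengths is an error rather than a silent truncation.

	10 20 30<15 25 25
110b
	1 2 3+10 20       / lengths differ
'length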

Operator Precedence

Traditional Operator Precedence

Recall that mathematical notation and verbose programming languages have a concept of operator precedence, which attempts to resolve ambiguities in the evaluation of arithmetic and logical operations in expressions. The arithmetic precedence rules were drummed into you in elementary school: multiplication and division are equal and come before addition and subtraction, etc. There are similar precedence rules for =, <, >, 'and' and 'or'.

Left of Right Evaluation

Although the traditional notion of operator precedence has the weight of many years of incumbency (not to mention the imprecations of your fifth grade math teacher), it’s time to throw the bum out. As mentioned in the q philosophy, q has no rules for operator precedence. Instead, it has one simple rule for evaluating any expression:

Expressions are evaluated left of right

We could also say "right to left" since the interpreter evaluates an expression from right-to-left. However, every action in q is essentially a function evaluation, and it is more natural to read "f of x" rather than "x evaluated by f." Thinking functionally makes "of" a paradigm not just a preposition.

The adoption of left-of-right expression evaluation frees q to treat infix notation simply and uniformly. Which notation is used, infix or functional, depends on what is clearer in the specific context.

Left-of-right expression evaluation also means that there is no ambiguity in any expression. (This is from the interpreter's perspective; it is certainly possible to write q expressions comprehensible only to the interpreter and the q gods.) Parentheses can still be used to override the default evaluation order, but there will be far fewer once you abandon the old (bad) habit of using them to override operator precedence. You should arrange your expressions with a goal of placing parentheses on the endangered species list.
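
To make the rule concrete, here is a brief illustrative sketch (the numbers are arbitrary). Each expression is evaluated from the right, so the rightmost operation is performed first unless parentheses say otherwise.

	2*3+4             / read as 2*(3+4)
14
	(2*3)+4
10
	10-2-3            / read as 10-(2-3)
11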

A Gotcha of Left of Right Evaluation

Due to left-of-right evaluation, one situation where parentheses are needed is to isolate the result of an expression that is the left operand of a verb. Omitting such parentheses is a common error for q newbies, as this grouping is often unnecessary in verbose languages.

Here is a canonical example, where < and > have their usual meanings. As we shall see shortly, the | operator returns the maximum of its operands; this reduces to "or" for boolean operands. It is a rite of passage for every q newbie to write the first expression intending the second,

	x:100
	x<42|x>98
0b
	(x<42)|x>98
1b

The first expression parses from right to left as:

x is tested against 98 by greater than, yielding 1b, which is compared for
the larger to 42, yielding 42, against which x is tested by less than,
yielding 0b.

The second expression parses from right to left as:

x is tested against 98 by greater than, yielding 1b, which is compared for
the larger to 0b (being the result of testing x against 42 by less than),
yielding 1b.
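
One way to sidestep the gotcha, offered here as a suggestion rather than a rule, is to write the outer verb in functional notation. Each argument inside the brackets is evaluated on its own, so no parentheses are needed.

	x:100
	|[x<42;x>98]
1b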

Should this seem unnatural, don't worry. Once you complete the next chapter, revisit here and it will be second nature.

Rationale for No Operator Precedence

Operator precedence is quite feeble in that it requires an entire expression to be analyzed (i.e., parsed) before it can be evaluated. Ironically, it results in the frequent use of parentheses to override the very rules that are purportedly there to help.

Even more damning is that operator precedence forces semantic content onto infix notation. Suppose a programming language wished to allow dyadic functions to be verbs—i.e., expressed in infix notation—so that

	f[x;y]

can also be written

	x f y

This would entail the extension of precedence rules to cover verbs whenever they are mixed with arithmetic operations. Aside from being silly, this would result in yet more parentheses.

Match (~)

The non-atomic, binary match operator (~) applies to any two entities, returning a boolean result of 1b if they are identical and 0b otherwise. For two entities to match, they must have the same shape, the same type and the same value(s), but they may occupy separate storage locations. Colloquially, clones are considered identical in q because they are indistinguishable.

Note: This differs from the notion of identity in some verbose languages, in that distinct q entities can be identical. For example, in languages of C ancestry, objects are equal if and only if their underlying pointers address the same memory location. Identical twins are not equal. You must write your own equivalence method to determine if one object is a deep copy of another.

There are no restrictions as to the type or shape of the two operands for match. Try to predict each of the following results of match,

	42~42
1b
	42~42h
0b
	42f~42.0
1b
	42~`42
0b
	`42~"42"
0b
	4 2~2 4
0b
	42~(4 2;(1 0))
0b
	(4 2)~(4; 2*1)
1b
	(1 2; 3 4)~(1; 2 3 4)
0b

While you are learning q, applying match can be an effective way to determine if you have entered what you intended, or to discover whether two different ways of expressing something produce the same result. For example, q newbies often trip over

	42~(42)
1b

This technique can be useful in checking intermediate results when debugging (except for the q gods who enter perfect q code every time).
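
For instance, the following sketch (illustrative values only) uses match to confirm that two ways of building a list agree, and to expose a difference in shape that is easy to miss.

	(2*1 2 3)~2 4 6
1b
	(1+2 3)~3 4
1b
	(1;2 3)~1 2 3     / same items, different shape
0b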

Relational Operators

The relational operators are atomic verbs that return a boolean result. Relational operations on atomic types have requirements regarding the compatibility of the operands.

Equality (=) and Inequality (<>)

We begin with the equality operator (=) which differs from match in that it is atomic, so it tests its operands component-wise instead of as a whole. All atoms of numeric or character type are mutually compatible for equality, but symbols are compatible only with symbols. Equality is not strict with regard to type, meaning types with the same underlying value are equal. For example, chars are equal to their underlying values.

	42h=2*21
1b
	42=42.0
1b
	42=(42)
1b
	42=0x42
0b
	42="*"
1b
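
The contrast with match can be seen on lists. In this illustrative sketch, equality compares item by item and ignores the difference between int and float, whereas match compares the two lists as wholes, type included.

	1 2 3=1 2 3f
111b
	1 2 3~1 2 3f
0b
	1 2 3=1 0 3
101b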

A symbol and a character are not compatible and an error results from the test,

	`a="a"
'type

The not-equal primitive is (<>).

	42<>0x42
1b
Note: The expression a<>b is equivalent to not a=b.
	a:42
	b:98.6
	a<>b
1b
	not a=b
1b
Note: When comparing floats, q uses multiplicative tolerance, which makes arithmetic give more rational results.
	r:1%3
	r
0.3333333333333333
	2=r+r+r+r+r+r
1b
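
The classic decimal example behaves the same way. This is a sketch that assumes the default comparison tolerance; in many languages the analogous test is false, because 0.1+0.2 is not exactly 0.3 in binary floating point.

	0.3=0.1+0.2
1b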

Not Zero (not)

The monadic, atomic relational operator not differs from its equivalent in some verbose languages. It returns a boolean result and has a domain of all numeric and character types; it is not defined for symbols. The not operator generalizes the reversal of true and false values by testing its argument against an underlying value 0. In other words, it answers the Hamletonian question: to be, or not to be, zero.

The test against zero yields the expected results for boolean arguments

	not 0b
1b
	not 1b
0b

More generally, the test against zero applies to any numeric type,

	not 42
0b
	not 0
1b
	not 0j
1b
	not 0xff
0b
	f:98.6
	not f
0b
	not 0.0
1b

For char, not returns false except for the character representing the underlying value of 0.

	not "a"
0b
	not " "
0b
	not "\000"
1b

For date and datetime values, not tests against midnight of Jan 1, 2000, since this is the date with underlying value 0.

	not 2042.04.02
0b
	not 2000.01.01T00:00:00.000
1b
	not 2000.01m
1b

The last example, a month value (note the trailing m), obtains because omitted temporal constituents default to their underlying numeric 0 values.

For time values, not tests against 00:00:00.000,

	not 00:00:00.000
1b
	not 04:02:42.042
0b

Ordering: <, <=, >, >=

We now consider the binary atomic order operators. Less than (<), greater than (>), less or equal (<=) and greater or equal (>=) are defined for all atoms with the requirement that the operands be of compatible types. Numeric and char types are mutually compatible, but symbols are only compatible with symbols. Comparison for numeric and char types is based on underlying numeric value, independent of type.

	4<42
1b
	4h>=0x2a
0b
	-1.59e<=99j
1b

For char atoms, the underlying numeric value results in comparison according to ASCII character sequence.

	"A"<"Z"
1b
	"a"<="Z"
0b
	"A"<"0"
0b
	"?"<"/"
0b

A numeric atom and a char are compared according to the underlying numeric value of the char.

	42<"z"
1b

For symbols, comparison is based on lexicographic order.

	`a>=`b
0b
	`ab<`abc
1b

Now that we are familiar with relational operations on atoms, let’s check out their item-wise extensions to simple lists.

	not 0110101100b
1001010011b
	2<1 2 3
001b
	1 2 3h>=-987.65 1.234 567.89
110b
	" "="Life the Universe and Everything"
00001000100000000100010000000000b
	"zaphod"="Arthur"
000100b
	"zaphod">"Arthur"
100000b
Note: As of this writing (Oct 2006), the primitive > is apparently converted to the equivalent < under the covers by the q interpreter. That is,
	a  >  b

is actually evaluated as

	b  <  a

This does not matter when a and b are atoms or lists, but it does have consequences when they are dictionaries.

Basic Arithmetic: +, -, *, %

The arithmetic operators are atomic verbs and come in two flavors: binary (in the mathematical sense of having two operands) and unary (one operand). We begin with the four operations of elementary arithmetic.

Symbol  Name      Example
+       add       42+67
-       subtract  42-3
*       times     2h*3h
%       divide    42%6
On the surface, things look pretty much like other programming languages, except that division is represented by % since / is used for delimiting comments. We have,

	6*7
42
	a:42
	b:3
	c:a-b
	c
39
	100*a
4200
	c%b
13f
Note: The result of division is always a float.

For a programmer not used to left-of-right evaluation, the following may take some getting used to,

	2*1+1
4

Things can get funky fast for the q newbie,

	c:1000*b:1+a:42
	c
43000

One way to read this is:

The integer value 42 is assigned to the variable named a, then the assigned
value is added to 1, then this result is assigned to the variable named b,
whose assigned value is multiplied by 1000 and the result is assigned to the
variable named c.
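
Inspecting the intermediate variables confirms this reading (an added check, continuing the session above).

	a
42
	b
43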

The arithmetic operations are defined for all numeric types, and all numeric types are compatible. The type of the result depends on the operands. Loosely speaking, smaller types are promoted to their wider cousins and division always results in floats. Typing does not get in the way of arithmetic.

When boolean or byte types participate in addition, subtraction and multiplication, they are promoted to int. In other words, arithmetic is not performed modulo 2 for boolean values or modulo 256 for byte values.

	1b+1b
2
	0x2a+0x11
59
	42+1b
43
	5*0x2a
210

When integer types are mixed in addition, subtraction and multiplication, the result is an int or the widest type present, whichever is wider.

	a:42
	b:123h
	c:1234567890j
	a+b
165
	a+b+c
1234568055j

The result of addition, subtraction and multiplication of integer data types is modulo the width of the result. That is, overflow is ignored. For example, int arithmetic is modulo 2^32.

	i:2147483646
	i+i
-4

When any numeric types participate in division they are promoted to float and the result is a float.

	1%3
0.3333333
	3%1
3f

When floating point data types are mixed, the result is float.

	6.0*7.0e
42f
Note: The arithmetic operators are always dyadic. In particular, while (-) is also used syntactically to denote a negative number, there is no unary function (-) to negate a value. Attempting to use it this way generates an error. Use the function neg for this purpose.
	a:-4
	a
-4
	-a                / This is an error
'-
	neg a
4

Arithmetic operators are extended item-wise to lists. Thus,

	2+100 200 300
102 202 302
	b:1000.0 2000.0 3000.0 4000.0
	b*2
2000 4000 6000 8000f
	c:2 4 6 8
	b%c
500 500 500 500f

In the following example, observe that item-wise atomic application is recursive when all the list components are numeric,

	e:(100 200;1000 2000)
	e-2
(98 198;998 1998)
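
The nested lists need not have the same length. As an illustrative check, using match so that the result does not depend on how the console displays nested lists:

	(2*(1 2;3 4 5))~(2 4;6 8 10)
1b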

Max (|) and Min (&)

The comparison operators are atomic and binary, and return the type of the widest operand. Numeric types and char are mutually compatible; comparison is not defined for symbols.

The max operation (|) returns the maximum of its operands based on underlying numeric values; this reduces to logical "or" for boolean operands. The min operation (&) returns the minimum of its operands based on underlying numeric values; this reduces to logical "and" for boolean operands. The same type promotion rules apply as for the arithmetic operators,

	0b|1b
1b
	1b&0b
0b
	42|0x2b
43
	4.2e&42j
4.2e
	"a"|"z"
"z"
	"0"&"A"
"0"
	`a|`z
'type

Following are examples of comparison extended item-wise to simple lists,

	2|0 1 2 3 4
2 2 2 3 4
	11010101b&01100101b
01000101b
	"zaphod"|"arthur"
"zrthur"
Note: For the symbolically challenged, the operator | can also be written as or, and the operator & as and.
	1 and 3
1
	"a" or "z"
"z"

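Because max and min are atomic, they combine naturally to clamp values to a range. The following is a sketch with arbitrary bounds 0 and 10. Reading from the right, the list is first capped at 10 and the result is then raised to at least 0.

	0|10&-5 3 12 7
0 3 10 7
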
Exponential Primitives: sqrt, exp, log, xexp, xlog

sqrt

The atomic unary sqrt has as domain all non-negative numeric values and returns a float representing the square root of its argument.

	sqrt 2
1.414214
	sqrt 4
2f
	sqrt 0x42
8.124038
	sqrt -1
0n

exp

The atomic unary exp has as domain all numeric values and returns a float representing the base e raised to the power of its argument.

	exp 1
2.718282
	exp 4.2
66.68633
	exp -12h
6.144212e-006

log

The atomic unary log has as domain all numeric values and returns a float representing the natural logarithm of its argument.

	log 1
0f
	log 0x2a
3.73767
	log 0.0001
-9.21034
	log -1
0n

xexp

The atomic binary xexp has as domain all numeric values in both operands and returns a float representing the left operand raised to the power of the right operand. If the mathematical operation does not make sense, the result is 0n.

	2 xexp 5
32f
 	-2 xexp .5
0n
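
A few further illustrative cases (added here as a sketch), including a fractional power and item-wise application over a list on the right:

	2 xexp 10
1024f
	9 xexp 0.5
3f
	2 xexp 0 1 2 3
1 2 4 8f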

xlog

The atomic binary xlog has as domain all numeric values in both operands and returns a float representing the logarithm of the right operand with respect to the base of the left operand. If the mathematical operation does not make sense, the result is 0n.

	2 xlog 32
5f
 	2 xlog -1
0n

More Mathematical Primitives: mod, signum, reciprocal, floor, ceiling and abs

These functions are useful in calculations.

Modulus (mod)

The binary mod is atomic in its left operand (dividend) which is any numeric value. The right operand (divisor) is a numeric atom. The result is the remainder of dividing the dividend by the divisor. This produces the usual remainder from elementary school for positive integers but is somewhat more complex for general numeric arguments.

For a positive divisor, the remainder is defined as the difference between the dividend and the largest integral multiple of the divisor not exceeding the dividend. For a negative divisor, the result is -1 times the result obtained with the sign reversed on both arguments. The type of the result is int if the operands are of integer type and float otherwise.

	4 mod 3
1
	0x2a mod 0x10
10
	4.5 mod 2.3
2.2
	-4.5 mod 2.3
0.1
	-4.5 mod -2.3
-2.2
	4.5 mod -2.3
-0.1
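
For a positive divisor, mod agrees with subtracting from the dividend the divisor times the floor of the quotient. The following sketch checks this for one arbitrary pair of values; x and y are names introduced here for illustration.

	x:-7
	y:3
	x mod y
2
	x-y*floor x%y     / quotient -2.33.. floors to -3, so -7 - -9
2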

Sign (signum)

The atomic unary signum has as domain all integral and floating point types and returns an int representing the sign of its argument. Here 1 represents "positive", -1 represents "negative" and 0 represents a zero argument.

	signum 4.2
1
	signum -42
-1
	signum 0
0

reciprocal

The atomic unary reciprocal has as domain all numeric types and returns a float representing 1.0 divided by the argument.

	reciprocal 0.02380952
42.00001
	reciprocal 0
0w

floor

The atomic unary floor has as domain int, long and floating point types and returns an int representing the largest integer that is less than or equal to its argument.

	floor 4
4
	floor 4.0
4
	floor 4.2
4
	floor -4.0
-4
	floor -4.2
-5

The floor operator can be used to truncate or round floating point values to a specific number of digits to the right of the decimal.

	a:4.242
	0.01*floor 100*a
4.24
	0.1*floor 0.5+10*a
4.2
Note: The floor function does not apply to boolean, byte or short types.
	floor 0x2a
'type

ceiling

Analogous to floor, the atomic unary ceiling has as domain int, long and floating point types and returns the smallest int that is greater than or equal to its argument,

	ceiling 4
4
	ceiling 4.0
4
	ceiling 4.2
5
	ceiling -4.0
-4
	ceiling -4.2
-4
Note: For some reason, ceiling does apply to boolean or byte types but not to short types.
	ceiling 0b
0
	ceiling 42h
'type

Absolute Value (abs)

The atomic unary abs has as domain all integral and floating point types. It returns its argument if the argument is greater than or equal to zero, or neg applied to its argument otherwise. The result of abs has the same type as the argument.

	abs 4
4
	abs -4
4
	abs -4.2
4.2
	abs -4.0
4f
	abs -4.2e
4.2e
	abs -4j
4j

Operations on Temporal Values

We have separated temporal types and their operations into this section because they have richer semantics.

Internal Format of Temporal Types

First, we note that a date or datetime is actually stored under the covers as a signed float, with 0 corresponding to midnight of January 1, 2000. So,

	0.0=2000.01.01T00:00:00.000
1b

The integral part of the floating point value corresponds to the number of days after (positive) or before (negative) the start of the millennium. The decimal portion of a datetime is the fractional portion of a 24-hour day represented by its time component. Thus,

	33.5=2000.02.03T12:00:00.000
1b

Time is stored as the number of milliseconds from the start of day. Thus, a time value is between 0 and 86,400,000 (24*60*60*1000). So,

	43200000=12:00:00.000
1b

Basic Operations

Any expression involving temporal types and numerical types that should make sense actually does, and it works in the expected fashion. Comparison of dates or datetimes reduces to comparison of the underlying floating point values. Thus,

	2006.01.01T00:00:00.000<2005.12.25T12:00:00.000
0b
	2005.12.25=2005.12.25T00:00:00.000
1b
	2005.12.25<2005.12.25T12:00:00.000
1b

Time values can be compared with each other and the result is based on the underlying millisecond counts.

	12:01:10.987<17:05:42.986
1b

A date and a time can be added to give a datetime.

	2007.07.04+12:45:59.876
2007.07.04T12:45:59.876

Note that a time is implicitly converted to a fractional day when it is added to a date to get a datetime.

Mixing Temporal Types with Day Counts and Time Counts

A date or datetime can be compared, or tested for equality, with a float,

	366.0=2001.01.01
1b

A time can be compared with an int.

	43200000<12:00:00.001
1b

A float representing a fractional day count can be added to or subtracted from a datetime (or date) to give a datetime. In this context, the integral part of the fractional day count represents the number of days and the decimal part represents the fractional part of a 24-hour day. For example, to move forward 33 days and 12 hours,

	2000.01.01T00:00:00.000+33.5
2000.02.03T12:00:00.000

Or, to move back 2 hours and 30 minutes,

	2000.01.01T00:00:00.000-2.5%24
1999.12.31T21:30:00.000

An int representing a day count can be added to or subtracted from a date to give a date.

	2006.07.04+5
2006.07.09

The difference of two datetimes is a float representing the fractional day count between them.

	2007.02.03T12:00:00.000-2007.01.01T00:00:00.000
33.5

The difference between two dates is an int day count representing the number of days between them.

	2006.07.04-2006.04.04
91
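
Date arithmetic can also be checked with equality, which sidesteps any difference in how a console displays the results. A small sketch with arbitrary dates; note that 2000 was a leap year.

	2000.03.01=2000.02.28+2
1b
	3=2006.07.04-2006.07.01
1b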

An int representing a time count of milliseconds can be added to or subtracted from a time to give a time.

	12:00:00.000+1000
12:00:01.000

The difference between two times is an int count of the number of milliseconds between them.

	23:59:59.999-00:00:00.000
86399999

Observe that a time does not wrap when it exceeds 24 hours,

	23:59:59.999+2
24:00:00.001

Operations on Infinities and Nulls

As you gain experience with the way q handles infinities and nulls, you’ll find that it is simpler and more rational than verbose languages. Injection of such an exceptional value into a calculation stream propagates through subsequent steps in a predictable way without the need for special error trapping and handling. While the end result will contain some meaningless data, portions that do not depend on the invalid values will still compute correctly.

Producing Infinities

We show how to produce and operate with the infinities we met earlier.

	4%0
0w
	3.14%0.0
0w
	0x32%0
0w
	1b%0
0w

Similarly, division of a negative numeric value by 0 results in negative infinity, denoted by -0w.

	-4%0.0
-0w
	-3.14%0
-0w

Producing NaN

When any numeric zero is divided by zero, the mathematical result is undefined. This is sometimes represented in writing as NaN ("not-a-number"). It is denoted in q by 0n, which is also the float null value,

	0%0
0n
	0.0%0.0
0n
	0.0e%0b
0n
	0j%0x00
0n

Basic Arithmetic on Infinities and Nulls

The infinities and nulls act reasonably in numeric expressions and comparisons. As you would expect, if one member of an expression is infinite or null so is the result. In an arithmetic mix of infinity, null or NaN, null prevails over infinity and NaN prevails over other nulls. Note that the signs of infinities are carried correctly through arithmetic and meaningless expressions involving infinities result in NaN.

	2+0w-3
0w
	0w*-0w
-0w
	-0w+0w
0n
	42+0n
0n
	42+0N
0N
	0w+0n
0n
	0n+0N
0n
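
Because arithmetic is atomic, a null in one position of a list affects only that position. A brief illustrative sketch with arbitrary values:

	1 2 3+0N 5 0N
0N 7 0N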

Type Promotion

Observe that when nulls occur in expressions of mixed type, the same type promotion rules apply as for finite values.

	42+0N
0N
	42j+0N
0Nj
	0N+0Nj
0Nj
	0n+0N
0n

Equality

Infinities are distinct from all numeric values and from all nulls as well, since they do not represent missing data. All nulls are equal since they differ only by type.

	42=0w                / can compare a numeric value to infinity
0b
	0w=42%0              / can compare infinity to itself
1b
	0=0N                 / 0 is not the same as undefined integer
0b
	0=0n                 / 0 is not the same as undefined float
0b
	0w=0N                / infinity is not the same as undefined integer
0b
	0w=0n                / infinity is not the same as undefined float
0b
	0Nj=0N               / undefined long and undefined int are the same
1b
	0N=0n                / undefined int and undefined float are the same
1b
Note: In contrast to some languages, such as C, separate NaNs are equal.
	(0%0)=0%0
1b

Match

Match is a different story because type matters,

	42~0w                / can try to match a numeric value to infinity
0b
	0w~42%0              / can match infinity to itself
1b
	0~0N                 / 0 does not match an undefined integer
0b
	0~0n                 / 0 does not match undefined float
0b
	0w~0N                / infinity does not match undefined integer
0b
	0w~0n                / infinity does not match undefined float
0b
	0Nj~0N               / undefined long and undefined int do not match
0b
	0N~0n                / undefined int and undefined float do not match
0b

not

The not operator returns 0b for all infinities and nulls since they all fail the test of equality with 0.

	not 0w
0b
	not 0N
0b
	not 0n
0b

neg

The neg operator returns -1 times its operand, so it reverses the sign on infinities but does nothing to nulls since sign is meaningless for missing data.

	neg -0w
0w
	neg 0N
0N
	not " "
0b

Comparison

Comparisons also apply to infinities and nulls. Positive infinity is greater than any numeric value, negative infinity and any null. Negative infinity is less than any numeric value and positive infinity. Any null is less than both infinities and all numeric values.

	42<0w
1b
	-0w<42
1b
	-0w<1901.01.01
1b
	-0w<0w
1b
	-10000000<0N
0b
	0Nj<42
1b
	0n<-0w
1b

The null symbol is less than any other symbol,

	`a<`                / the right side is the null symbol
0b

Max and Min

Finally, the behavior of | and & with infinities and nulls derives from that of equality and comparison.

	42|0w
0w
	-42&0N
0N
	0w|0n
0w
	-0w&0n
0n
	0n|0N
0n
	0n&0n
0n

Prev: Lists, Next: Functions

©2006 Kx Systems, Inc. and Continuux LLC. All rights reserved.