1.15 Dictionaries and Tables 101
After lists, the second basic data structure of q is the dictionary, which models key-value association. A dictionary is constructed from two lists of the same length using the (!) operator. The left operand is the list of keys (presumably unique, though uniqueness is not enforced) and the right operand is the list of values. A dictionary is a first-class value, just like an integer or a list, and can be assigned to a variable.
q)`a`b`c!10 20 30
a| 10
b| 20
c| 30
q)d:`a`b`c!10 20 30
Observe that the console display of a dictionary looks like the I/O table of a mathematical mapping. This is no coincidence.
Given a key, we retrieve the associated value with the same square bracket notation as list indexing.
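As a minimal illustration, assuming the dictionary d assigned above, retrieval by key might look like this:

```q
q)d[`a]
10
q)d[`b]
20
```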
A useful class of dictionary has as keys a simple list of symbols and as values a list of lists of uniform length. We think of such a dictionary as a named collection of columns and call it a column dictionary.
q)`c1`c2!(10 20 30; 1.1 2.2 3.3)
c1| 10 20 30
c2| 1.1 2.2 3.3
q)dc:`c1`c2!(10 20 30; 1.1 2.2 3.3)
Retrieving by key yields the associated column, which is itself a list and so can be indexed.
q)dc[`c1]
10 20 30
q)dc[`c1][0]
10
q)dc[`c2]
_
Whenever such iterated indexing of nested entities arises in q, there is an equivalent syntactic form, called indexing at depth, to make things a bit more readable.
q)dc[`c1][0]
10
q)dc[`c1; 0]
10
q)dc[`c1; 1]
_
q)dc[`c1; 2]
_
Indexing at depth notation suggests thinking of dc as a two-dimensional entity; this is reasonable in view of its display above. Let’s pursue this. Whenever an index is elided in q, the result is as if every legitimate value had been specified in the omitted index position. For a column dictionary, this yields the associated column when the second slot is omitted.
q)dc[`c1;]
10 20 30
q)dc[`c2;]
_
Things are more interesting when the index in the first slot is elided. The result is a dictionary comprising a section of the original columns in just the specified position.
q)dc[;0]
c1| 10
c2| 1.1
q)dc[;1]
_
q)dc[;2]
_
To summarize, we have an entity that retrieves columns in the first slot and section dictionaries in the second slot. The issue is that columns are conventionally accessed in the second slot of two-dimensional things. No problem. We apply the built-in operator flip (better called “transpose”) to reverse the order of indexing. We still have the same column dictionary but slot retrieval is reversed: columns are accessed in the second slot and section dictionaries are retrieved from the first slot.
q)t:flip `c1`c2!(10 20 30; 1.1 2.2 3.3)
q)t[0; `c1]
10
q)t[1; `c1]
_
q)t[2; `c1]
_
q)t[0; `c2]
_
q)t[; `c1]
10 20 30
q)t[0;]
c1| 10
c2| 1.1
We emphasize that the data is still stored as a column dictionary under the covers; only the indexing slots are affected. Observe that the console display of a flipped column dictionary is indeed the transpose of the column dictionary display and in fact looks like ... a table.
q)flip `c1`c2!(10 20 30; 1.1 2.2 3.3)
c1 c2
------
10 1.1
20 2.2
30 3.3
A flipped column dictionary, called a table, is a first-class entity in q. In the table setting, the section dictionaries are called records of the table. They correspond to the rows of SQL tables. To see why, observe that the record at index 0 is effectively the horizontal slice of the table in “row” 0. Let’s reexamine record retrieval, this time omitting the optional trailing semicolon from the elided second index.
q)t[0]
c1| 10
c2| 1.1
q)t[1]
_
q)t[2]
_
Looking at this syntactically, we might conclude that t is a list of record dictionaries. In fact it is, at least logically; physically a table is always stored as a collection of named columns.
Thus we have arrived at:
- A table is a flipped column dictionary.
- It is also a list of record dictionaries.
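One way to check the second statement is to build a table directly from record dictionaries. The following is a sketch that assumes standard q behavior, in which a list of dictionaries with identical keys is itself a table; the name t2 is introduced here purely for illustration.

```q
q)t2:(`c1`c2!(10;1.1); `c1`c2!(20;2.2))
q)t2
c1 c2
------
10 1.1
20 2.2
q)type t2
98h
```

The type 98h is the type q reports for a table, the same as for a flipped column dictionary.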
While we can always construct a table as a flipped column dictionary, there is a convenient syntax that puts the names together with the columns. The notation looks a bit odd at first but it will seem more reasonable when we encounter keyed tables later.
q)([] c1:10 20 30; c2:1.1 2.2 3.3)
c1 c2
------
10 1.1
20 2.2
30 3.3
A few notes.
- The square brackets are necessary to differentiate a table from a list.
- The occurrence of ‘:’ is not assignment. It is merely a syntactic marker separating the name from the column values.
- The column names in a table definition are not symbols, although they are converted to symbols under the covers.
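As a quick check of these notes, assuming the table t defined earlier as a flipped column dictionary, the two constructions should match, and the column names come back as symbols:

```q
q)t~([] c1:10 20 30; c2:1.1 2.2 3.3)
1b
q)cols ([] c1:10 20 30; c2:1.1 2.2 3.3)
`c1`c2
```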
Reprinted with the author's permission from: q for Mortals Version 3, An Introduction to Q Programming by Jeffry A. Borror.
©2015 Jeffry A. Borror/ q4m LLC