Tables form the basis of a kdb+ database and are consequently one of the most important data structures in q. Tables I deals with creating and manipulating tables, including inserts and upserts. This tutorial is about queries and joins.
The simplest way of accessing data contained in a table is using a select statement, which looks something like this:
select <columns> from <table> where <condition> select <columns> by <columns> from <table> where <condition>
Using the table
family (from Tables I), here are some examples of select statements.
q)family name age hair eyes ------------------------ John 52 brown blue Mary 49 black brown David 18 blonde blue Jessica 13 black blue Avery 30 black green q)select from family where hair=`black name age hair eyes ----------------------- Mary 49 black brown Jessica 13 black blue Avery 30 black green q)select name, hair from family where age>30 name hair ---------- John brown Mary black
by clause allows you to perform aggregations or group the results based on a particular column or columns.
q)select avg age by hair from family hair | age ------| -------- black | 30.66667 blonde| 18 brown | 52
Select statements always return a table. It can be useful to return a list or dictionary instead – for this we use
q)exec name from family where eyes=`blue `John`David`Jessica q)exec name, eyes from family where hair=`black name| Mary Jessica Avery eyes| brown blue green
Notice that the result is a list if only one column is specified, but a dictionary (with column names as keys) when multiple columns are specified.
Updates and deletes
q)delete eyes from family name age hair ------------------ John 52 brown Mary 49 black David 18 blonde Jessica 13 black Avery 30 black q)update age:age+1 from family where name in `John`Mary name age hair eyes ------------------------ John 53 brown blue Mary 50 black brown David 18 blonde blue Jessica 13 black blue Avery 30 black green
family after these queries – what do you notice? The queries have not modified the table, but have simply returned a result.
To modify the table, we instead pass a reference to the table.
The expression is modified only very slightly: a backtick precedes the table name.
You might remember from the previous tutorial that this was also necessary when modifying a table using
q)update surname:`Quill from `family `family q)delete from `family where name in `David`Jessica `family
family look like now? It should have 3 rows and 4 columns!
Tip: Be very careful with deletes, you don't want accidentally to delete your entire in-memory database!
There are many ways of joining tables in q/kdb+. This tutorial deals with only some!
Let's say we've got the following tables.
q)students:( id:140 265 204 212 367 197 329 242; class:`green`blue`green`orange`green`blue`green`green) q)students id class ---------- 140 green 265 blue 204 green 212 orange 367 green 197 blue 329 green 242 green q)mentors:( class:`green`blue`violet; mentor_name:("Julia Johnson";"Teddy Rowles";"Gerald Carlier")) q)mentors class mentor_name ----------------------- green "Julia Johnson" blue "Teddy Rowles" violet "Gerald Carlier"
The simplest kind of join just pastes the rows of one table onto the end of the other. If the schemas (lists of column names) are identical, we join them using a simple comma. Try typing out each of the following lines.
The second line fails, as the two tables do not have matching schemas.
Left join and inner join each take two tables as inputs: the second table must be keyed, and the first table must contain those keyed columns.
<table> lj <keyed-table> <table> ij <keyed-table>
We can join
mentors on the
class column. Type out the following lines and compare the results.
students lj 1!mentors students ij 1!mentors
The left join contains all of the rows from
students, whereas the inner join contains only those rows matching the class column in
Finally, to join the two tables without matching rows, we use the union join:
q)students uj mentors id class mentor_name --------------------------- 140 green "" 265 blue "" 204 green "" 212 orange "" 367 green "" 197 blue "" 329 green "" 242 green "" green "Julia Johnson" blue "Teddy Rowles" violet "Gerald Carlier"
For the union join either neither table will be keyed, or the tables will have the same keys.
See Reference/joins for more information on these and other joins.
1. Create a table
cities with columns
state, keyed to
2. Create an unkeyed table
sales with columns
As an alternative to typing in the raw content of
sales, see if you can construct it using
rand and the
city column of
3. Join the tables to show the state for each sale. Your result will be an unkeyed table with columns
4. Report the total sales, by state. Your result will be a table with columns
sale, keyed to
state, with a row for each unique state in the result from (3).