How to work with tables
This page introduces tables in q and demonstrates how to create and manipulate them efficiently.
Tables are first-class objects in q, meaning they are data structures that live in memory just like lists or dictionaries.
A table can be viewed from two perspectives:
- A list of rows/records: where each record is a dictionary with identical keys.
- A list of columns: where each column is a named list of equal length.
Internally, q implements a table as a flipped dictionary. Because operations on columns are vectorized, this structure allows for extreme performance when performing calculations across millions of records. A downside of this design is that row deletion is relatively expensive, as all the column lists must be compacted to close the resulting gap.
The column elements are called items or fields.
Creating a table
There are three primary ways to initialize a table in memory.
1. From a list of dictionaries
A list of conforming dictionaries (sharing the same keys) is automatically interpreted as a table.
q
q)d1: ([name: `Alice; phone: "555-0101"; age: 23])
q)d2: ([name: `Bob; phone: "555-0723"; age: 32])
q)d3: ([name: `Mike; phone: "555-6666"; age: 22])
q)(d1; d2; d3)
name phone age
--------------------
Alice "555-0101" 23
Bob "555-0723" 32
Mike "555-6666" 22
pandas
>>> d1 = {'name': 'Alice', 'phone': '555-0101', 'age': 23}
>>> d2 = {'name': 'Bob', 'phone': '555-0723', 'age': 32}
>>> d3 = {'name': 'Mike', 'phone': '555-6666', 'age': 22}
>>> pd.DataFrame([d1, d2, d3])
name phone age
0 Alice 555-0101 23
1 Bob 555-0723 32
2 Mike 555-6666 22
2. By flipping a column dictionary
Since a table is a flipped dictionary, you can create one by applying flip to a dictionary of lists of the same size:
q
q)flip ([name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22])
name phone age
--------------------
Alice "555-0101" 23
Bob "555-0723" 32
Mike "555-6666" 22
pandas
>>> pd.DataFrame({
... 'name': ['Alice', 'Bob', 'Mike'],
... 'phone': ['555-0101', '555-0723', '555-6666'],
... 'age': [23, 32, 22]
... })
name phone age
0 Alice 555-0101 23
1 Bob 555-0723 32
2 Mike 555-6666 22
3. Table definition syntax (Recommended)
This is the most concise and idiomatic way to define a table. The [] brackets denote an unkeyed (simple) table.
q
q)([] name: `Alice`Bob`Mike; age: 23 32 22; city:`Dublin)
name age city
----------------
Alice 23 Dublin
Bob 32 Dublin
Mike 22 Dublin
pandas
>>> pd.DataFrame({
... 'name': ['Alice', 'Bob', 'Mike'],
... 'age': [23, 32, 22],
... 'city': 'Dublin'
... })
name age city
0 Alice 23 Dublin
1 Bob 32 Dublin
2 Mike 22 Dublin
Note
If a column is provided as an atom (like city:`Dublin above), q automatically extends it to match the length of the other columns.
Implicit column names
Table definition also makes it convenient to create a table from variables.
q)name: `Alice`Bob`Mike
q)phone: ("555-0101"; "555-0723"; "555-6666")
q)age: 23 32 22
q)([] name; phone; age)
name phone age
--------------------
Alice "555-0101" 23
Bob "555-0723" 32
Mike "555-6666" 22
Default column names are assigned when column names are omitted.
q)([] 1 2; `a`b)
x x1
----
1 a
2 b
Empty tables
Empty tables can be created from empty lists:
q)([] name: `$(); phone: (); age: `long$())
name phone age
--------------
It is recommended to specify the type of each column in an empty table. You can use the each-left iterator and type characters to avoid repeating empty list specifications for each column:
q
q)flip `name`phone`age!"s*j"$\:()
name phone age
--------------
pandas
>>> pd.DataFrame({
... 'name': pd.Series(dtype='str'),
... 'phone': pd.Series(dtype='object'),
... 'age': pd.Series(dtype='int64')
... })
Empty DataFrame
Columns: [name, phone, age]
Index: []
For wider tables, it is more convenient to define a schema dictionary:
q
q)schema: ([name:"s"; phone:"*"; age: "j"])
q)flip schema$\:()
name phone age
--------------
pandas
>>> schema = {'name': 'str', 'phone': 'object', 'age': 'int64'}
>>> pd.DataFrame({
... col: pd.Series(dtype=dt)
... for col, dt in schema.items()
... })
Empty DataFrame
Columns: [name, phone, age]
Index: []
Table metadata
Understanding the structure of your data is critical for debugging and optimization.
- cols: Returns the column names as a list of symbols.
- meta: Returns a keyed table describing column types, foreign keys, and attributes.
- tables: Lists all tables defined in the namespace provided.
q
q)meta t
c | t f a
-----| -----
name | s
phone| C
age | j
pandas
>>> t.dtypes
name object
phone object
age int64
dtype: object
| Column | Description |
|---|---|
| c | column names |
| t | data type character (lowercase for atoms, uppercase for nested lists) |
| f | foreign key link (if any) |
| a | applied attributes (e.g., s for sorted, u for unique) |
You can rename and reorder columns using xcol and xcols, respectively:
q
q)([name: `nickname]) xcol t
nickname phone age
-----------------------
Alice "555-0101" 23
Bob "555-0723" 32
Mike "555-6666" 22
q)`name`age xcols t
name age phone
--------------------
Alice 23 "555-0101"
Bob 32 "555-0723"
Mike 22 "555-6666"
pandas
>>> t.rename(columns={'name': 'nickname'})
nickname phone age
0 Alice 555-0101 23
1 Bob 555-0723 32
2 Mike 555-6666 22
>>> t[['name', 'age'] + [ c for c in t.columns if c not in ['name', 'age']]]
name age phone
0 Alice 23 555-0101
1 Bob 32 555-0723
2 Mike 22 555-6666
Appending Records
Functions insert and upsert append a single record or a batch of records to a table in-place. insert throws an error if the table does not exist.
q
q)t,:(`John; "555-5382"; 40) / append a single row
q)`t insert (`George`David; ("555-4000"; "555-4001"); 18 19) / in-place batch append
4 5
q)`t upsert ((`Phil;"555-2341"; 44); (`Ryan;"555-6267";55))
`t
q)t
name phone age
---------------------
Alice "555-0101" 23
Bob "555-0723" 32
Mike "555-6666" 22
George "555-4000" 18
David "555-4001" 19
Phil "555-2341" 44
Ryan "555-6267" 55
pandas
>>> t = pd.concat([t, pd.DataFrame(dict(zip(t.columns, [['George', 'David'], ['555-4000', '555-4001'], [18, 19]])))], ignore_index=True)
>>> t = pd.concat([t, pd.DataFrame([['Phil', '555-2341', 44], ['Ryan', '555-6267', 55]], columns=t.columns)], ignore_index=True)
>>> t
name phone age
0 Alice 555-0101 23
1 Bob 555-0723 32
2 Mike 555-6666 22
4 George 555-4000 18
5 David 555-4001 19
6 Phil 555-2341 44
7 Ryan 555-6267 55
Because tables are lists, you can also append data using list concatenation (,) — the simplest form of a join operation — combined with variable assignment:
q
q)t,:(`John; "555-5382"; 40) / append a single row
pandas
>>> t = pd.concat([t, pd.DataFrame([['John', '555-5382', 40]], columns=t.columns)], ignore_index=True)
List operations
Since tables are lists, most list operations apply to tables. For a full reference, see list operations; only the most frequently used operations are listed here.
Number of rows: count
The count function returns the number of rows in a table:
q
q)t:([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
q)count t
3
pandas
>>> t = pd.DataFrame({'name': ['Alice', 'Bob', 'Mike'], 'phone': ['555-0101', '555-0723', '555-6666'], 'age': [23, 32, 22]})
>>> len(t)
3
Indexing
Tables support both positional and label-based indexing:
q
q)t:([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
q)t 2 / get the third record
name | `Mike
phone| "555-6666"
age | 22
q)t `name / get the 'name' column
`Alice`Bob`Mike
q)t[2; `name] / get the 'name' value of the third row
`Mike
q)t . (2;`name) / index at depth by a list
`Mike
q)t 2 0 / index by list
name phone age
--------------------
Mike "555-6666" 22
Alice "555-0101" 23
q)first t
name | `Alice
phone| "555-0101"
age | 23
pandas
>>> t = pd.DataFrame({'name': ['Alice', 'Bob', 'Mike'], 'phone': ['555-0101', '555-0723', '555-6666'], 'age': [23, 32, 22]})
>>> t.iloc[2]
name Mike
phone 555-6666
age 22
Name: 2, dtype: object
>>> t['name']
0 Alice
1 Bob
2 Mike
Name: name, dtype: object
>>> t.at[2, 'name']
'Mike'
>>> t.iloc[[2, 0]]
name phone age
2 Mike 555-6666 22
0 Alice 555-0101 23
>>> t.iloc[0]
name Alice
phone 555-0101
age 23
Name: 0, dtype: object
Indexed Assignment
You can overwrite any item by indexing:
q
q)t:([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
q)t[1]: (`John; "555-5382"; 40)
q)t
name phone age
--------------------
Alice "555-0101" 23
John "555-5382" 40
Mike "555-6666" 22
pandas
>>> t = pd.DataFrame({'name': ['Alice', 'Bob', 'Mike'], 'phone': ['555-0101', '555-0723', '555-6666'], 'age': [23, 32, 22]})
>>> t.iloc[1] = ['John', '555-5382', 40]
>>> t
name phone age
0 Alice 555-0101 23
1 John 555-5382 40
2 Mike 555-6666 22
or by using general amend at:
q)@[t;2;:;(`Phil;"555-2341"; 44)]
name phone age
--------------------
Alice "555-0101" 23
John "555-5382" 40
Phil "555-2341" 44
Typically, an update statement is used to modify (or get a new copy of) a table. Combining an update with a where clause provides a convenient and readable solution.
Sub-tables
You can extract sub-tables using the sublist function and the take operator. The inverse operator is _:
q
q)t:([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
q)2 sublist t / sub-table
name phone age
--------------------
Alice "555-0101" 23
Bob "555-0723" 32
q)-2 sublist t / last two records
name phone age
-------------------
Bob "555-0723" 32
Mike "555-6666" 22
q)2 _ t / delete first two records
name phone age
-------------------
Mike "555-6666" 22
q)`name`age cut t / delete columns
phone
----------
"555-0101"
"555-0723"
"555-6666"
q)t _ 1 / delete records at index 1
name phone age
--------------------
Alice "555-0101" 23
Mike "555-6666" 22
pandas
>>> t = pd.DataFrame({'name': ['Alice', 'Bob', 'Mike'], 'phone': ['555-0101', '555-0723', '555-6666'], 'age': [23, 32, 22]})
>>> t.head(2)
name phone age
0 Alice 555-0101 23
1 Bob 555-0723 32
>>> t.tail(2)
name phone age
1 Bob 555-0723 32
2 Mike 555-6666 22
>>> t.iloc[2:]
name phone age
2 Mike 555-6666 22
>>> t.drop(columns=['name', 'age'])
phone
0 555-0101
1 555-0723
2 555-6666
>>> t.drop(index=1)
name phone age
0 Alice 555-0101 23
2 Mike 555-6666 22
Delete statements are another convenient way to delete rows from a table.
The take function is similar to sublist, but wraps around if the requested count exceeds the table length:
q)5#t
name phone age
--------------------
Alice "555-0101" 23
Bob "555-0723" 32
Mike "555-6666" 22
Alice "555-0101" 23
Bob "555-0723" 32
You can also take selected columns:
q
q)`name`age#t
name age
---------
Alice 23
Bob 32
Mike 22
pandas
>>> t[['name', 'age']]
name age
0 Alice 23
1 Bob 32
2 Mike 22
Arithmetic operations
All arithmetic operations apply to tables containing numeric columns. Operators are applied element-wise.
q
q)show t1: flip `a`b!3 cut til 6
a b
---
0 3
1 4
2 5
q)2*t1
a b
----
0 6
2 8
4 10
q)show t2: flip `a`b!3 cut 6?20
a b
-----
12 1
8 9
10 11
q)t1 + t2
a b
-----
12 4
9 13
12 16
q)neg t1
a b
-----
0 -3
-1 -4
-2 -5
pandas
>>> t1 = pd.DataFrame(np.arange(6).reshape(2, 3).T, columns=['a', 'b'])
>>> t1
a b
0 0 3
1 1 4
2 2 5
>>> 2 * t1
a b
0 0 6
1 2 8
2 4 10
>>> t2 = pd.DataFrame(np.random.randint(0, 20, size=(2, 3)).T, columns=['a', 'b'])
>>> t2
a b
0 12 1
1 8 9
2 10 11
>>> t1 + t2
a b
0 12 4
1 9 13
2 12 16
>>> -t1
a b
0 0 -3
1 -1 -4
2 -2 -5
Agregations are executed column-wise and result normal dictionaries:
q
q)avg t1
a| 1
b| 4
pandas
>>> t1.mean()
a 1.0
b 4.0
dtype: float64
Replace null values: Coalesce (^)
The coalesce operator ^ is related to , in that it employs upsert semantics to merge two tables with right prevailing over left. The difference is that null values in the right operand do not prevail over those in the left.
q
q)show t:([] a: 1 0N; b: 0N 4)
a b
---
1
4
q)0^t
a b
---
1 0
0 4
q)([] a: 10 2; b: 3 10) ^ t
a b
---
1 3
2 4
pandas
>>> t = pd.DataFrame({'a': [1, None], 'b': [None, 4]}).astype('Int64')
>>> t
a b
0 1 <NA>
1 <NA> 4
>>> t.fillna(0)
a b
0 1 0
1 0 4
>>> t.combine_first(pd.DataFrame({'a': [10, 2], 'b': [3, 10]}))
a b
0 1 3
1 2 4
Reordering
Tables are lists, and there are multiple ways to reorder a list:
q
q)t:([] name: `Alice`Bob`Mike; phone: ("555-0101"; "555-0723"; "555-6666"); age: 23 32 22)
q)reverse t
name phone age
--------------------
Mike "555-6666" 22
Bob "555-0723" 32
Alice "555-0101" 23
q)0N?t / random permutation
name phone age
--------------------
Bob "555-0723" 32
Alice "555-0101" 23
Mike "555-6666" 22
pandas
>>> t = pd.DataFrame({'name': ['Alice', 'Bob', 'Mike'], 'phone': ['555-0101', '555-0723', '555-6666'], 'age': [23, 32, 22]})
>>> t.iloc[::-1].reset_index(drop=True)
name phone age
0 Mike 555-6666 22
1 Bob 555-0723 32
2 Alice 555-0101 23
>>> t.sample(frac=1).reset_index(drop=True)
name phone age
0 Bob 555-0723 32
1 Alice 555-0101 23
2 Mike 555-6666 22
Functions asc and desc sort lexicographically by all columns:
q
q)asc ([]a:4 3 4; b:`s`a`d)
a b
---
3 a
4 d
4 s
pandas
>>> pd.DataFrame({'a': [4, 3, 4], 'b': ['s', 'a', 'd']}).sort_values(by=['a', 'b'])
a b
1 3 a
2 4 d
0 4 s
To sort by selected columns, use xasc and xdesc:
q
q)`a xasc ([]a:4 3 4; b:`s`a`d)
a b
---
3 a
4 s
4 d
pandas
>>> pd.DataFrame({'a': [4, 3, 4], 'b': ['s', 'a', 'd']}).sort_values(by='a', kind='stable')
a b
1 3 a
0 4 s
2 4 d
The sort in q is stable: it preserves the relative order of equal elements.
Miscellaneous
Grouping by a field or multiple fields is a basic operation in data analysis:
q
q)t:([] name: `Alice`Bob`Mike; kids: (1#`Bill; (); `Maggie`Jack`Ellie))
q)show nt: ungroup t / convert to normal table, "flatten" list fields
name kids
------------
Alice Bill
Mike Maggie
Mike Jack
Mike Ellie
q)`name xgroup nt / group by name
name | kids
-----| ------------------
Alice| ,`Bill
Mike | `Maggie`Jack`Ellie
pandas
>>> t = pd.DataFrame({'name': ['Alice', 'Bob', 'Mike'], 'kids': [['Bill'], [], ['Maggie', 'Jack', 'Ellie']]})
>>> nt = t.explode('kids').dropna(subset=['kids'])
>>> nt
name kids
0 Alice Bill
2 Mike Maggie
2 Mike Jack
2 Mike Ellie
>>> nt.groupby('name').agg(list)
kids
name
Alice [Bill]
Mike [Maggie, Jack, Ellie]
Set operations like inter, union, except and distinct handle tables as expected - the set items are table records. Furthermore, there are multiple ways to join tables.
Querying
q supports an SQL-like syntax for querying tables. See section Guides → Querying for examples and more details.
Next Steps
- For a deeper dive into tables, check out Q for Mortals §8. Tables.
- Learn how to create keyed tables from two tables.
- To load a CSV as a table, refer to Load CSV.
- Move your in-memory data to the filesystem - see Database: persisting tables in the filesystem.
- Improve the performance of list operations by applying attributes.