Database queries
pykx.query
This page documents query interfaces for querying q tables using PyKX.
QSQL
QSQL(q)
The QSQL
class provides methods to query or modify q tables.
The methods select, exec, update and delete generate and execute functional queries on the given table. To learn about functionally querying databases see Chapter 9 Section 12 of Q for Mortals.
There are a number of advantages to using this query style as opposed to interpolating strings to generate simple qSQL queries:
- Users that are unfamiliar with q who use the interface are introduced to this more powerful version of querying with q, while still operating within a familiar setting in Python.
-
Using the functional form promotes data-oriented designs for modifying or querying the q tables programmatically using data derived from Python:
qtable = pykx.q('([]1 2 3;4 5 6)') pykx.q.qsql.update(qtable, {'x': [10, 20, 30]})
-
Development and maintenance of this interface is easier with regard to the different supported table formats.
select
select(table, columns=None, where=None, by=None, inplace=False)
Execute a q functional select statement on tables defined within the process.
This implementation follows the q functional select syntax with limited support on structures used in the parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table |
Union[k.Table, str]
|
The q table or name of the table to query. The table must be named within the q memory space. |
required |
columns |
Optional[Union[Dict[str, str], k.Dictionary]]
|
A dictionary where the keys are names assigned for the query's output columns and the values are the logic used to compute the column's result. |
None
|
where |
Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]
|
Filtering logic for reducing the data used in group-bys and output column aggregations. |
None
|
by |
Optional[Union[Dict[str, str], k.Dictionary]]
|
A dictionary where they keys are names assigned for the produced columns and the values are aggregation rules used to construct the group-by parameter. |
None
|
inplace |
bool
|
Indicates if the result of an update is to be persisted. This applies to tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects. |
False
|
Returns:
Type | Description |
---|---|
k.K
|
A PyKX Table or KeyedTable object resulting from the executed select query |
Examples:
Define a q table in python, and give it a name in q memory
qtab = pykx.q('([]col1:100?`a`b`c;col2:100?1f;col3:100?0b;col4:100?10f)')
pykx.q['qtab'] = qtab
Select all items in the table
pykx.q.qsql.select(qtab)
pykx.q.qsql.select('qtab')
Filter table based on various where conditions
pykx.q.qsql.select(qtab, where='col2<0.5')
pykx.q.qsql.select(qtab, where=['col1=`a', 'col2<0.3'])
Retrieve statistics by grouping data on symbol columns
pykx.q.qsql.select(qtab, columns={'maxCol2': 'max col2'}, by={'col1': 'col1'})
pykx.q.qsql.select(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'})
Retrieve grouped statistics with restrictive where condition
pykx.q.qsql.select(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'}, where='col3=0b')
exec
exec(table, columns=None, where=None, by=None)
Execute a q functional exec statement on tables defined within the process.
This implementation follows the q functional exec syntax with limited support on structures used for the parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table |
Union[k.Table, str]
|
The q table or name of the table to query. The table must be named within the q memory space. |
required |
columns |
Optional[Union[Dict[str, str], k.Dictionary]]
|
A dictionary where the keys are names assigned to the query's output columns and the values are the logic used to compute the column's result. |
None
|
where |
Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]
|
Filtering logic for reducing the data used in group-by and output column aggregations. |
None
|
by |
Optional[Union[Dict[str, str], k.Dictionary]]
|
A dictionary where they keys are names assigned to the produced columns and the values are aggregation rules used when q functionally applies group-by. |
None
|
Returns:
Type | Description |
---|---|
k.K
|
A PyKX Vector or Dictionary object resulting from the executed exec query |
Examples:
Define a q table in python and named in q memory
qtab = pykx.Table(data={
'col1': pykx.random.random(100, ['a', 'b', 'c']),
'col2': pykx.random.random(100, 100),
'col3': pykx.random.random(100, [0, 1]),
'col4': pykx.random.random(100, 100.0)
})
pykx.q['qtab'] = qtab
Select last item of the table
pykx.q.qsql.exec(qtab)
pykx.q.qsql.exec('qtab')
Retrieve a column from the table as a list
pykx.q.qsql.exec(qtab, 'col3')
Retrieve a set of columns from a table as a dictionary
pykx.q.qsql.exec(qtab, {'symcol': 'col1'})
pykx.q.qsql.exec(qtab, {'symcol': 'col1', 'boolcol': 'col3'})
Filter columns from a table based on various where conditions
pykx.q.qsql.exec(qtab, 'col3', where='col1=`a')
pykx.q.qsql.exec(qtab, {'symcol': 'col1', 'maxcol4': 'max col4'}, where=['col1=`a', 'col2<0.3'])
Retrieve data grouping by data on symbol columns
pykx.q.qsql.exec(qtab, 'col2', by={'col1': 'col1'})
pykx.q.qsql.exec(qtab, columns={'maxCol2': 'max col2'}, by={'col1': 'col1'})
pykx.q.qsql.exec(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'})
Retrieve grouped statistics with restrictive where condition
pykx.q.qsql.exec(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'}, where='col3=0b')
update
update(table, columns=None, where=None, by=None, inplace=False)
Execute a q style update statement on tables defined within the process.
This implementation follows the q functional update syntax with limited support on structures used for the parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table |
Union[k.Table, str]
|
The q table or name of the table to update. The table must be named within the q memory space. |
required |
columns |
Optional[Union[Dict[str, str], k.Dictionary]]
|
A dictionary where the keys are names assigned to the query's output columns and the values are the logic used to compute the column's result. |
None
|
where |
Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]
|
Filtering logic for reducing the data used in group-bys and output column aggregations. |
None
|
by |
Optional[Union[Dict[str, str], k.Dictionary]]
|
A dictionary where they keys are names assigned to the result columns and the values are aggregation rules used to compute the group-by result. |
None
|
inplace |
bool
|
Whether the result of an update is to be persisted. This operates for tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects. |
False
|
Returns:
Type | Description |
---|---|
k.K
|
The updated PyKX Table or KeyedTable object resulting from the executed update query |
Examples:
Define a q table in python and named in q memory
qtab = pykx.Table(
[
['tom', 28, 'fair', 'green'],
['dick', 29, 'dark', 'brown'],
['harry', 35, 'fair', 'gray']
],
columns=['name', 'age', 'hair', 'eye']
)
pykx.q['qtab'] = qtab
Update all the contents of a column
pykx.q.qsql.update(qtab, {'eye': '`blue`brown`green'})
pykx.q.qsql.update(qtab, {'age': [25, 30, 31]})
Update the content of a column restricting scope using a where clause
pykx.q.qsql.update(qtab, {'eye': ['blue']}, where='hair=`fair')
Define a q table suitable for by clause example
byqtab = pykx.q('([]p:100?`a`b`c;name:100?`nut`bolt`screw;color:100?`red`green`blue;weight:0.5*100?20;city:100?`london`paris`rome)')
pykx.q['byqtab'] = byqtab
Apply an update grouping based on a by phrase
pykx.q.qsql.update(byqtab, {'weight': 'avg weight'}, by={'city': 'city'})
Apply an update grouping based on a by phrase and persist the result using the inplace keyword
python
pykx.q.qsql.update('byqtab', columns={'weight': 'avg weight'}, by={'city': 'city'}, inplace=True)
pykx.q['byqtab']
delete
delete(table, columns=None, where=None, inplace=False)
Execute a q functional delete statement on tables defined within the process.
This implementation follows the q functional delete syntax with limited support on structures used for the parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table |
Union[k.Table, str]
|
The q table or name of the table (provided the table is named within the q memory space) on which the delete statement is to be applied. |
required |
columns |
Optional[Union[List[str], k.SymbolVector]]
|
Denotes the columns to be deleted from a table. |
None
|
where |
Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]
|
Conditional filtering used to select subsets of the data which are to be deleted from the table. |
None
|
inplace |
bool
|
Whether the result of an update is to be persisted. This operates for tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects. |
False
|
Returns:
Type | Description |
---|---|
k.K
|
The updated PyKX Table or KeyedTable object resulting from the executed delete query |
Examples:
Define a q table in python and named in q memory
qtab = pykx.q('([]name:`tom`dick`harry;age:28 29 35;hair:`fair`dark`fair;eye:`green`brown`gray)')
pykx.q['qtab'] = qtab
Delete all the contents of the table
pykx.q.qsql.delete(qtab)
pykx.q.qsql.delete('qtab')
Delete single and multiple columns from the table
pykx.q.qsql.delete(qtab, 'age')
pykx.q.qsql.delete('qtab', ['age', 'eye'])
Delete rows of the dataset based on where condition
pykx.q.qsql.delete(qtab, where='hair=`fair')
pykx.q.qsql.delete('qtab', where=['hair=`fair', 'age=28'])
Delete a column from the dataset named in q memory and persist the result using the inplace keyword
pykx.q.qsql.delete('qtab', 'age', inplace=True)
pykx.q['qtab']
SQL
SQL(q)
Wrapper around the KX Insights Core ANSI SQL interface.
Examples within this interface use a table named trades, an example of this table is
>>> trades = kx.Table(data={
'sym': kx.random.random(100, ['AAPL', 'GOOG', 'MSFT']),
'date': kx.random.random(100, kx.q('2022.01.01') + [0,1,2]),
'price': kx.random.random(100, 1000.0)
})
>>> kx.q['trades'] = trades
__call__
__call__(query, *args)
Compile and run a SQL statement using string interpolation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query |
str
|
The query to execute formatted in KX Insights SQL style |
required |
*args |
Any
|
The arguments for the query, which will be interpolated into the string. Each argument will be converted into a pykx.K object. |
()
|
Returns:
Type | Description |
---|---|
k.Table
|
The result of the evaluation of |
Avoid interpolating the table name into the query when using this with IPC.
Use the full name of the table in the string.
When using this class on the embedded q process it is common to interpolate a
pykx.Table
object into a query using '$1'
. When the Q
object used in the initialization of this class is an IPC connection
the entire table will be sent in the message over the connection. If the table is large
this will significantly impact performance.
Examples:
Query a table by name:
>>> q.sql('select * from trades')
pykx.Table(pykx.q('
sym date price
------------------------
AAPL 2022.01.02 484.4727
AAPL 2022.01.02 682.7999
MSFT 2022.01.01 153.227
MSFT 2022.01.03 535.0923
..
'))
Query a [pykx.Table
][] instance by injecting it as the first argument using $n
syntax:
>>> q.sql('select * from $1', trades) # where `trades` is a `pykx.Table` object
pykx.Table(pykx.q('
sym date price
------------------------
AAPL 2022.01.02 484.4727
AAPL 2022.01.02 682.7999
MSFT 2022.01.01 153.227
MSFT 2022.01.03 535.0923
..
'))
Query a table using multiple injected arguments:
>>> q.sql('select * from trades where date = $1 and price < $2', date(2022, 1, 2), 500.0)
pykx.Table(pykx.q('
sym date price
------------------------
AAPL 2022.01.02 484.4727
MSFT 2022.01.02 457.328
GOOG 2022.01.02 8.062521
MSFT 2022.01.02 338.0097
..
'))
prepare
prepare(query, *args)
Prepare a parametrized query to be executed later.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query |
str
|
The query to parameterize in KX Insights SQL format. |
required |
*args |
Any
|
The arguments for |
()
|
Returns:
Type | Description |
---|---|
k.List
|
The parametrized query, which can later be used with |
Examples:
Preparing a query does not require fully constructed K Atom and Vector types.
Both the value kx.LongAtom(1)
and the wrapper pykx.LongAtom
are valid. To determine table type use pykx.Table.prototype
.
Prepare a query for later execution that will expect a table with 3 columns a, b, and c with ktypes pykx.SymbolVector, pykx.FloatVector, and pykx.LongVector respectively.
>>> p = q.sql.prepare('select * from $1', kx.q('([] a:``; b: 0n 0n; c: 0N 0N)'))
You can also use the pykx.Table.prototype
helper function to build a table to
pass into a prepared SQL query.
>>> p = q.sql.prepare('select * from $1', kx.Table.prototype({
'a': kx.SymbolVector,
'b': kx.FloatVector,
'c': kx.LongVector})
)
Prepare a query for later execution that will take a date and a float as input at execution time to query the trades table.
>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
date(1, 1, 2),
500.0
)
You can also directly pass in the pykx.K types you wish to use instead.
>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
kx.DateAtom,
kx.FloatAtom
)
execute
execute(query, *args)
Execute a prepared query. Parameter types must match the types of the arguments
used when executing the sql.prepare
function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query |
k.List
|
A prepared SQL statement returned by a call to |
required |
*args |
Any
|
The arguments for the query, which will be interpolated into the query. Each argument will be converted into a pykx.K object. |
()
|
Returns:
Type | Description |
---|---|
k.K
|
The result of the evaluation of |
Avoid interpolating the table name into the query when using this with IPC.
Use the full name of the table in the string.
When using this class on the embedded q process it is common to interpolate a
pykx.Table
object into a query using '$1'
. When the Q
object used in the initialization of this class is an IPC connection
the entire table will be sent in the message over the connection. If the table is large
this will significantly impact performance.
Examples:
Execute a prepared query passing in a pykx.Table with 3 columns a, b, and c with ktypes pykx.SymbolVector, pykx.FloatVector, and pykx.LongVector respectively.
>>> p = q.sql.prepare('select * from $1', kx.q('([] a:``; b: 0n 0n; c: 0N 0N)'))
>>> q.sql.execute(p, kx.q('([] a:`a`b`c`d; b: 1.0 2.0 3.0 4.0; c: 1 2 3 4)'))
pykx.Table(pykx.q('
a b c
-----
a 1 1
b 2 2
c 3 3
d 4 4
'))
Execute a prepared query that takes a date and a float as input to query the trades table.
>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
date(1, 1, 2),
500.0
)
>>> q.sql.execute(p, date(2022, 1, 2), 500.0)
pykx.Table(pykx.q('
sym date price
------------------------
AAPL 2022.01.02 484.4727
MSFT 2022.01.02 457.328
GOOG 2022.01.02 8.062521
MSFT 2022.01.02 338.0097
..
'))
get_input_types
get_input_types(prepared_query)
Get the pykx.K types that are expected to be used with a prepared query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prepared_query |
k.List
|
A prepared SQL statement returned by a call to |
required |
Returns:
Type | Description |
---|---|
List[str]
|
A Python list object containing the string representations of the expected K types for use with the prepared statement. |
Examples:
>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
date(1, 1, 1),
0.0
)
>>> q.sql.get_input_types(p)
['DateAtom/DateVector', 'FloatAtom/FloatVector']
TableAppend
TableAppend(_q)
Helper class for the q insert and upsert functions
Insert
Insert(_q)
Bases: TableAppend
Helper class for the q insert function
__call__
__call__(table, row, match_schema=False, test_insert=False)
Helper function around q's insert
function which inserts a row or multiple rows into
a q table object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table |
Union[str, k.SymbolAtom]
|
The name of the table for the insert operation. |
required |
row |
Union[List, k.List]
|
A list of objects to be inserted as a row, or a list of lists of objects to insert multiple rows at once. |
required |
match_schema |
bool
|
Whether the row/rows to be inserted must match the table's current schema. |
False
|
test_insert |
bool
|
Causes the function to modify a small local copy of the table and return the modified example, this can only be used with embedded q and will not modify the source table's contents. |
False
|
Returns:
Type | Description |
---|---|
Union[None, k.Table]
|
When |
Raises:
Type | Description |
---|---|
PyKXException
|
If the |
Examples:
Insert a single row onto a table named tab
ensuring that the row matches the
table's schema. This will raise an error if the row does not match.
>>> q.insert('tab', [1, 2.0, datetime.datetime(2020, 2, 24)], match_schema=True)
Insert multiple rows onto a table named tab
ensuring that each of the rows being
added match the table's schema.
>>> q.insert(
'tab',
[[1, 2], [2.0, 4.0], [datetime.datetime(2020, 2, 24), datetime.datetime(2020,3 , 19)]],
match_schema=True
)
Run a test insert to modify a local copy of the table to test what the table would look like after inserting the new rows.
>>> kx.q['tab'] = kx.Table([[1, 1.0, 'a'], [2, 2.0, 'b'], [3, 3.0, 'c']], columns=['a', 'b', 'c'])
>>> kx.q.insert('tab', [4, 4.0, 'd'], test_insert=True) # example of table after insert
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
4 4 d
'))
>>> kx.q('tab') # table object was not modified
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
'))
Upsert
Upsert(_q)
Bases: TableAppend
Helper class for the q upsert function
__call__
__call__(table, row, match_schema=False, test_insert=False)
Helper function around q's upsert
function which inserts a row or multiple rows into
a q table object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table |
Union[str, k.SymbolAtom, k.Table]
|
A |
required |
row |
Union[List, k.List]
|
A list of objects to be appended as a row, if the table is within embedded q you may also pass in a table object to be upserted. |
required |
match_schema |
bool
|
Whether the row/rows to be appended must match the table's current schema. |
False
|
test_insert |
bool
|
Causes the function to modify a small local copy of the table and return the modified example, this can only be used with embedded q and will not modify the source table's contents. |
False
|
Returns:
Type | Description |
---|---|
Union[None, k.Table]
|
When |
Raises:
Type | Description |
---|---|
PyKXException
|
If the |
Examples:
Upsert a single row onto a table named tab
ensuring that the row matches the
table's schema. This will raise an error if the row does not match.
>>> q.upsert('tab', [1, 2.0, datetime.datetime(2020, 2, 24)], match_schema=True)
>>> table = q.upsert(table, [1, 2.0, datetime.datetime(2020, 2, 24)], match_schema=True)
Upsert multiple rows onto a table named tab
ensuring that each of the rows being
added match the table's schema.
>>> q.upsert(
'tab',
q('([] a: 1 2; b: 1.0 2.0; c: `b`c)'),
match_schema=True
)
>>> table = q.upsert(
table,
[[1, 2], [2.0, 4.0], [datetime.datetime(2020, 2, 24), datetime.datetime(2020,3 , 19)]],
match_schema=True
)
Run a test upsert to modify a local copy of the table to test what the table would look like after appending the new rows.
>>> kx.q['tab'] = kx.Table([[1, 1.0, 'a'], [2, 2.0, 'b'], [3, 3.0, 'c']], columns=['a', 'b', 'c'])
>>> kx.q.upsert('tab', [4, 4.0, 'd'], test_insert=True) # example of table after insert
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
4 4 d
'))
>>> kx.q('tab') # table object was not modified
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
'))