Querying

`pykx.query`

Query interfaces for PyKX.

`QSQL`

QSQL(q)

Generates and submits functional q SQL queries.

Instances of this class can be accessed as the qsql attribute of any pykx.Q. For instance, pykx.q.qsql, or pykx.QConnection(...).qsql.

The QSQL class provides Python users with a method of querying q simple, keyed, splayed and partitioned tables using a single set of functionality.

This is achieved by wrapping the logic contained within the q functional select, exec, update, and delete functionality. For more information on this functionality please refer to Chapter 9 Section 12 of Q for Mortals.

While it is also conceivable that the interface could compile a qSQL statement to achieve the same end goal there are a number of advantages to using the more complex functional form.

Users that are unfamiliar with q who use the interface are introduced to the more powerful version of querying with q, while still operating within a familiar setting.
Using the functional form provides the ability when running functional updates to update the q tables with data derived from Python:
```
qtable = pykx.q('([]1 2 3;4 5 6)')
pykx.q.qsql.update(qtable, {'x': [10, 20, 30]})
```
It makes development and maintenance of the interface easier when dealing across the forms of supported table within q within which the functional forms of interacting with tables are more natural.

select

select(table, columns=None, where=None, by=None, inplace=False)

Apply a q style select statement on tables defined within the process.

This implementation follows the q functional select syntax with limitations on structures supported for the various clauses a result of this.

Parameters:

Name	Type	Description	Default
`table`	`Union[k.Table, str]`	The q table or name of the table (provided the table is named within the q memory space) on which the select statement is to be applied.	required
`columns`	`Optional[Union[Dict[str, str], k.Dictionary]]`	A dictionary mapping the name to be given to a column and the logic to be applied in aggregation to that column both as strings.	`None`
`where`	`Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]`	Conditional filtering used to select subsets of the data on which by-clauses and appropriate aggregations are to be applied.	`None`
`by`	`Optional[Union[Dict[str, str], k.Dictionary]]`	A dictionary mapping the names to be assigned to the produced columns and the columns whose results are used to construct the groups of the by clause.	`None`
`inplace`	`bool`	Whether the result of an update is to be persisted. This operates for tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects.	`False`

Examples:

Define a q table in python, and give it a name in q memory

qtab = pykx.q('([]col1:100?`a`b`c;col2:100?1f;col3:100?0b;col4:100?10f)')
pykx.q['qtab'] = qtab

Select all items in the table

pykx.q.qsql.select(qtab)
pykx.q.qsql.select('qtab')

Filter table based on various where conditions

pykx.q.qsql.select(qtab, where='col2<0.5')
pykx.q.qsql.select(qtab, where=['col1=`a', 'col2<0.3'])

Retrieve statistics by grouping data on symbol columns

pykx.q.qsql.select(qtab, columns={'maxCol2': 'max col2'}, by={'col1': 'col1'})
pykx.q.qsql.select(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'})

Retrieve grouped statistics with restrictive where condition

pykx.q.qsql.select(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'}, where='col3=0b')

exec

exec(table, columns=None, where=None, by=None)

Apply a q style exec statement on tables defined within the process.

This implementation follows the q functional exec syntax with limitations on structures supported for the various clauses a result of this.

Parameters:

Name	Type	Description	Default
`table`	`Union[k.Table, str]`	The q table or name of the table (provided the table is named within the q memory space) on which the exec statement is to be applied.	required
`columns`	`Optional[Union[Dict[str, str], k.Dictionary]]`	A dictionary mapping the name to be given to a column and the logic to be applied in aggregation to that column both as strings. A string defining a single column to be retrieved from the table as a list.	`None`
`where`	`Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]`	Conditional filtering used to select subsets of the data on which by clauses and appropriate aggregations are to be applied.	`None`
`by`	`Optional[Union[Dict[str, str], k.Dictionary]]`	A dictionary mapping the names to be assigned to the produced columns and the the columns whose results are used to construct the groups of the by clause.	`None`

Examples:

Define a q table in python and named in q memory

pykx.q['qtab'] = pd.DataFrame.from_dict({
    'col1': [['a', 'b', 'c'][randint(0, 2)] for _ in range(100)],
    'col2': [random() for _ in range(100)],
    'col3': [randint(0, 1) == 1 for _ in range(100)],
    'col4': [random() * 10 for _ in range(100)]
})

Select last item of the table

pykx.q.qsql.exec(qtab)
pykx.q.qsql.exec('qtab')

Retrieve a column from the table as a list

pykx.q.qsql.exec(qtab, 'col3')

Retrieve a set of columns from a table as a dictionary

pykx.q.qsql.exec(qtab, {'symcol': 'col1'})
pykx.q.qsql.exec(qtab, {'symcol': 'col1', 'boolcol': 'col3'})

Filter columns from a table based on various where conditions

pykx.q.qsql.exec(qtab, 'col3', where='col1=`a')
pykx.q.qsql.exec(qtab, {'symcol': 'col1', 'maxcol4': 'max col4'}, where=['col1=`a', 'col2<0.3'])

Retrieve data grouping by data on symbol columns

pykx.q.qsql.exec(qtab, 'col2', by={'col1': 'col1'})
pykx.q.qsql.exec(qtab, columns={'maxCol2': 'max col2'}, by={'col1': 'col1'})
pykx.q.qsql.exec(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'})

Retrieve grouped statistics with restrictive where condition

pykx.q.qsql.exec(qtab, columns={'avgCol2': 'avg col2', 'minCol4': 'min col4'}, by={'col1': 'col1'}, where='col3=0b')

update

update(table, columns=None, where=None, by=None, modify=False, inplace=False)

Apply a q style update statement on tables defined within the process.

This implementation follows the q functional update syntax with limitations on structures supported for the various clauses a result of this.

Parameters:

Name	Type	Description	Default
`table`	`Union[k.Table, str]`	The q table or name of the table (provided the table is named within the q memory space) on which the update statement is to be applied.	required
`columns`	`Optional[Union[Dict[str, str], k.Dictionary]]`	A dictionary mapping the name of a column present in the table or one to be added to the contents which are to be added to the column, this content can be a string denoting q data or the equivalent Python data.	`None`
`where`	`Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]`	Conditional filtering used to select subsets of the data on which by-clauses and appropriate aggregations are to be applied.	`None`
`by`	`Optional[Union[Dict[str, str], k.Dictionary]]`	A dictionary mapping the names to be assigned to the produced columns and the columns whose results are used to construct the groups of the by clause.	`None`
`modify`	`bool`	`Deprecated`, please use `inplace` instead. Whether the result of an update is to be saved. This operates for tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects.	`False`
`inplace`	`bool`	Whether the result of an update is to be persisted. This operates for tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects.	`False`

Examples:

Define a q table in python and named in q memory

qtab = pykx.Table(
    [
        ['tom', 28, 'fair', 'green'],
        ['dick', 29, 'dark', 'brown'],
        ['harry', 35, 'fair', 'gray']
    ],
    columns=['name', 'age', 'hair', 'eye']
)
pykx.q['qtab'] = qtab

Update all the contents of a column

pykx.q.qsql.update(qtab, {'eye': '`blue`brown`green'})
pykx.q.qsql.update(qtab, {'age': [25, 30, 31]})

Update the content of a column restricting scope using a where clause

pykx.q.qsql.update(qtab, {'eye': ['blue']}, where='hair=`fair')

Define a q table suitable for by clause example

byqtab = pykx.q('([]p:100?`a`b`c;name:100?`nut`bolt`screw;color:100?`red`green`blue;weight:0.5*100?20;city:100?`london`paris`rome)')
pykx.q['byqtab'] = byqtab

Apply an update grouping based on a by phrase

pykx.q.qsql.update(byqtab, {'weight': 'avg weight'}, by={'city': 'city'})

Apply an update grouping based on a by phrase and persist the result using the modify keyword

python pykx.q.qsql.update('byqtab', columns={'weight': 'avg weight'}, by={'city': 'city'}, inplace=True) pykx.q['byqtab']

delete

delete(table, columns=None, where=None, modify=False, inplace=False)

Apply a q style delete statement on tables defined within the process.

This implementation follows the q functional delete syntax with limitations on structures supported for the various clauses a result of this.

Parameters:

Name	Type	Description	Default
`table`	`Union[k.Table, str]`	The q table or name of the table (provided the table is named within the q memory space) on which the delete statement is to be applied.	required
`columns`	`Optional[Union[List[str], k.SymbolVector]]`	Denotes the columns to be deleted from a table.	`None`
`where`	`Optional[Union[List[str], str, k.SymbolAtom, k.SymbolVector]]`	Conditional filtering used to select subsets of the data which are to be deleted from the table.	`None`
`modify`	`bool`	`Deprecated`, please use `inplace` instead. Whether the result of a delete is to be saved. This holds when `table` is the name of a table in q memory, as outlined at: https://code.kx.com/q/basics/qsql/#result-and-side-effects.	`False`
`inplace`	`bool`	Whether the result of an update is to be persisted. This operates for tables referenced by name in q memory or general table objects https://code.kx.com/q/basics/qsql/#result-and-side-effects.	`False`

Examples:

Define a q table in python and named in q memory

qtab = pykx.q('([]name:`tom`dick`harry;age:28 29 35;hair:`fair`dark`fair;eye:`green`brown`gray)')
pykx.q['qtab'] = qtab

Delete all the contents of the table

pykx.q.qsql.delete(qtab)
pykx.q.qsql.delete('qtab')

Delete single and multiple columns from the table

pykx.q.qsql.delete(qtab, 'age')
pykx.q.qsql.delete('qtab', ['age', 'eye'])

Delete rows of the dataset based on where condition

pykx.q.qsql.delete(qtab, where='hair=`fair')
pykx.q.qsql.delete('qtab', where=['hair=`fair', 'age=28'])

Delete a column from the dataset named in q memory and persist the result using the modify keyword

pykx.q.qsql.delete('qtab', 'age', modify=True)
pykx.q['qtab']

`SQL`

SQL(q)

Wrapper around the KX Insights Core ANSI SQL interface.

Lots of examples within this interface use a table named trades, an example of this table is

>>> kx.q['trades'] = kx.toq(
    pd.DataFrame.from_dict({
        'sym': [['AAPL', 'GOOG', 'MSFT'][randint(0, 2)] for _ in range(100)],
        'date': [[date(2022, 1, 1), date(2022, 1, 2), date(2022, 1, 3)][randint(0, 2)] for _ in range(100)],
        'price': [random() * 1000 for _ in range(100)]
    })
)

call

__call__(query, *args)

Compile and run a SQL statement.

Parameters:

Name	Type	Description	Default
`query`	`str`	The SQL query, using KX Insights Core SQL, documented at https://code.kx.com/insights/core/sql.html	required
`*args`	`Any`	The arguments for the query, which will be interpolated into the query. Each argument will be converted into a `pykx.K` object.	`()`

Returns:

Type	Description
`k.Table`	The result of the evaluation of `query` with `args` interpolated.

Avoid interpolating the table into the query when running over IPC.

It's common to interpolate a pykx.Table object into the query as '$1'. This works well when running embedded within the process, but when the Q instance is an IPC connection this will result in the entire table being sent over the connection, which will negatively impact performance. Instead, when running over IPC, write the name of the table (as defined in the connected q server) directly into the query.

Examples:

Query a table by name:

>>> q.sql('select * from trades')
pykx.Table(pykx.q('
sym  date       price
------------------------
AAPL 2022.01.02 484.4727
AAPL 2022.01.02 682.7999
MSFT 2022.01.01 153.227
MSFT 2022.01.03 535.0923
..
'))

Query a pykx.Table instance by interpolating it in as the first argument:

>>> q.sql('select * from $1', trades) # where `trades` is a `pykx.Table` object
pykx.Table(pykx.q('
sym  date       price
------------------------
AAPL 2022.01.02 484.4727
AAPL 2022.01.02 682.7999
MSFT 2022.01.01 153.227
MSFT 2022.01.03 535.0923
..
'))

Query a table using interpolated conditions:

>>> q.sql('select * from trades where date = $1 and price < $2', date(2022, 1, 2), 500.0)
pykx.Table(pykx.q('
sym  date       price
------------------------
AAPL 2022.01.02 484.4727
MSFT 2022.01.02 457.328
GOOG 2022.01.02 8.062521
MSFT 2022.01.02 338.0097
..
'))

prepare

prepare(query, *args)

Prepare a parametrized query to be executed later, the parameter types are deduced from the types of the arguments used to prepare the statement.

Parameters:

Name	Type	Description	Default
`query`	`str`	The SQL query, using KX Insights Core SQL, documented at https://code.kx.com/insights/core/sql.html	required
`*args`	`Any`	The arguments for the query, these arguments are not used in the query. They are used to determine the types of the parameters that will later be used as parameters when executing the query.	`()`

Returns:

Type	Description
`k.List`	The parametrized query, which can later be used with `q.query.execute()`

Examples:

When preparing a query with K types you don't have to fully construct one.

For example you can pass kx.LongAtom(1) as a value to the prepare function as well as just pykx.LongAtom. This only works for Atom and Vector types. There is also a helper function for tables that you can use called pykx.Table.prototype.

Prepare a query for later execution that will expect a table with 3 columns a, b, and c with ktypes pykx.SymbolVector, pykx.FloatVector, and pykx.LongVector respectively.

>>> p = q.sql.prepare('select * from $1', kx.q('([] a:``; b: 0n 0n; c: 0N 0N)'))

You can also use the pykx.Table.prototype helper function to build a table to pass into a prepared SQL query.

>>> p = q.sql.prepare('select * from $1', kx.Table.prototype({
    'a': kx.SymbolVector,
    'b': kx.FloatVector,
    'c': kx.LongVector})
)

Prepare a query for later execution that will take a date and a float as input at execution time to query the trades table.

>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
    date(1, 1, 2),
    500.0
)

You can also directly pass in the pykx.K types you wish to use instead.

>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
    kx.DateAtom,
    kx.FloatAtom
)

execute

execute(query, *args)

Execute a prepared query the parameter types must match the types of the arguments used in the prepare statement.

Parameters:

Name	Type	Description	Default
`query`	`k.List`	A prepared SQL statement returned by a call to `q.sql.prepare`.	required
`*args`	`Any`	The arguments for the query, which will be interpolated into the query. Each argument will be converted into a `pykx.K` object.	`()`

Returns:

Type	Description
`k.K`	The result of the evaluation of `query` with `args` interpolated.

Avoid interpolating the table into the query when running over IPC.

It's common to interpolate a pykx.Table object into the query as '$1'. This works well when running embedded within the process, but when the Q instance is an IPC connection this will result in the entire table being sent over the connection, which will negatively impact performance. Instead, when running over IPC, write the name of the table (as defined in the connected q server) directly into the query.

Examples:

Execute a prepared query passing in a pykx.Table with 3 columns a, b, and c with ktypes pykx.SymbolVector, pykx.FloatVector, and pykx.LongVector respectively.

>>> p = q.sql.prepare('select * from $1', kx.q('([] a:``; b: 0n 0n; c: 0N 0N)'))
>>> q.sql.execute(p, kx.q('([] a:`a`b`c`d; b: 1.0 2.0 3.0 4.0; c: 1 2 3 4)'))
pykx.Table(pykx.q('
a b c
-----
a 1 1
b 2 2
c 3 3
d 4 4
'))

Execute a prepared query that takes a date and a float as input to query the trades table.

>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
    date(1, 1, 2),
    500.0
)
>>> q.sql.execute(p, date(2022, 1, 2), 500.0)
pykx.Table(pykx.q('
sym  date       price
------------------------
AAPL 2022.01.02 484.4727
MSFT 2022.01.02 457.328
GOOG 2022.01.02 8.062521
MSFT 2022.01.02 338.0097
..
'))

get_input_types

get_input_types(prepared_query)

Get the pykx.K types that are expected to be used with a prepared query.

Parameters:

Name	Type	Description	Default
`prepared_query`	`k.List`	A prepared SQL statement returned by a call to `q.sql.prepare`.	required

Returns:

Type	Description
`List[str]`	A Python list object containing the string representations of the expected K types for use with the prepared statement.

Examples:

>>> p = q.sql.prepare('select * from trades where date = $1 and price < $2',
    date(1, 1, 1),
    0.0
)
>>> q.sql.get_input_types(p)
['DateAtom/DateVector', 'FloatAtom/FloatVector']

`TableAppend`

TableAppend(_q)

Helper class for the q insert function

`Insert`

Insert(_q)

Bases: TableAppend

Helper class for the q insert and upsert functions

call

__call__(table, row, match_schema=False, test_insert=False)

Helper function around q's insert function which inserts a row or multiple rows into a q table object.

Parameters:

Name	Type	Description	Default
`table`	`Union[str, k.SymbolAtom]`	The name of the table to be inserted onto.	required
`row`	`Union[List, k.List]`	A list of objects to be inserted as a row, or a list of lists containing objects to insert multiple rows at once.	required
`match_schema`	`bool`	Whether the row/rows to be inserted must match the tables current schema.	`False`
`test_insert`	`bool`	Causes the function to modify a small local copy of the table and return the modified example, this can only be used with embedded q and will not modify the source tables contents.	`False`

Returns:

Type	Description
`Union[None, k.Table]`	A `k.LongVector` denoting the index of the rows that were inserted, unless the
`Union[None, k.Table]`	`test_insert` keyword argument is used in which case it returns the
`Union[None, k.Table]`	last 5 rows of the table with the new rows inserted onto the end, this does not modify
`Union[None, k.Table]`	the actual table object.

Raises:

Type	Description
`PyKXException`	If the `match_schema` parameter is used this function may raise an error if the row to be inserted does not match the tables schema. The error message will contain information about which columns did not match.

Examples:

Insert a single row onto a table named tab ensuring that the row matches the tables schema. Will raise an error if the row does not match

>>> q.insert('tab', [1, 2.0, datetime.datetime(2020, 2, 24)], match_schema=True)

Insert multiple rows onto a table named tab ensuring that each of the rows being added match the tables schema.

>>> q.insert(
    'tab',
    [[1, 2], [2.0, 4.0], [datetime.datetime(2020, 2, 24), datetime.datetime(2020,3 , 19)]],
    match_schema=True
)

Run a test insert to modify a local copy of the table to test what the table would look like after inserting the new rows.

>>> kx.q['tab'] = kx.Table([[1, 1.0, 'a'], [2, 2.0, 'b'], [3, 3.0, 'c']], columns=['a', 'b', 'c'])
>>> kx.q.insert('tab', [4, 4.0, 'd'], test_insert=True) # example of table after insert
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
4 4 d
'))
>>> kx.q('tab') # table object was not modified
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
'))

`Upsert`

Upsert(_q)

Bases: TableAppend

Helper class for the q upsert function

call

__call__(table, row, match_schema=False, test_insert=False)

Helper function around q's upsert function which inserts a row or multiple rows into a q table object.

Parameters:

Name	Type	Description	Default
`table`	`Union[str, k.SymbolAtom, k.Table]`	A `k.Table` object or the name of the table to be inserted onto.	required
`row`	`Union[List, k.List]`	A list of objects to be inserted as a row, if the table is within embedded q you may also pass in a table object to be upserted.	required
`match_schema`	`bool`	Whether the row/rows to be inserted must match the tables current schema.	`False`
`test_insert`	`bool`	Causes the function to modify a small local copy of the table and return the modified example, this can only be used with embedded q and will not modify the source tables contents.	`False`

Returns:

Type	Description
`Union[None, k.Table]`	The modified table if a `k.Table` is passed in, otherwise `None` is returned.
`Union[None, k.Table]`	If the `test_insert` keyword argument is used it returns the last 5 rows of the table
`Union[None, k.Table]`	with the new rows inserted onto the end, this does not modify the actual table object.

Raises:

Type	Description
`PyKXException`	If the `match_schema` parameter is used this function may raise an error if the row to be inserted does not match the tables schema. The error message will contain information about which columns did not match.

Examples:

Upsert a single row onto a table named tab ensuring that the row matches the tables schema. Will raise an error if the row does not match

>>> q.upsert('tab', [1, 2.0, datetime.datetime(2020, 2, 24)], match_schema=True)
>>> table = q.upsert(table, [1, 2.0, datetime.datetime(2020, 2, 24)], match_schema=True)

Upsert multiple rows onto a table named tab ensuring that each of the rows being added match the tables schema. Upserting multiple rows only works within embedded q.

>>> q.upsert(
    'tab',
    q('([] a: 1 2; b: 1.0 2.0; c: `b`c)'),
    match_schema=True
)
>>> table = q.upsert(
    table,
    [[1, 2], [2.0, 4.0], [datetime.datetime(2020, 2, 24), datetime.datetime(2020,3 , 19)]],
    match_schema=True
)

Run a test insert to modify a local copy of the table to test what the table would look like after inserting the new rows.

>>> kx.q['tab'] = kx.Table([[1, 1.0, 'a'], [2, 2.0, 'b'], [3, 3.0, 'c']], columns=['a', 'b', 'c'])
>>> kx.q.upsert('tab', [4, 4.0, 'd'], test_insert=True) # example of table after insert
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
4 4 d
'))
>>> kx.q('tab') # table object was not modified
pykx.Table(pykx.q('
a b c
-----
1 1 a
2 2 b
3 3 c
'))