Initializing the library¶
Non-PyKX Requirements¶
For the purpose of this demonstration, the following Python libraries/modules are required:
import os
import shutil
import sys
from tempfile import mkdtemp
import numpy as np
import pandas as pd
import pyarrow as pa
Initialization¶
Once installed via pip, PyKX can be started by importing the module. This will initialize embedded q within the Python process if a valid q license is found (e.g. in $QHOME or $QLIC), or fall back to the unlicensed version if no such license is found. This notebook will use the licensed version of PyKX. To force the usage of the unlicensed version (and silence the warning that is raised when the fallback to the unlicensed version is employed) you can add --unlicensed to the environment variable $QARGS. $QARGS can be set to a string of arguments which will be used to initialize the embedded q instance, as if you had used those arguments to start q from the command line.
import warnings
warnings.filterwarnings('ignore') # Do not copy this line. Symlinking PyKX to QHOME is skipped here, so the core insights libraries are not copied over, which raises warnings.
os.environ['IGNORE_QHOME'] = '1' # Ignore symlinking PyKX q libraries to QHOME
os.environ['PYKX_Q_LOADED_MARKER'] = '' # Only used here for running Notebook under mkdocs-jupyter during document generation.
import pykx as kx
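As a minimal sketch of the unlicensed option described above, the flag can be added to QARGS from Python in a fresh process, before PyKX is imported. The helper name with_unlicensed_flag is hypothetical, and the import is left commented out so that the snippet runs without PyKX installed.

```python
import os


def with_unlicensed_flag(qargs: str) -> str:
    """Return qargs with --unlicensed appended once (hypothetical helper)."""
    args = qargs.split()
    if '--unlicensed' not in args:
        args.append('--unlicensed')
    return ' '.join(args)


# Keep any arguments that are already present in QARGS.
os.environ['QARGS'] = with_unlicensed_flag(os.environ.get('QARGS', ''))

# import pykx as kx  # import only after QARGS has been set
```

Setting the variable after PyKX has already been imported would not affect the embedded q instance that is already running.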
Evaluating q code using embedded q¶
kx.q('1+1')
pykx.LongAtom(pykx.q('2'))
kx.q('1 2 3 4f')
pykx.FloatVector(pykx.q('1 2 3 4f'))
kx.q('([]2?1f;2?0Ng;2?0b)')
pykx.Table(pykx.q('
x          x1                                   x2
--------------------------------------------------
0.08388858 5ae7962d-49f2-404d-5aec-f7c8abbae288 0
0.1959907  5a580fb6-656b-5e69-d445-417ebfe71994 1
'))
kx.q('`a`b`c!(til 10;`a`b`c;5?"abc")')
pykx.Dictionary(pykx.q('
a| 0 1 2 3 4 5 6 7 8 9
b| `a`b`c
c| "cbabc"
'))
Creating q objects from Python objects¶
One of the strengths of the PyKX interface is the flexibility in the representations of objects that can be converted from a native Python representation to a q equivalent.
By default, data formatted in Python using the following libraries can be converted to an equivalent q representation:
- python native types
- numpy
- pandas
- pyarrow
These are all facilitated through use of the K method of the base q class shown before, as follows:
Atomic Structures¶
pyAtomic = 1.5
npAtomic = np.float64(1.5)
pdAtomic = pd.Series([1.5])
paAtomic = pa.array([1.5])
print(kx.K(pyAtomic))
# print(kx.K(npAtomic))
# print(kx.K(pdAtomic))
# print(kx.K(paAtomic))
1.5
Array/Series Structures¶
pyArray = [1, 2.5, "abc", b'defg']
npArray = np.array([1, 2.5, "abc", b'defg'], dtype = object)
pdSeries = pd.Series(pyArray)
paArray = pa.array([1, 2, 3])
print(kx.K(pyArray))
# print(kx.K(npArray))
# print(kx.K(pdSeries))
# print(kx.K(paArray))
1 2.5 `abc "defg"
Tabular data¶
Round-trip conversion of tabular data is currently supported for Pandas DataFrames and PyArrow tables.
pdtable = pd.DataFrame({'col1': [1, 2],
                        'col2': [2., 3.],
                        'col3': ['Hello', 'World']})
patable = pa.Table.from_pandas(pdtable)
print(kx.K(pdtable))
# print(kx.K(patable))
col1 col2 col3
---------------
1    2    Hello
2    3    World
Converting q to Python¶
All K objects support one or more of the following methods: py(), np(), pd() or pa(). These methods provide an interface to the K object such that they can be converted to an analogous Python, Numpy, Pandas or PyArrow object respectively.
Whether the result is a copy or a view varies:
- The py() method always provides a copy.
- The np() method does not copy unless the data cannot be interpreted by NumPy without changing it. For example, all temporal types in NumPy take 64 bits per item, so the 32-bit q temporal types must be copied to be represented as NumPy datetime64/timedelta64 elements. Where copying is unacceptable, set the raw keyword argument to True, as demonstrated later in this notebook.
- The pd() method leverages np() to create Pandas objects, so the same restrictions apply to it.
- The pa() method leverages pd() to create PyArrow objects, so the same restrictions apply to it.
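The copy rule for temporal types can be illustrated with NumPy alone. This is a sketch of the underlying width mismatch, not of PyKX internals; the array raw_ms is a made-up stand-in for a 32-bit q time vector.

```python
import numpy as np

# 32-bit values standing in for a q time vector (milliseconds since midnight).
raw_ms = np.array([1000, 2000, 3000], dtype=np.int32)

# Reinterpreting the same 32-bit buffer needs no copy.
as_view = raw_ms.view(np.int32)

# NumPy temporal types are 64 bits wide, so this conversion allocates new memory.
as_timedelta = raw_ms.astype('timedelta64[ms]')

print(np.shares_memory(raw_ms, as_view))       # True
print(np.shares_memory(raw_ms, as_timedelta))  # False
print(raw_ms.itemsize, as_timedelta.itemsize)  # 4 8
```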
Atomic Conversions¶
Define q items for conversion
qbool = kx.q('0b')
qguid = kx.q('"G"$"00000000-0000-0000-0000-000000000001"')
qreal = kx.q('1.5e')
qlong = kx.q('1234')
qsymb = kx.q('`test')
qchar = kx.q('"x"')
qtime = kx.q('00:00:01')
qtstamp = kx.q('rand 0p')
Convert the above items to a variety of the Python types. Change the method used to experiment as necessary
print(qbool.py())
print(qguid.pd())
print(qreal.np())
print(qlong.pa())
print(qsymb.py())
print(qchar.np())
print(qtime.pd())
print(qtstamp.np())
False
00000000-0000-0000-0000-000000000001
1.5
1234
test
b'x'
0 days 00:00:01
2002-10-07T22:38:33.923665888
Vector Conversions¶
Define q items for conversion
qbool = kx.q('2?0b')
qguid = kx.q('2?0Ng')
qreal = kx.q('2?5e')
qlong = kx.q('2?100')
qsymb = kx.q('2?`4')
qchar = kx.q('"testing"')
qtime = kx.q('2?0t')
qtstamp = kx.q('2?0p')
Convert the above items to a variety of the Python types. Change the method used to experiment as necessary
print(qbool.py())
print(qguid.pd())
print(qreal.np())
print(qlong.pa())
print(qsymb.py())
print(qchar.np())
print(qtime.pd())
print(qtstamp.np())
[True, False]
0    409031f3-b19c-6770-ee84-6e9369c98697
1    52cb20d9-f12c-9963-2829-3c64d8d8cb14
dtype: object
[4.6837516 1.3910608]
[
  17,
  23
]
['lhkp', 'mgab']
[b't' b'e' b's' b't' b'i' b'n' b'g']
0   0 days 07:57:14.764000
1   0 days 02:31:39.330000
dtype: timedelta64[ns]
['2001-08-21T22:42:03.240897056' '2002-06-11T11:57:24.452442976']
Dictionary conversions¶
Conversions between q dictionaries and Python are only supported for the py() method. Numpy, Pandas and PyArrow do not have appropriate equivalent representations and as such are not supported.
qdict=kx.q('`x`y`z!(10?10e;10?0Ng;4?`2)')
qdict.py()
{'x': [0.24513360857963562, 1.6910432577133179, 3.941082239151001, 7.263141632080078, 9.216435432434082, 1.8095358610153198, 6.43463659286499, 2.907093048095703, 0.7347807288169861, 3.1595258712768555], 'y': [UUID('a23d02b7-7ea4-d431-4dec-30173d13cc9e'), UUID('f7f1c0ee-782c-5346-f5f7-b90ec6868d41'), UUID('fa85233b-245f-5516-0cb8-391ae1e5fadd'), UUID('dc8e54ba-e30b-ab29-2df0-3fb0535f58d1'), UUID('e7bd83c0-5e9c-d21b-88c5-bbf572156409'), UUID('e115a2a4-3462-beab-42ee-ccad989b8d69'), UUID('f070dffc-bc15-0163-c551-0eba88719767'), UUID('f5c0e3d5-be69-8aa4-5f34-4195bb747a24'), UUID('d2aa3cea-afed-2fe7-9a4f-68c6683d1163'), UUID('5db6e5a1-ed7b-cf6e-26ba-3a3ee9b14c64')], 'z': ['mk', 'mb', 'ej', 'oj']}
Table conversions¶
Conversions from q keyed and unkeyed tables to an appropriate Python representation are supported for the py(), np(), pd() and pa() methods.
Round-trip conversions (q -> Python -> q) are, however, only supported for Pandas and PyArrow. Conversions from Numpy records are still to be completed. The most natural representation of a table in native Python is a dictionary, so converting that dictionary back to q returns a q dictionary rather than a table.
Define a q table containing all q data types for conversion
kx.q('N:5')
kx.q('gen_data:{@[;0;string]x#/:prd[x]?/:(`6;`6;0Ng;.Q.a),("xpdmnuvtbhijef"$\:0)}') # noqa
kx.q('dset_1D:gen_data[enlist N]')
kx.q('gen_names:{"dset_",/:x,/:string til count y}')
qtab = kx.q('flip (`$gen_names["tab";dset_1D])!N#\'dset_1D')
Convert the above table to a pandas dataframe and pyarrow table
print(qtab.pd())
print(qtab.pa())
dset_tab0 dset_tab1 dset_tab2 dset_tab3 \ 0 b'jkepoe' dhcabc 7d4a4a62-ee85-3957-d502-d30d5945bf8c b'n' 1 b'flnloj' mhodom 97022332-3c2c-c08f-763b-081206e55f36 b'u' 2 b'cmjana' mgpgga 344623c7-2767-067c-78f9-cbaed4e13927 b's' 3 b'llmeim' jokgkf c4cefb88-4673-1375-aa62-1002c9709b1a b'f' 4 b'obpbkc' kgndnf e94fbd1a-cacd-8756-12a1-747c257a637c b'l' dset_tab4 dset_tab5 dset_tab6 dset_tab7 \ 0 189 2003-01-26 07:07:44.858519280 2003-09-16 2001-06-01 1 178 2003-07-17 05:17:31.644279056 2000-02-25 2003-11-01 2 185 2001-09-28 19:15:35.311730952 2003-01-21 2002-02-01 3 0 2000-04-23 18:51:15.250207632 2000-05-22 2001-05-01 4 184 2001-09-10 13:16:47.035502048 2002-05-08 2000-12-01 dset_tab8 dset_tab9 dset_tab10 \ 0 0 days 17:50:23.583573102 0 days 06:36:00 0 days 03:10:56 1 0 days 05:41:27.927979230 0 days 17:15:00 0 days 05:51:06 2 0 days 13:36:35.802198350 0 days 17:05:00 0 days 20:02:25 3 0 days 10:14:45.688765347 0 days 04:16:00 0 days 05:32:26 4 0 days 18:29:29.243813306 0 days 23:13:00 0 days 05:25:07 dset_tab11 dset_tab12 dset_tab13 dset_tab14 \ 0 0 days 12:47:14.478000 False 4026 621935707 1 0 days 23:35:06.083000 False 27013 -985208788 2 0 days 15:45:01.412000 True -1548 1587469071 3 0 days 19:40:56.330000 False -24050 504811124 4 0 days 02:24:40.646000 False -10095 -328828822 dset_tab15 dset_tab16 dset_tab17 0 3849124023139760322 0.0 0.0 1 -7757222342666795390 0.0 0.0 2 3710706633562202461 0.0 0.0 3 -7398680130320750406 0.0 0.0 4 -1831778752393164281 0.0 0.0 pyarrow.Table dset_tab0: binary dset_tab1: string dset_tab2: extension<pykx.uuid<ArrowUUIDType>> dset_tab3: binary dset_tab4: uint8 dset_tab5: timestamp[ns] dset_tab6: timestamp[ns] dset_tab7: timestamp[ns] dset_tab8: duration[ns] dset_tab9: duration[ns] dset_tab10: duration[ns] dset_tab11: duration[ns] dset_tab12: bool dset_tab13: int16 dset_tab14: int32 dset_tab15: int64 dset_tab16: float dset_tab17: double ---- dset_tab0: [[6A6B65706F65,666C6E6C6F6A,636D6A616E61,6C6C6D65696D,6F6270626B63]] dset_tab1: 
[["dhcabc","mhodom","mgpgga","jokgkf","kgndnf"]] dset_tab2: [[7D4A4A62EE853957D502D30D5945BF8C,970223323C2CC08F763B081206E55F36,344623C72767067C78F9CBAED4E13927,C4CEFB8846731375AA621002C9709B1A,E94FBD1ACACD875612A1747C257A637C]] dset_tab3: [[6E,75,73,66,6C]] dset_tab4: [[189,178,185,0,184]] dset_tab5: [[2003-01-26 07:07:44.858519280,2003-07-17 05:17:31.644279056,2001-09-28 19:15:35.311730952,2000-04-23 18:51:15.250207632,2001-09-10 13:16:47.035502048]] dset_tab6: [[2003-09-16 00:00:00.000000000,2000-02-25 00:00:00.000000000,2003-01-21 00:00:00.000000000,2000-05-22 00:00:00.000000000,2002-05-08 00:00:00.000000000]] dset_tab7: [[2001-06-01 00:00:00.000000000,2003-11-01 00:00:00.000000000,2002-02-01 00:00:00.000000000,2001-05-01 00:00:00.000000000,2000-12-01 00:00:00.000000000]] dset_tab8: [[64223583573102,20487927979230,48995802198350,36885688765347,66569243813306]] dset_tab9: [[23760000000000,62100000000000,61500000000000,15360000000000,83580000000000]] ...
str(kx.q('([] til 3; `a`b`c)'))
'x x1\n----\n0 a \n1 b \n2 c '
repr(kx.q('"this is a char vector"'))
'pykx.CharVector(pykx.q(\'"this is a char vector"\'))'
kx.q('`atom').is_atom
True
kx.q('`not`atom').is_atom
False
print(kx.q('([]10?1f;10?1f)').t)
print(kx.q('`a`b`c!1 2 3').t)
98
99
# q list
qlist = kx.q('(1 2 3;1;"abc")')
list(qlist)
[pykx.LongVector(pykx.q('1 2 3')), pykx.LongAtom(pykx.q('1')), pykx.CharVector(pykx.q('"abc"'))]
Note the difference between this and the conversion of the same qlist to a true Python representation
qlist.py()
[[1, 2, 3], 1, b'abc']
Numerical comparisons/functions¶
Various q datatypes (vectors, atoms and tables) can also interact with native Python mathematical comparisons and functions. The following outlines a subset of the comparisons/functions that are supported:
Function | Description |
---|---|
abs | Absolute value of a number |
< | Less than |
>= | Greater than or equal to |
+ | Addition |
- | Subtraction |
/ | Division |
* | Multiplication |
** | Power |
% | Modulo |
Define q/Python atoms and lists for comparisons¶
qlong = kx.q('-5')
pylong = 5
qlist = kx.q('-3+til 5')
pylist = [1, 2, 3, 4, 5]
Apply a number of the above comparisons/functions to python/q objects in combination¶
print(abs(qlong))
print(abs(qlist))
5
3 2 1 0 1
print(qlong>pylong)
print(pylist>qlist)
0b
11111b
print(qlong*pylong)
print(pylist*qlist)
-25
-3 -4 -3 0 5
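Expressions such as pylist*qlist and pylist>qlist, with a plain Python list on the left, rely on Python's reflected operator methods. The sketch below shows that general mechanism with a toy class; Vec is hypothetical and is not how PyKX implements its types.

```python
class Vec:
    """Toy vector type (hypothetical, for illustration only)."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return iter(self.items)

    def __mul__(self, other):
        return Vec(a * b for a, b in zip(self.items, other))

    # Used for `list * Vec`, where the left operand cannot handle a Vec.
    __rmul__ = __mul__

    def __lt__(self, other):
        return [a < b for a, b in zip(self.items, other)]


qlike = Vec([-3, -2, -1, 0, 1])
pylist = [1, 2, 3, 4, 5]

product = pylist * qlike  # dispatches to Vec.__rmul__
greater = pylist > qlike  # list.__gt__ declines, so Vec.__lt__ is used reflected

print(product.items)  # [-3, -4, -3, 0, 5]
print(greater)        # [True, True, True, True, True]
```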
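The raw q -> Python conversion keyword argument¶
All of the interfaces to the K objects support the raw keyword argument. When raw is set to True, the interface forgoes some conversion features in exchange for greater efficiency. In the examples below, for instance, timestamps and times come back as their underlying integers rather than as datetime or timedelta values, and GUIDs appear as complex numbers.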
tab = kx.q('([]10?1f;10?1f;10?0p;10?0Ng)')
tab.pd()
| | x | x1 | x2 | x3 |
---|---|---|---|---|
0 | 0.709340 | 0.181136 | 2000-04-05 13:46:49.650548548 | d94ce3f9-6f93-288a-cb81-5eb7ee397d63 |
1 | 0.945220 | 0.232966 | 2002-05-22 18:38:16.360286320 | 0652b585-c955-5cdf-57db-28a8bf6c67f6 |
2 | 0.709242 | 0.250046 | 2002-07-19 14:45:16.888475424 | ee711a5f-3ecc-a92c-aae6-f79638c980f4 |
3 | 0.002184 | 0.073727 | 2000-11-14 15:00:53.104459496 | da70a89d-6417-d4b3-3fc6-e35a1c348c5c |
4 | 0.066705 | 0.318664 | 2002-01-11 13:35:29.320301864 | 3327b1e8-4040-cd34-c49e-587be546e134 |
5 | 0.691834 | 0.187263 | 2001-09-02 06:12:47.997664960 | 2bf1dd85-e832-7955-bd9a-246a0b17fbf8 |
6 | 0.430133 | 0.841629 | 2000-01-29 01:29:19.141581505 | f05fd28a-54a1-959c-997c-6c571ff2a0f3 |
7 | 0.866410 | 0.725071 | 2003-07-04 02:04:27.387096576 | e845f01d-008b-ded0-3383-f19b1d18abfb |
8 | 0.528111 | 0.481804 | 2003-01-11 19:58:49.533226048 | 199b1ac1-6f54-a161-49e7-f7070054626b |
9 | 0.064949 | 0.935131 | 2002-05-31 00:22:29.216298752 | 867fb02f-f294-c3cb-a6da-e15dc8e03a84 |
tab.pd(raw=True)
| | x | x1 | x2 | x3 |
---|---|---|---|---|
0 | 0.709340 | 0.181136 | 8257609650548548 | -9.989958e-260+1.764783e+171j |
1 | 0.945220 | 0.232966 | 75407896360286320 | -2.318797e+151-2.305059e+262j |
2 | 0.709242 | 0.250046 | 80405116888475424 | 1.545947e-93-1.538346e+253j |
3 | 0.002184 | 0.073727 | 27529253104459496 | -5.001158e-59+6.559820e+137j |
4 | 0.066705 | 0.318664 | 64071329320301864 | 2.385895e-540+5.636864e-540j |
5 | 0.691834 | 0.187263 | 52726367997664960 | 5.643902e+103-5.861994e+274j |
6 | 0.430133 | 0.841629 | 2424559141581505 | -5.597098e-171-9.478689e+248j |
7 | 0.866410 | 0.725071 | 110599467387096576 | -3.621514e+81-5.157061e+287j |
8 | 0.528111 | 0.481804 | 95630329533226048 | 1.949135e+162+1.882977e+209j |
9 | 0.064949 | 0.935131 | 76119749216298752 | -9.602897e+56-2.758049e-288j |
qvec = kx.q('10?0t')
qvec.np()
array([65372544, 29998169, 39540065, 74409995, 75472099, 12465539, 83125406, 62120318, 34932482, 65141391], dtype='timedelta64[ms]')
qvec.np(raw=True)
array([65372544, 29998169, 39540065, 74409995, 75472099, 12465539, 83125406, 62120318, 34932482, 65141391], dtype=int32)
v = kx.q('12?100')
print(v)
28 17 11 55 51 81 68 96 61 70 70 39
Get the element at index 2
v[2]
pykx.LongAtom(pykx.q('11'))
Retrieve a slice containing elements 3-5
v[3:6]
pykx.LongVector(pykx.q('55 51 81'))
Compare all vector elements to 50
v < 50
pykx.BooleanVector(pykx.q('111000000001b'))
Tables¶
This only applies to in-memory tables
tab = kx.q('([]4?5;4?`2;4?0p;4?0Ng)')
tab.pd()
| | x | x1 | x2 | x3 |
---|---|---|---|---|
0 | 3 | mh | 2003-12-13 19:09:53.390793360 | cc610c4d-1a77-8f5a-fc6e-1dc076b7fa4e |
1 | 3 | ko | 2001-08-06 06:41:37.683153304 | 5bb7e970-d360-1e8c-4c26-ff9758daa116 |
2 | 0 | cl | 2002-11-27 03:59:28.103378272 | 0456c72d-3835-63c3-2009-6ded9ea8ec1f |
3 | 1 | ao | 2001-12-14 21:32:15.907946528 | e1e7bb5d-9f2a-b5b3-600b-1cbab5d9db7a |
tab['x1']
pykx.SymbolVector(pykx.q('`mh`ko`cl`ao'))
tab['x2'].py()
[datetime.datetime(2003, 12, 13, 19, 9, 53, 390793), datetime.datetime(2001, 8, 6, 6, 41, 37, 683153), datetime.datetime(2002, 11, 27, 3, 59, 28, 103378), datetime.datetime(2001, 12, 14, 21, 32, 15, 907946)]
Splayed and Partitioned Tables¶
Splayed and partitioned tables are at present only partially supported. Users can query the data and access information about the columns through the keys method, but cannot retrieve the values contained within the data or convert it to an analogous Python representation. Attempting either raises a NotImplementedError.
Research on this is still pending, and any changes to support these conversions will be reflected in an update here.
Splayed Tables¶
tmp_dir = mkdtemp()
orig_dir = os.getcwd()
os.chdir(tmp_dir)
kx.q('`:db/t/ set ([] a:til 3; b:"xyz"; c:-3?0Ng)')
kx.q(r'\l db')
t_splayed = kx.q('t')
List the columns that are represented in the splayed table
list(t_splayed.keys())
[pykx.SymbolAtom(pykx.q('`a')), pykx.SymbolAtom(pykx.q('`b')), pykx.SymbolAtom(pykx.q('`c'))]
Query the Splayed table
kx.q('?[`t;enlist(=;`a;1);0b;()]')
pykx.Table(pykx.q('
a b c
----------------------------------------
1 y 27b39298-7aab-d6d2-5423-d62f5931b575
'))
Attempt to evaluate the values method on the table
try:
    t_splayed.values()
except NotImplementedError:
    print('NotImplementedError was raised', file=sys.stderr)
NotImplementedError was raised
os.chdir(orig_dir)
shutil.rmtree(tmp_dir)
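The mkdtemp/chdir/rmtree sequence above leaves the process in the temporary directory if any step in between raises. A minimal standard-library sketch that restores the working directory and removes the directory even on error is shown here; temporary_cwd is a hypothetical helper name, not part of PyKX.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_cwd():
    """Run a block inside a fresh temporary directory (hypothetical helper)."""
    orig_dir = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp_dir:
        os.chdir(tmp_dir)
        try:
            yield tmp_dir
        finally:
            # Leave the directory before it is deleted.
            os.chdir(orig_dir)


before = os.getcwd()
with temporary_cwd() as tmp_dir:
    inside = os.getcwd()
    # The splayed table would be written and loaded with kx.q here.
after = os.getcwd()
```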
Partitioned Tables¶
tmp_dir = mkdtemp()
orig_dir = os.getcwd()
os.chdir(tmp_dir)
kx.q('`:db/2020.01/t/ set ([] a:til 3; b:"xyz"; c:-3?0Ng)')
kx.q('`:db/2020.02/t/ set ([] a:1+til 3; b:"cat"; c:-3?0Ng)')
kx.q('`:db/2020.03/t/ set ([] a:2+til 3; b:"bat"; c:-3?0Ng)')
kx.q(r'\l db')
t_partitioned = kx.q('t')
print(t_partitioned)
month   a b c
------------------------------------------------
2020.01 0 x 22952bbc-9724-36c5-26e2-4badec1902b3
2020.01 1 y dddf93ed-2e6a-4586-dcff-5168e939f7a4
2020.01 2 z 161527d4-c8aa-0d51-4a14-020608c28bc4
2020.02 1 c f349b2c2-f0c3-6463-4fef-d8718098a969
2020.02 2 a bae989a6-0210-cb99-d2cc-34a20572151e
2020.02 3 t d1371d82-a1b6-ddbe-a427-0ae2696dc4c1
2020.03 2 b 8dce6f11-7539-98ff-4b2b-2d5f3819a3e9
2020.03 3 a 643d8bb9-2d8e-cecd-eccc-28b2ccb88006
2020.03 4 t 08c0000f-4165-ab2a-0e9f-99b82fa4d2de
List partitioned table columns
list(t_partitioned.keys())
[pykx.SymbolAtom(pykx.q('`a')), pykx.SymbolAtom(pykx.q('`b')), pykx.SymbolAtom(pykx.q('`c'))]
Query partitioned table
kx.q('?[`t;enlist(=;`a;1);0b;enlist[`c]!enlist`c]')
pykx.Table(pykx.q('
c
------------------------------------
dddf93ed-2e6a-4586-dcff-5168e939f7a4
f349b2c2-f0c3-6463-4fef-d8718098a969
'))
Attempt to convert partitioned table to a pandas dataframe
try:
    t_partitioned.pd()
except NotImplementedError:
    pass
os.chdir(orig_dir)
shutil.rmtree(tmp_dir)
q Functions¶
All functions defined in q can be called from PyKX via function objects. These function calls can take Python or q objects as input arguments. It is required that each argument being supplied to the function be convertible to a q representation using kx.K(arg).
Arguments can be provided either positionally, or as keyword arguments when the q function has named parameters.
f = kx.q('{x*y+z}')
f(12, 2, 1)
pykx.LongAtom(pykx.q('36'))
f(12, 2, 1).py()
36
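The result is 36 rather than 25 because q evaluates expressions right to left with no operator precedence, so x*y+z means x*(y+z). A Python equivalent of the lambda makes this explicit; the name f_py is illustrative only.

```python
def f_py(x, y, z):
    # q reads x*y+z right to left: first y+z, then x times that sum.
    return x * (y + z)


result = f_py(12, 2, 1)
print(result)  # 36
```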
g = kx.q('{[arg1;arg2] deltas sum each arg1 cross til arg2}')
g(arg2=7, arg1=kx.q('3?45')).np()
array([ 6, 1, 1, 1, 1, 1, 1, -9, 1, 1, 1, 1, 1, 1, 26, 1, 1, 1, 1, 1, 1])
tok = kx.q("$'")
print(repr(tok))
print(str(tok))
pykx.Each(pykx.q('$''))
$'
tok(kx.q('"B"'), kx.q('" ",.Q.an')).np()
array([False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, True, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, True, True, False, False, False, True, False, False, False, False, False, False, False, False])
Context Interface¶
The context interface provides a convenient way to interact with q contexts and namespaces using either the embedded q instance pykx.q or an IPC connection made with pykx.QConnection.
Accessing an attribute which is not defined via the context interface, but which corresponds to a script (i.e. a .q or .k file), will cause it to be loaded automatically. Scripts are searched for in:
- The same directory as the process running PyKX
- QHOME
Other paths can be searched by appending them to kx.q.paths. Alternatively, you can manually load a script with kx.q.ctx._register.
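The lookup described above can be pictured with a short standard-library sketch. It is not PyKX's implementation: the function find_script, the order in which directories are tried, and the preference for .q over .k are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def find_script(name: str, search_paths: Iterable[Path]) -> Optional[Path]:
    """Return the first name.q or name.k found in search_paths, else None."""
    for directory in search_paths:
        for suffix in ('.q', '.k'):
            candidate = Path(directory) / f'{name}{suffix}'
            if candidate.is_file():
                return candidate
    return None


# Demonstrate with a throwaway directory standing in for QHOME.
with tempfile.TemporaryDirectory() as fake_qhome:
    (Path(fake_qhome) / 'demo_extension.q').write_text('N:100\n')
    paths = [Path(fake_qhome) / 'no_such_dir', Path(fake_qhome)]
    found = find_script('demo_extension', paths)
    found_name = found.name if found else None
    missing = find_script('not_there', paths)
```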
Functions which are registered via the context interface are automatically added as callable members of their QContext.
Builtin namespaces¶
As a result of the infrastructure outlined above, a number of namespaces are automatically added as extensions to the q base class on loading. These include the .q, .z, .Q and .j namespaces contained within kx.q.k. The following provides some example invocations of each.
A number of the functions contained within the .z namespace are not callable, including but not limited to the following:
- .z.ts
- .z.ex
- .z.ey
Run dir(kx.q.z) to see what is available in the .z namespace.
.q functionality¶
All the functions a user would expect to be exposed from q are callable as Python methods on the q base class. The following provides a limited number of example invocations.
print(kx.q.til(10))
0 1 2 3 4 5 6 7 8 9
print(kx.q.max([100, 2, 3, -4]))
100
print(kx.q.mavg(4, kx.q.til(10)))
0 0.5 1 1.5 2.5 3.5 4.5 5.5 6.5 7.5
print(kx.q.tables())
,`t
s = kx.q('([]a:1 2;b:2 3;c:5 7)')
s
pykx.Table(pykx.q('
a b c
-----
1 2 5
2 3 7
'))
t = kx.q('([]a:1 2 3;b:2 3 7;c:10 20 30;d:"ABC")').pd()
t
| | a | b | c | d |
---|---|---|---|---|
0 | 1 | 2 | 10 | b'A' |
1 | 2 | 3 | 20 | b'B' |
2 | 3 | 7 | 30 | b'C' |
print(kx.q.uj(s,t))
a b c  d
--------
1 2 5
2 3 7
1 2 10 A
2 3 20 B
3 7 30 C
.Q namespace¶
The functions within the .Q namespace are also exposed as an extension.
Note: While all functions within the .Q namespace are available, some of them are more complicated to use within the constraints of the PyKX interface than those in the .q/.z namespaces. For example, .Q.dpft can be used but requires some thought.
kx.q.Q
<pykx.ctx.QContext of .Q with [ajf0, k, K, host, addr, gc, ts, gz, w, res, addmonths, Xf, Cf, f, fmt, pykxld, ff, fl, opt, def, ld, qt, v, qp, V, ft, ord, nv, tx, tt, fk, t, ty, nct, fu, fc, A, a, n, nA, an, b6, Aa, unm, id, j10, x10, j12, x12, btoa, sha1, prf0, objp, lo, l, sw, tab, t0, s1, s2, S, s, hap, hmb, hg, hp, a1, a0, IN, qa, qb, vt, bv, pm, pt, MAP, dd, d0, p1, p2, p, view, jp, rp, L, cn, pcnt, dt, ind, fp, foo, a2, qd, xy, x1, x0, x2, ua, q0, qe, ps, enxs, enx, en, ens, par, dpts, dpt, dpfts, dpft, hdpf, fsn, fs, fpn, fps, dsftg, M, chk, Ll, Lp, Lx, Lu, Ls, fqk, fql, btx, bt, sbt, trp, dr, dw, pl0, pl, jl8, srr, prr, lu, DL, dbg, err, BP, bp, bs, bu, bd, bc, x, D, d, PV, PD, pf, pd, pv, u, pn]>
kx.q.Q.an
pykx.CharVector(pykx.q('"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789"'))
kx.q.Q.btoa(b'Hello World!')
pykx.CharVector(pykx.q('"SGVsbG8gV29ybGQh"'))
t = kx.q('([]a:3 4 5;b:"abc";c:(2;3.4 3.2;"ab"))')
kx.q.each(kx.q.Q.ty, t['a','b','c'])
pykx.List(pykx.q(''))
.j namespace¶
json = b'{"x":1, "y":"test"}'
qdict = kx.q.j.k(json)
print(qdict)
x| 1f
y| "test"
kx.q.j.j(qdict).py()
b'{"x":1,"y":"test"}'
User defined extensions¶
As alluded to above, users can add their own extension modules to PyKX by placing a relevant .q or .k file in their $QHOME. The following shows the addition of an extension that completes a specific query and sets some data which we would like to be available.
Extension Example¶
In the following example we will create (and later delete) the file '$QHOME/demo_extension.q'.
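# A raw string keeps the q system command \d from being read as a Python escape sequence.
# Continuation lines of the q function must be indented for the script to load.
demo_extension_source = r'''
\d .demo_extension
N:100
test_data:([]N?`a`b`c;N?1f;N?10;N?0b)
test_function:{[data]
    analytic_keys :`max_x1`avg_x2`med_x3;
    analytic_calcs:(
        (max;`x1);
        (avg;`x2);
        (med;`x3));
    ?[data;
        ();
        k!k:enlist `x;
        analytic_keys!analytic_calcs
        ]
    }
'''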
demo_extension_filename = kx.qhome/'demo_extension.q'
with open(demo_extension_filename, 'w') as f:
    f.write(demo_extension_source)
print(kx.q.demo_extension.test_data)
x x1 x2 x3 ----------------- c 0.3663844 2 0 a 0.8598177 4 1 b 0.8000145 5 1 a 0.7978511 5 0 a 0.6615992 6 0 c 0.7920702 9 0 c 0.3529016 2 0 a 0.437764 6 1 c 0.7263137 9 0 a 0.9462631 4 0 b 0.6418491 0 0 b 0.9275927 4 0 c 0.1331516 5 1 a 0.8788179 6 0 b 0.9756565 6 1 b 0.1886663 0 1 a 0.1096443 1 1 c 0.1805561 4 1 c 0.5187677 7 1 b 0.8834092 6 1 ..
kx.q.demo_extension.test_function
pykx.SymbolicFunction(pykx.q('`.demo_extension.test_function'))
print(kx.q.demo_extension.test_function(kx.q.demo_extension.test_data))
x| max_x1    avg_x2   med_x3
-| -------------------------
a| 0.9986046 5        0
b| 0.9879205 4.272727 1
c| 0.9846962 5.514286 0
os.remove(demo_extension_filename)
Querying Interface¶
One of the core purposes of this module is to provide users who are unfamiliar with q with Pythonic approaches to interacting with q objects.
One of the ways this is intended to be achieved is to provide Pythonic wrappers around common q tasks in a way that feels familiar to a Python developer but is still efficient/flexible.
The querying interface is an example of this. It provides a wrapper around the q functional select syntax to facilitate the querying of persisted and local data while also allowing Python objects to be used as inputs where it is relevant.
help is provided¶
Users can use the Python help function to display the docstring associated with each of the functions within the query module
# help(kx.q.qsql)
# help(kx.q.qsql.select)
# help(kx.q.qsql.exec)
# help(kx.q.qsql.update)
# help(kx.q.qsql.delete)
Select functionality¶
The select functionality is provided both as an individually callable function and as a method on all tabular data.
Generate a table and assign the Python object as a named entity within the q memory space.
qtab = kx.q('([]col1:100?`a`b`c;col2:100?1f;col3:100?5)')
kx.q['qtab'] = qtab
Retrieve the entirety of the table using an empty select
print(qtab.select())
col1 col2 col3 -------------------- b 0.5707861 0 b 0.7758173 2 a 0.9687589 1 b 0.1563845 0 c 0.4655548 0 c 0.8455166 2 b 0.7281041 3 a 0.7403385 0 b 0.5199511 0 b 0.199172 2 c 0.9548708 4 c 0.4981119 1 b 0.08997612 3 b 0.2274166 1 a 0.86544 2 a 0.3112134 4 c 0.3520122 3 b 0.4485896 1 b 0.6742543 3 b 0.2357538 1 ..
Retrieve the entire table using the module function
print(kx.q.qsql.select(qtab))
col1 col2 col3 -------------------- b 0.5707861 0 b 0.7758173 2 a 0.9687589 1 b 0.1563845 0 c 0.4655548 0 c 0.8455166 2 b 0.7281041 3 a 0.7403385 0 b 0.5199511 0 b 0.199172 2 c 0.9548708 4 c 0.4981119 1 b 0.08997612 3 b 0.2274166 1 a 0.86544 2 a 0.3112134 4 c 0.3520122 3 b 0.4485896 1 b 0.6742543 3 b 0.2357538 1 ..
Retrieve the entire table based on a named reference
This is important because it provides a method of querying partitioned/splayed tables
print(kx.q.qsql.select('qtab'))
col1 col2 col3 -------------------- b 0.5707861 0 b 0.7758173 2 a 0.9687589 1 b 0.1563845 0 c 0.4655548 0 c 0.8455166 2 b 0.7281041 3 a 0.7403385 0 b 0.5199511 0 b 0.199172 2 c 0.9548708 4 c 0.4981119 1 b 0.08997612 3 b 0.2274166 1 a 0.86544 2 a 0.3112134 4 c 0.3520122 3 b 0.4485896 1 b 0.6742543 3 b 0.2357538 1 ..
The where keyword
Where clauses can be provided as a named keyword and are expected to be formatted as an individual string or a list of strings as in the following examples.
By default no where conditions are applied to a select query
# print(kx.q.qsql.select(qtab, where='col1=`a'))
print(kx.q.qsql.select(qtab, where=['col3<0.5', 'col2>0.7']))
col1 col2      col3
-------------------
a    0.7403385 0
b    0.8153197 0
b    0.7902208 0
a    0.7276113 0
The columns keyword
The columns keyword is used to apply analytics to specific columns of the data or to select and rename columns within the dataset.
By default if a user does not provide this information it is assumed that all columns are to be returned without modification.
The columns keyword is expected to be a dictionary mapping the name that the new table will display for the column to the logic with which this data is modified.
kx.q.qsql.select(qtab, columns={'col1': 'col1','newname': 'col2'})
pykx.Table(pykx.q(' col1 newname --------------- b 0.5707861 b 0.7758173 a 0.9687589 b 0.1563845 c 0.4655548 c 0.8455166 b 0.7281041 a 0.7403385 b 0.5199511 b 0.199172 c 0.9548708 c 0.4981119 b 0.08997612 b 0.2274166 a 0.86544 a 0.3112134 c 0.3520122 b 0.4485896 b 0.6742543 b 0.2357538 .. '))
kx.q.qsql.select(qtab, columns={'max_col2': 'max col2'}, where='col1=`a')
pykx.Table(pykx.q('
max_col2
---------
0.9687589
'))
The by keyword
The by keyword is used to apply analytics to groups of data based on common characteristics.
By default, if a user does not provide this information, it is assumed that no grouping is applied.
The by keyword is expected to be a dictionary mapping the name to be applied to the by clause of the grouping to the column of the original table which is being used for the grouping.
kx.q.qsql.select(
    qtab,
    columns={'minCol2': 'min col2', 'medCol3': 'med col3'},
    by={'groupCol1': 'col1'},
    where=['col3<0.5', 'col2>0.7']
)
pykx.KeyedTable(pykx.q('
groupCol1| minCol2   medCol3
---------| -----------------
a        | 0.7276113 0
b        | 0.7902208 0
'))
Delete functionality¶
The delete functionality is provided both as an individually callable function and as a method on all tabular data.
The following provides an outline of how this can be invoked in both cases.
Note: By default the delete functionality does not modify the underlying representation of the table. Doing so is possible under limited circumstances, as outlined in the section on the modify keyword below.
print(kx.q.qsql.delete(qtab))
print(kx.q.qsql.delete('qtab'))
col1 col2 col3
--------------
col1 col2 col3
--------------
The columns keyword
The columns keyword is used to denote the columns that are to be deleted from a table.
By default if a user does not provide this information it is assumed that all columns are to be deleted.
The columns keyword is expected to be a string or list of strings denoting the columns to be deleted.
Note: The columns and where clauses cannot be used in the same function call, as this is not supported by the underlying functional delete.
# print(kx.q.qsql.delete(qtab, columns = 'col3'))
print(kx.q.qsql.delete(qtab, columns = ['col1','col2']))
col3 ---- 0 2 1 0 0 2 3 0 0 2 4 1 3 1 2 4 3 1 3 1 ..
The where keyword
The where keyword is used to filter rows of the data to be deleted.
By default if no where condition is supplied it is assumed that all rows of the dataset are to be deleted.
When supplied, the where keyword is expected to be a string containing the filter condition.
Note: The columns and where clauses cannot be used in the same function call, as this is not supported by the underlying functional delete.
print(kx.q.qsql.delete(qtab, where='col1 in `a`b'))
col1 col2 col3 -------------------- c 0.4655548 0 c 0.8455166 2 c 0.9548708 4 c 0.4981119 1 c 0.3520122 3 c 0.7824787 2 c 0.2080171 2 c 0.446898 3 c 0.6990336 3 c 0.4418975 4 c 0.7932503 1 c 0.6648273 3 c 0.3791373 0 c 0.6217243 3 c 0.5569152 2 c 0.3877172 1 c 0.7258795 2 c 0.03947309 4 c 0.1404332 3 c 0.6829453 4 ..
The modify keyword
The modify keyword is used when the user intends for the underlying representation of a named entity within the q memory space to be modified. This is only applicable when calling the function via the kx.q.qsql.delete representation of the function.
By default the underlying representation is not modified (modify=False). To change the underlying representation, a user must set modify=True.
kx.q.qsql.delete('qtab', where = 'col1=`c', modify=True)
pykx.SymbolAtom(pykx.q('`qtab'))
print(kx.q('qtab'))
col1 col2 col3 -------------------- b 0.5707861 0 b 0.7758173 2 a 0.9687589 1 b 0.1563845 0 b 0.7281041 3 a 0.7403385 0 b 0.5199511 0 b 0.199172 2 b 0.08997612 3 b 0.2274166 1 a 0.86544 2 a 0.3112134 4 b 0.4485896 1 b 0.6742543 3 b 0.2357538 1 b 0.7589261 4 b 0.3186598 4 a 0.573785 3 a 0.1137676 3 b 0.7053699 1 ..
Update and exec functionality¶
Both the q functional update and exec functionality are supported by this interface. For brevity they are not shown in the same detail as the previous examples.
# kx.q.qsql.exec(qtab, 'col1')
# kx.q.qsql.exec(qtab, columns='col2', by='col1')
kx.q.qsql.exec(qtab, columns={'avgCol3': 'avg col3'}, by='col1')
pykx.Dictionary(pykx.q('
 | avgCol3
-| --------
a| 1.8
b| 1.888889
c| 2.461538
'))
# print(kx.q.qsql.update(qtab, {'avg_col2':'avg col2'}, by={'col1': 'col1'}))
# print(kx.q.qsql.update(qtab, {'col3':100}, where='col1=`a'))
kx.q.qsql.update('qtab', {'col2': 4.2}, 'col1=`b', modify=True)
print(kx.q['qtab'])
col1 col2 col3 ------------------- b 4.2 0 b 4.2 2 a 0.9687589 1 b 4.2 0 b 4.2 3 a 0.7403385 0 b 4.2 0 b 4.2 2 b 4.2 3 b 4.2 1 a 0.86544 2 a 0.3112134 4 b 4.2 1 b 4.2 3 b 4.2 1 b 4.2 4 b 4.2 4 a 0.573785 3 a 0.1137676 3 b 4.2 1 ..
Establishing a Connection¶
Connections to external q processes are established using the pykx.QConnection class. On initialization, the instance of this class will establish a connection to the specified q process using the provided connection information (e.g. host, port, username, password, etc.). Refer to the PyKX IPC module documentation for more details about this interface, or run help(pykx.QConnection).
IPC Example¶
The following is a basic example of this functionality. A more complex subscriber/publisher example is provided in examples/ipc/
This example will work in the presence or absence of a valid q license
Create the external q process¶
To run this example, the Python code in the following cell does the equivalent of executing the following in a terminal:
$ q -p 5000
q)tab:([]100?`a`b`c;100?1f;100?0Ng)
q).z.ps:{[x]0N!(`.z.ps;x);value x}
q).z.pg:{[x]0N!(`.z.pg;x);value x}
import subprocess
import time
proc = subprocess.Popen(
    ('q', '-p', '5000'),
    stdin=subprocess.PIPE,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
proc.stdin.write(b'tab:([]100?`a`b`c;100?1f;100?0Ng)\n')
proc.stdin.write(b'.z.ps:{[x]0N!(`.z.ps;x);value x}\n')
proc.stdin.write(b'.z.pg:{[x]0N!(`.z.pg;x);value x}\n')
proc.stdin.flush()
time.sleep(2)
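A fixed two-second sleep can be too short on a slow machine and is longer than needed on a fast one. One alternative, sketched here with the standard library only, is to poll the port until it accepts TCP connections. The helper wait_for_port is hypothetical, and the sketch assumes that a brief probe connection which is opened and closed without a handshake is harmless to the q process.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Poll until host:port accepts TCP connections or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            # Not listening yet; wait briefly before trying again.
            time.sleep(0.1)
    return False
```

In the cell above, the call wait_for_port('localhost', 5000) could then stand in for the sleep, with a False result treated as a startup failure.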
Open a connection to this process¶
# Normally a `with` block would be used for proper context management, but for the sake of this example the connection will be accessed and closed directly
conn = kx.QConnection('localhost', 5000)
Make a simple synchronous request¶
qvec = conn('2+til 2')
qvec
pykx.LongVector(pykx.q('2 3'))
Make a simple asynchronous request¶
conn('setVec::10?1f', wait=False)
setVec = conn('setVec')
setVec
pykx.FloatVector(pykx.q('0.2144001 0.820994 0.07424075 0.8202035 0.6618763 0.9585253 0.579547 0.705332..'))
Run a defined function server side with provided arguments¶
pytab = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
conn('{[table;column;rows]rows#column#table}', pytab, ['col1'], 1).pd()
| | col1 |
---|---|
0 | 1 |
conn('{[table;column]newtab::table column}', pytab, 'col1', wait=False)
pykx.Identity(pykx.q('::'))
conn('newtab').np()
array([1, 2, 3])
Disconnect from the q process¶
conn.close()
# This happens automatically when you leave a `with` block that is managing a connection, or when a connection is garbage-collected.
# Shutdown the q process we were connected to for the IPC demo
proc.stdin.close()
proc.kill()
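Calling kill() without waiting can leave the child process unreaped. A slightly more careful shutdown, sketched with the standard library, asks the process to terminate, waits for it, and only then escalates. The helper stop_process is hypothetical, and a short-lived Python child stands in for q so that the sketch runs anywhere.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Terminate proc, escalate to kill on timeout, and return its exit code."""
    if proc.stdin is not None:
        proc.stdin.close()
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
    return proc.returncode


# Stand-in child process so the sketch runs without q installed.
child = subprocess.Popen(
    [sys.executable, '-c', 'import time; time.sleep(60)'],
    stdin=subprocess.PIPE,
)
exit_code = stop_process(child)
```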