Create and convert PyKX objects
This page provides details on how to generate and convert PyKX objects.
Tip: For the best experience, we recommend reading PyKX objects and attributes first.
To use the power of q and the functionality provided by PyKX, at some point you must interact with PyKX objects. At their most basic level, PyKX objects are C representations of q/kdb+ objects, allocated within a memory space managed by q. Keeping the data in this format allows it to be used directly for query/analytic execution in q without any translation overhead.
1. Create PyKX objects
There are five ways to create PyKX objects:
- a. Convert Python objects to PyKX objects
- b. Generate data using PyKX inbuilt functions
- c. Evaluate q code using kx.q
- d. Retrieve a named entity from q's memory
- e. Query an external q session
1.a Convert Python objects to PyKX objects
The simplest way to create a PyKX object is to convert a similar Python type. You can do this with the pykx.toq function, which supports conversions from Python, NumPy, pandas, and PyArrow types to PyKX objects. Conversion examples for each of these follow.
Specify target types
When converting Pythonic objects to PyKX types, you can use the ktype named argument:
- To convert lists/atomic elements, pass a PyKX type, for example kx.FloatVector.
- To convert pandas DataFrames or PyArrow Tables, pass a dictionary mapping each column name to a PyKX type.
Python types:
>>> import pykx as kx
>>> pyatom = 2
>>> pylist = [1, 2, 3]
>>> pydict = {'x': [1, 2, 3], 'y': {'x': 3}}
>>>
>>> kx.toq(pyatom)
pykx.LongAtom(pykx.q('2'))
>>> kx.toq(pylist)
pykx.LongVector(pykx.q('1 2 3'))
>>> kx.toq(pylist, kx.FloatVector)
pykx.FloatVector(pykx.q('1 2 3f'))
>>> kx.toq(pydict)
pykx.Dictionary(pykx.q('
x| (1;2;3)
y| (,`x)!,3
'))
NumPy types:
>>> import pykx as kx
>>> import numpy as np
>>> nparray1 = np.array([1, 2, 3])
>>> nparray2 = np.array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='datetime64')
>>> nparray3 = np.array([[1, 2, 3], [4, 5, 6]], np.int32)
>>>
>>> kx.toq(nparray1)
pykx.LongVector(pykx.q('1 2 3'))
>>> kx.toq(nparray1, kx.FloatVector)
pykx.FloatVector(pykx.q('1 2 3f'))
>>> kx.toq(nparray2)
pykx.DateVector(pykx.q('2007.07.13 2006.01.13 2010.08.13'))
>>> kx.toq(nparray3)
pykx.List(pykx.q('
1 2 3
4 5 6
'))
pandas types:
>>> import pykx as kx
>>> import pandas as pd
>>> import numpy as np
>>> pdseries1 = pd.Series([1, 2, 3])
>>> pdseries2 = pd.Series([1, 2, 3], dtype=np.int32)
>>> df = pd.DataFrame.from_dict({'x': [1, 2], 'y': ['a', 'b']})
>>> kx.toq(pdseries1)
pykx.LongVector(pykx.q('1 2 3'))
>>> kx.toq(pdseries1, kx.FloatVector)
pykx.FloatVector(pykx.q('1 2 3f'))
>>> kx.toq(pdseries2)
pykx.IntVector(pykx.q('1 2 3i'))
>>> kx.toq(df)
pykx.Table(pykx.q('
x y
---
1 a
2 b
'))
>>> kx.toq(df).dtypes
pykx.Table(pykx.q('
columns type
-----------------------
x "kx.LongAtom"
y "kx.SymbolAtom"
'))
>>> kx.toq(df, ktype={'x': kx.FloatVector}).dtypes
pykx.Table(pykx.q('
columns type
-----------------------
x "kx.FloatAtom"
y "kx.SymbolAtom"
'))
PyArrow types:
>>> import pykx as kx
>>> import pyarrow as pa
>>> arr = pa.array([1, 2, None, 3])
>>> nested_arr = pa.array([[], None, [1, 2], [None, 1]])
>>> dict_arr = pa.array([{'x': 1, 'y': True}, {'z': 3.4, 'x': 4}])
>>> kx.toq(arr)
pykx.FloatVector(pykx.q('1 2 0n 3'))
>>> kx.toq(nested_arr)
pykx.List(pykx.q('
`float$()
::
1 2f
0n 1
'))
>>> kx.toq(dict_arr)
pykx.List(pykx.q('
x y z
--------
1 1b ::
4 :: 3.4
'))
>>>
>>> n_legs = pa.array([2, 4, 5, 100])
>>> animals = pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
>>> names = ["n_legs", "animals"]
>>> tab = pa.Table.from_arrays([n_legs, animals], names=names)
>>> kx.toq(tab)
pykx.Table(pykx.q('
n_legs animals
--------------------
2 Flamingo
4 Horse
5 Brittle stars
100 Centipede
'))
>>> kx.toq(tab).dtypes
pykx.Table(pykx.q('
columns type
-----------------------
n_legs "kx.LongAtom"
animals "kx.SymbolAtom"
'))
>>> kx.toq(tab, {'animals': kx.CharVector}).dtypes
pykx.Table(pykx.q('
columns type
-----------------------
n_legs "kx.LongAtom"
animals "kx.CharVector"
'))
By default, Python strings converted to PyKX are returned as pykx.SymbolAtom objects. This ensures a clear distinction between str (string) and bytes objects. However, you might prefer Python strings to be returned as pykx.CharVector objects, for memory efficiency or greater flexibility in analytic development. To do this, use the keyword argument strings_as_char, which ensures that all str objects are converted to pykx.CharVector objects.
>>> import pykx as kx
>>> kx.toq('str', strings_as_char=True)
pykx.CharVector(pykx.q('"str"'))
>>> kx.toq({'a': {'b': 'test'}, 'b': 'test1'}, strings_as_char=True)
pykx.Dictionary(pykx.q('
a| (,`b)!,"test"
b| "test1"
'))
1.b Generate data using PyKX inbuilt functions
If you want to generate objects directly, for example to prototype quickly without being familiar with q, several helper functions are available.
Create a vector of random floating-point values:
>>> kx.random.random(3, 10.0)
pykx.FloatVector(pykx.q('9.030751 7.750292 3.869818'))
Additionally, you can pass PyKX null or infinite values when generating random data to draw values from larger data ranges:
>>> kx.random.random(2, kx.GUIDAtom.null)
pykx.GUIDVector(pykx.q('8c6b8b64-6815-6084-0a3e-178401251b68 5ae7962d-49f2-404d-5aec-f7c8abbae288'))
>>> kx.random.random(3, kx.IntAtom.inf)
pykx.IntVector(pykx.q('986388794 824432196 2022020141i'))
Create a two-dimensional list of random symbol values:
>>> kx.random.random([2, 3], ['a', 'b', 'c'])
pykx.List(pykx.q('
b b c
b a b
'))
Create a table from randomly generated data:
>>> N = 100000
>>> table = kx.Table(
... data = {'sym': kx.random.random(N, ['AAPL', 'MSFT']),
... 'price': kx.random.random(N, 100.0),
... 'size': 1+kx.random.random(N, 100)})
>>> table.head()
pykx.Table(pykx.q('
sym price size
------------------
MSFT 49.34749 50
MSFT 23.31342 96
AAPL 63.1368 36
AAPL 98.71169 7
AAPL 68.98055 94
'))
To retrieve current temporal information, call the date, time, and timestamp type objects as follows:
>>> kx.DateAtom('today')
pykx.DateAtom(pykx.q('2024.01.05'))
>>> kx.TimeAtom('now')
pykx.TimeAtom(pykx.q('16:22:12.178'))
>>> kx.TimestampAtom('now')
pykx.TimestampAtom(pykx.q('2024.01.05D16:22:21.012631000'))
1.c Evaluate q code using kx.q
If you're more familiar with q, generate PyKX objects by evaluating q code:
>>> kx.q('til 10')
pykx.LongVector(pykx.q('0 1 2 3 4 5 6 7 8 9'))
See the documentation guide on how to use kx.q.
1.d Retrieve a named entity from q's memory
Because PyKX objects exist in a memory space accessed and controlled through q, items created in q may not be immediately available as Python objects. For example, a named variable created in q, either explicitly or as a side effect of a function call, can be retrieved by its name:
>>> kx.q('t:([]5?1f;5?1f)') # Create the named table t explicitly
pykx.Identity(pykx.q('::'))
>>> kx.q('{k::5?1f;k*x}',2) # Generate a global variable k as a side effect
pykx.FloatVector(pykx.q('0.7855048 1.034182 1.031959 0.8133284 0.3561677'))
>>> kx.q['t']
pykx.Table(pykx.q('
x x1
--------------------
0.4931835 0.3017723
0.5785203 0.785033
0.08388858 0.5347096
0.1959907 0.7111716
0.375638 0.411597
'))
>>> kx.q['k']
pykx.FloatVector(pykx.q('0.3927524 0.5170911 0.5159796 0.4066642 0.1780839'))
1.e Query an external q session
PyKX provides an IPC interface allowing users to query and retrieve data from a q server. If you have a q server listening on port 5000 with no username/password required, you can run synchronous and asynchronous requests against it:
>>> conn = kx.SyncQConnection('localhost', 5000) # Open a synchronous connection to the q server
>>> conn('til 10') # Execute a command server side
pykx.LongVector(pykx.q('0 1 2 3 4 5 6 7 8 9'))
>>> conn['tab'] = kx.q('([]100?`a`b;100?1f;100?1f)') # Generate a table on the server
>>> conn.qsql.select('tab', where = 'x=`a') # Query using qsql statement
pykx.Table(pykx.q('
x x1 x2
-----------------------
a 0.481804 0.8112026
a 0.4301331 0.04881728
a 0.8664098 0.9006991
a 0.5281112 0.8505909
a 0.06494865 0.8196014
a 0.5464707 0.8187707
a 0.9601549 0.6919292
a 0.9256041 0.340393
a 0.8276669 0.9456963
a 0.5930176 0.8649262
a 0.4746581 0.3114364
a 0.8608133 0.8132478
a 0.01668426 0.274227
a 0.707851 0.2439194
a 0.7632325 0.6568734
a 0.927445 0.9625156
a 0.1247049 0.3714973
a 0.3992327 0.3550381
a 0.7263287 0.3615143
a 0.02810674 0.481821
..
'))
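A connection holds a network handle, so it is usually worth closing it once a query has finished. The following is a minimal sketch, assuming kx.SyncQConnection is used as a context manager and that a q server is listening on the given host and port. run_query is a hypothetical helper, and its connect parameter and the fake_connect stand-in exist only so the example can run without a server.

```python
from contextlib import contextmanager


def run_query(query, host="localhost", port=5000, connect=None):
    """Open a connection, run one synchronous query, then close the connection."""
    if connect is None:
        # Imported lazily so the sketch can be loaded without PyKX installed.
        import pykx as kx
        connect = kx.SyncQConnection
    with connect(host, port) as conn:
        return conn(query)


@contextmanager
def fake_connect(host, port):
    """Stand-in for a real connection: echoes the query instead of running it."""
    yield lambda query: f"{host}:{port} <- {query}"


print(run_query("til 10", connect=fake_connect))
```

Against a real server, run_query('til 10') would return a PyKX object in the same way as conn('til 10') above.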
2. Convert PyKX objects to Pythonic types
Converting data to a PyKX format allows for easy interaction with these objects using q or the analytic functionality provided by PyKX. However, this format may not be suitable for all use cases. For instance, if a function requires a pandas DataFrame as input, a PyKX object must first be converted to a pandas DataFrame.
Once the data is ready for use in Python, it may be more appropriate to convert it into a Python, NumPy, pandas, or PyArrow representation using the following methods:
Method | Description |
---|---|
*.py() | Convert a PyKX object to Python |
*.np() | Convert a PyKX object to NumPy |
*.pd() | Convert a PyKX object to pandas |
*.pa() | Convert a PyKX object to PyArrow |
Example
>>> import pykx as kx
>>> qarr = kx.q('til 5')
>>> qarr.py()
[0, 1, 2, 3, 4]
>>> qarr.np()
array([0, 1, 2, 3, 4])
>>> qarr.pd()
0 0
1 1
2 2
3 3
4 4
dtype: int64
>>> qarr.pa()
<pyarrow.lib.Int64Array object at 0x7ffabf2f4fa0>
[
0,
1,
2,
3,
4
]
>>>
>>> qtab = kx.Table(data={
... 'x': kx.random.random(5, 1.0),
... 'x1': kx.random.random(5, 1.0),
... })
>>> qtab
pykx.Table(pykx.q('
x x1
-------------------
0.439081 0.4707883
0.5759051 0.6346716
0.5919004 0.9672398
0.8481567 0.2306385
0.389056 0.949975
'))
>>> qtab.np()
rec.array([(0.43908099, 0.47078825), (0.57590514, 0.63467162),
(0.59190043, 0.96723983), (0.84815665, 0.23063848),
(0.38905602, 0.94997503)],
dtype=[('x', '<f8'), ('x1', '<f8')])
>>> qtab.pd()
x x1
0 0.439081 0.470788
1 0.575905 0.634672
2 0.591900 0.967240
3 0.848157 0.230638
4 0.389056 0.949975
>>> qtab.pa()
pyarrow.Table
x: double
x1: double
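When downstream code expects a pandas DataFrame, it can be convenient to convert at the boundary and leave other inputs alone. The sketch below uses a hypothetical helper, as_pandas (not part of PyKX), that calls .pd() when an object provides it. The FakeTable class is only a stand-in so the example runs without PyKX; a real PyKX table would return a DataFrame.

```python
def as_pandas(obj):
    """Return obj.pd() if obj offers a .pd() conversion, else obj unchanged."""
    convert = getattr(obj, "pd", None)
    return convert() if callable(convert) else obj


class FakeTable:
    """Stand-in for a PyKX table; a real one would return a pandas DataFrame."""

    def pd(self):
        return {"x": [1, 2], "y": ["a", "b"]}


print(as_pandas(FakeTable()))  # converted through .pd()
print(as_pandas([1, 2, 3]))    # no .pd() method, returned unchanged
```

The same shape would work for .py(), .np(), or .pa() by changing the attribute name.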
Precision loss considerations
Special care is needed when converting q temporal data to Python native data types. Python temporal data types support only microsecond precision, whereas q timestamps carry nanosecond precision, so round-trip conversions reduce the temporal granularity of q data.
>>> import pykx as kx
>>> qtime = kx.TimestampAtom('now')
>>> qtime
pykx.TimestampAtom(pykx.q('2024.01.05D03:16:23.736627552'))
>>> kx.toq(qtime.py())
pykx.TimestampAtom(pykx.q('2024.01.05D03:16:23.736627000'))
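The truncation comes from the Python side: datetime stores microseconds, so the last three digits of a nanosecond timestamp have nowhere to go. The standard-library sketch below reproduces the effect without PyKX, using integer nanoseconds since the Unix epoch as an assumed stand-in for a q timestamp. The helper names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ns_to_datetime(ns):
    """Convert nanoseconds since the epoch to a datetime (microsecond precision)."""
    micros = ns // 1_000  # sub-microsecond digits are discarded here
    return EPOCH + timedelta(microseconds=micros)


def datetime_to_ns(dt):
    """Convert a datetime back to integer nanoseconds since the epoch."""
    delta = dt - EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 1_000_000_000 + delta.microseconds * 1_000


original_ns = 1_704_424_583_736_627_552  # 2024-01-05 03:16:23.736627552 UTC
round_trip_ns = datetime_to_ns(ns_to_datetime(original_ns))

print(original_ns - round_trip_ns)  # nanoseconds lost in the round trip
```

If nanosecond precision matters, converting with .np() rather than .py() may be worth considering, since NumPy's datetime64 type can hold nanoseconds; treat this as an assumption to check against the temporal conversion documentation.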
See our Conversion considerations for temporal types section for further details.