{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Interface Overview\n", "The purpose of this notebook is to provide a demonstration of the capabilities of PyKX for users who are familiar with q.\n", "\n", "To follow along please download this notebook using the following 'link.'\n", "\n", "This demonstration will outline the following\n", "\n", "1. [Initializing the library](#initializing-the-library)\n", "2. [Generating q objects](#creating-q-objects-from-python-objects)\n", "3. [Converting q to Python](#converting-q-to-python)\n", "4. [Interacting with q objects](#k-object-properties-and-methods)\n", "5. [Context Interface](#context-interface)\n", "6. [Querying Interface](#querying-interface)\n", "7. [IPC communication](#ipc-communication)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initializing the library" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Non-PyKX Requirements\n", "\n", "For the purpose of this demonstration the following Python libraries/modules are required" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import shutil\n", "import sys\n", "from tempfile import mkdtemp\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import pyarrow as pa" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialization\n", "\n", "Once installed via pip, PyKX can be started by importing the module. This will initialize embedded q within the Python process if a valid q license is found (e.g. in `$QHOME` or `$QLIC`), or fall back to the unlicensed version if no such license is found. This notebook will use the licensed version of PyKX. To force the usage of the unlicensed version (and silence the warning that is raised when the fallback to the unlicensed version is employed) you can add `--unlicensed` to the environment variable `$QARGS`. 
`$QARGS` can be set to a string of arguments which will be used to initialize the embedded q instance, as if you had used those arguments to start q from the command line." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore') # Do not copy, as we are skipping symlinking pyKX to QHOME the core insights libraries will not be copied over and will raise warnings\n", "os.environ['IGNORE_QHOME'] = '1' # Ignore symlinking PyKX q libraries to QHOME \n", "os.environ['PYKX_Q_LOADED_MARKER'] = '' # Only used here for running Notebook under mkdocs-jupyter during document generation.\n", "import pykx as kx\n", "kx.q.system.console_size = [10, 80]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Evaluating q code using embedded q" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('1+1')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('1 2 3 4f')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('([]2?1f;2?0Ng;2?0b)')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('`a`b`c!(til 10;`a`b`c;5?\"abc\")')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating q objects from Python objects\n", "\n", "One of the strengths of the PyKX interface is the flexibility in the representations of objects that can be converted from a native Python representation to a q equivalent.\n", "\n", "By default data formatted in Python using the following libraries can be converted to a q equivalent representation.\n", "\n", "* python native types\n", "* numpy\n", "* pandas\n", "* pyarrow\n", "\n", "These are all facilitated through use of the `K` method of the base `q` class shown before as follows\n", "\n", "#### Atomic Structures" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "pyAtomic = 1.5\n", "npAtomic = np.float64(1.5)\n", "pdAtomic = pd.Series([1.5])\n", "paAtomic = pa.array([1.5])\n", "\n", "print(kx.K(pyAtomic))\n", "# print(kx.K(npAtomic))\n", "# print(kx.K(pdAtomic))\n", "# print(kx.K(paAtomic))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Array/Series Structures" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pyArray = [1, 2.5, \"abc\", b'defg']\n", "npArray = np.array([1, 2.5, \"abc\", b'defg'], dtype = object)\n", "pdSeries = pd.Series([pyArray])\n", "paArray = pa.array([1, 2, 3])\n", "\n", "print(kx.K(pyArray))\n", "# print(kx.K(npArray))\n", "# print(kx.K(pdSeries))\n", "# print(kx.K(paArray))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Tabular data\n", "Round trip support for tabular data is presently supported for Pandas Dataframes and PyArrow tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pdtable = pd.DataFrame({'col1': [1, 2],\n", " 'col2': [2., 3.],\n", " 'col3': ['Hello', 'World']})\n", "patable = pa.Table.from_pandas(pdtable)\n", "\n", "display(kx.K(pdtable))\n", "# display(kx.K(patable))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Converting q to Python\n", "All K objects support one or more of the following methods: `py()`, `np()`, `pd()` or `pa()`\n", "\n", "These methods provide an interface to the K object such that they can be converted to an analogous Python, Numpy, Pandas or PyArrow object respectively. \n", "\n", "Whether the view is a copy or not varies:\n", "\n", "1. The 'py' property always provides a copy.\n", "2. The 'np' property does not copy unless the data cannot be interpreted by Numpy properly without changing it. 
For example, all temporal types in Numpy take 64 bits per item, so the 32 bit q temporal types must be copied to be represented as Numpy 'datetime64'/'timedelta64' elements. In cases where copying is unacceptable, the raw keyword argument can be set to true as demonstrated below.\n", "3. The 'pd' property leverages the 'np' property to create Pandas objects, as such the same restrictions apply to it.\n", "4. The 'pa' property leverages the 'pd' property to create PyArrow objects, as such the same restrictions apply to it.\n", "\n", "### Atomic Conversions\n", "Define q items for conversion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qbool = kx.q('0b')\n", "qguid = kx.q('\"G\"$\"00000000-0000-0000-0000-000000000001\"')\n", "qreal = kx.q('1.5e')\n", "qlong = kx.q('1234')\n", "qsymb = kx.q('`test')\n", "qchar = kx.q('\"x\"')\n", "qtime = kx.q('00:00:01')\n", "qtstamp = kx.q('rand 0p')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the above items to a variety of the Python types. Change the method used to experiment as necessary" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(qbool.py())\n", "print(qguid.pd())\n", "print(qreal.np())\n", "print(qlong.pa())\n", "print(qsymb.py())\n", "print(qchar.np())\n", "print(qtime.pd())\n", "print(qtstamp.np())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vector Conversions\n", "Define q items for conversion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qbool = kx.q('2?0b')\n", "qguid = kx.q('2?0Ng')\n", "qreal = kx.q('2?5e')\n", "qlong = kx.q('2?100')\n", "qsymb = kx.q('2?`4')\n", "qchar = kx.q('\"testing\"')\n", "qtime = kx.q('2?0t')\n", "qtstamp = kx.q('2?0p')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the above items to a variety of the Python types. 
Change the method used to experiment as necessary" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(qbool.py())\n", "print(qguid.pd())\n", "print(qreal.np())\n", "print(qlong.pa())\n", "print(qsymb.py())\n", "print(qchar.np())\n", "print(qtime.pd())\n", "print(qtstamp.np())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dictionary conversions\n", "Conversions from q dictionaries to Python are only supported through the `py()` method. Numpy, Pandas and PyArrow do not have appropriate equivalent representations and as such are not supported." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qdict=kx.q('`x`y`z!(10?10e;10?0Ng;4?`2)')\n", "qdict.py()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Table conversions\n", "Conversions of q keyed and unkeyed tables to an appropriate Python representation are supported through the `py()`, `np()`, `pd()` and `pa()` methods.\n", "\n", "Round trip conversions `q -> Python -> q` are, however, only supported for Pandas and PyArrow. 
Conversions from Numpy records are still to be completed. The most natural representation for a table in native Python is a dictionary, so the conversion from Python to q returns a q dictionary rather than a table.\n", "\n", "Define a q table containing all q data types for conversion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('N:5')\n", "kx.q('gen_data:{@[;0;string]x#/:prd[x]?/:(`6;`6;0Ng;.Q.a),(\"xpdmnuvtbhijef\"$\\:0)}') # noqa\n", "kx.q('dset_1D:gen_data[enlist N]')\n", "kx.q('gen_names:{\"dset_\",/:x,/:string til count y}')\n", "\n", "qtab = kx.q('flip (`$gen_names[\"tab\";dset_1D])!N#\\'dset_1D') " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the above table to a pandas dataframe and pyarrow table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(qtab.pd())\n", "display(qtab.pa())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## K Object Properties and Methods\n", "\n", "### Miscellaneous Methods\n", "\n", "All K objects support the following methods/properties: \n", "\n", "| Method/Property | Description |\n", "|:----------------|:------------|\n", "| `t` | Return the q numeric datatype |\n", "| `is_atom` | Is the item a q atomic type? 
|" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "str(kx.q('([] til 3; `a`b`c)'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "repr(kx.q('\"this is a char vector\"'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('`atom').is_atom" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('`not`atom').is_atom" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(kx.q('([]10?1f;10?1f)').t)\n", "print(kx.q('`a`b`c!1 2 3').t)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# q list\n", "qlist = kx.q('(1 2 3;1;\"abc\")')\n", "list(qlist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the difference between this and the conversion of the same `qlist` to a true Python representation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qlist.py()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Numerical comparisons/functions\n", "Various q datatypes vectors/atoms/tables can also interact with native Python mathematical comparisons and functions, the following provides an outline of a subset of the comparisons/functions that are supported:\n", "\n", "| Function | Description |\n", "|:---------|:------------|\n", "| `abs` | Absolute value of a number |\n", "| `<` | Less than |\n", "| `>=` | Greater than or equal to |\n", "| `+` | Addition |\n", "| `-` | Subtraction |\n", "| `/` | Division |\n", "| `*` | Multiplication |\n", "| `**` | Power |\n", "| `%` | Modulo | \n", "\n", "#### Define q/Python atoms and lists for comparisons" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qlong = kx.q('-5')\n", "pylong = 5\n", "qlist = kx.q('-3+til 5')\n", 
"pylist = [1, 2, 3, 4, 5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Apply a number of the above comparisons/functions to python/q objects in combination" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(abs(qlong))\n", "print(abs(qlist))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(qlong>pylong)\n", "print(pylist>qlist)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(qlong*pylong)\n", "print(pylist*qlist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The `raw` q -> Python conversion keyword argument" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All of the interfaces to the K objects support the `raw` keyword argument. When the `raw` keyword argument is set to `True` the interface forgoes some of the features when converting the object in exchange for greater efficiency." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tab = kx.q('([]10?1f;10?1f;10?0p;10?0Ng)')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tab.pd()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tab.pd(raw=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qvec = kx.q('10?0t')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qvec.np()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qvec.np(raw=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Editing K objects\n", "One of the expected aspects of interacting with Python objects natively is being able to index, slice, compare and modify the objects when it is reasonable to do so.\n", "\n", "The following sections show the interaction of a user with a q vector and table\n", "\n", "#### Vectors" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v = kx.q('12?100')\n", "print(v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the element at index 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v[2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Retrieve a slice containing elements 3-5" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v[3:6]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare all vector elements to 50" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v < 50" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Tables\n", "\n", "This only applies to in-memory tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "tab = kx.q('([]4?5;4?`2;4?0p;4?0Ng)')\n", "tab.pd()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tab['x1']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tab['x2'].py()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Splayed and Partitioned Tables\n", "\n", "Splayed and partitioned tables are at present only partially supported. Users can query the data and access column information through the `keys` method, but cannot retrieve the values contained within the data or convert it to an analogous Python representation. Attempting either will raise a `NotImplementedError`.\n", "\n", "Research on this is still pending, and any changes to support these conversions will be reflected in an update here.\n", "\n", "#### Splayed Tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmp_dir = mkdtemp()\n", "orig_dir = os.getcwd()\n", "os.chdir(tmp_dir)\n", "kx.q('`:db/t/ set ([] a:til 3; b:\"xyz\"; c:-3?0Ng)')\n", "kx.q(r'\\l db')\n", "t_splayed = kx.q('t')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List the columns that are represented in the splayed table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(t_splayed.keys())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Query the splayed table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('?[`t;enlist(=;`a;1);0b;()]')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Attempt to evaluate the values method on the table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", " t_splayed.values()\n", "except NotImplementedError:\n", " print('NotImplementedError was raised', file=sys.stderr)" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "os.chdir(orig_dir)\n", "shutil.rmtree(tmp_dir)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Partitioned Tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmp_dir = mkdtemp()\n", "orig_dir = os.getcwd()\n", "os.chdir(tmp_dir)\n", "kx.q('`:db/2020.01/t/ set ([] a:til 3; b:\"xyz\"; c:-3?0Ng)')\n", "kx.q('`:db/2020.02/t/ set ([] a:1+til 3; b:\"cat\"; c:-3?0Ng)')\n", "kx.q('`:db/2020.03/t/ set ([] a:2+til 3; b:\"bat\"; c:-3?0Ng)')\n", "kx.q(r'\\l db')\n", "t_partitioned = kx.q('t')\n", "t_partitioned" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List partitioned table columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(t_partitioned.keys())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Query partitioned table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('?[`t;enlist(=;`a;1);0b;enlist[`c]!enlist`c]')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Attempt to convert partitioned table to a pandas dataframe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", " t_partitioned.pd()\n", "except NotImplementedError:\n", " pass" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "os.chdir(orig_dir)\n", "shutil.rmtree(tmp_dir)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### q Functions\n", "\n", "All functions defined in q can be called from PyKX via function objects. These function calls can take Python or q objects as input arguments. 
Each argument supplied to the function must be convertible to a q representation using `kx.K(arg)`.\n", "\n", "Arguments can be provided either positionally, or as keyword arguments when the q function has named parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f = kx.q('{x*y+z}')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(12, 2, 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(12, 2, 1).py()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g = kx.q('{[arg1;arg2] deltas sum each arg1 cross til arg2}')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g(arg2=7, arg1=kx.q('3?45')).np()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tok = kx.q(\"$'\")\n", "print(repr(tok))\n", "print(str(tok))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tok(kx.q('\"B\"'), kx.q('\" \",.Q.an')).np()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Context Interface\n", "\n", "The context interface provides a convenient way to interact with q contexts and namespaces using either the embedded q instance `pykx.q` or an IPC connection made with `pykx.QConnection`.\n", "\n", "Accessing an attribute which is not defined via the context interface, but which corresponds to a script (i.e. a `.q` or `.k` file), will cause it to be loaded automatically. Scripts are searched for if they are:\n", "1. In the same directory as the process running PyKX\n", "2. In `QHOME`\n", "\n", "Other paths can be searched by appending them to `kx.q.paths`. 
Alternatively, you can manually load a script with `kx.q.ctx._register`.\n", "\n", "Functions which are registered via the context interface are automatically added as callable members of their `QContext`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Builtin namespaces\n", "\n", "As a result of the infrastructure outlined above there are a number of namespaces which are automatically added as extensions to the q base class on loading. This includes the `.q`, `.z`, `.Q` and `.j` namespaces contained within `kx.q.k`, the following provides some example invocations of each.\n", "\n", "A number of the functions contained within the .z namespace are not callable, including but not limited to the following:\n", "\n", "- .z.ts\n", "- .z.ex\n", "- .z.ey\n", "\n", "Run `dir(kx.q.z)` to see what is available in the `.z` namespace.\n", "\n", "#### .q functionality\n", "All the functions a user would expect to be exposed from q are callable as python methods off the q base class, the following provides a limited number of example invocations" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(kx.q.til(10))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(kx.q.max([100, 2, 3, -4]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(kx.q.mavg(4, kx.q.til(10)))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(kx.q.tables())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s = kx.q('([]a:1 2;b:2 3;c:5 7)')\n", "s" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t = kx.q('([]a:1 2 3;b:2 3 7;c:10 20 30;d:\"ABC\")').pd()\n", "t" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.uj(s,t)" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "### `.Q` namespace\n", "The functions within the `.Q` namespace are also exposed as an extension.\n", "\n", "**Note**: While all functions within the `.Q` namespace are available, compared to the `.q`/`.z` namespaces these functions can be complicated to implement within the constraints of the PyKX interface for example `.Q.dpft` can be implemented but requires some thought" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.Q" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.Q.an" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.Q.btoa(b'Hello World!')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t = kx.q('([]a:3 4 5;b:\"abc\";c:(2;3.4 3.2;\"ab\"))')\n", "kx.q.each(kx.q.Q.ty, t['a','b','c'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `.j` namespace" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "json = b'{\"x\":1, \"y\":\"test\"}'\n", "qdict = kx.q.j.k(json)\n", "print(qdict)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.j.j(qdict).py()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### User defined extensions\n", "As alluded to above users can add their own extension modules to PyKX by placing a relevant `.q`/`.k` to their `$QHOME`. 
The following shows the addition of an extension which completes a specific query and sets some data that we would like to be available.\n", "\n", "#### Extension Example\n", "In the following example we will create (and later delete) the file '$QHOME/demo_extension.q'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "demo_extension_source = '''\n", "\\d .demo_extension\n", "N:100\n", "test_data:([]N?`a`b`c;N?1f;N?10;N?0b)\n", "test_function:{[data]\n", " analytic_keys :`max_x1`avg_x2`med_x3;\n", " analytic_calcs:(\n", " (max;`x1);\n", " (avg;`x2);\n", " (med;`x3));\n", " ?[data;\n", " ();\n", " k!k:enlist `x;\n", " analytic_keys!analytic_calcs\n", " ]\n", " }\n", "'''\n", "demo_extension_filename = kx.qhome/'demo_extension.q'\n", "with open(demo_extension_filename, 'w') as f:\n", " f.write(demo_extension_source)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.demo_extension.test_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.demo_extension.test_function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.demo_extension.test_function(kx.q.demo_extension.test_data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "os.remove(demo_extension_filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "--- " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Querying Interface\n", "\n", "One of the core purposes of this module is to provide users who are unfamiliar with q with a Pythonic approach to interacting with q objects.\n", "\n", "One of the ways this is intended to be achieved is to provide Pythonic wrappers around common q tasks in a way that feels familiar to a Python developer but is still efficient/flexible.\n", "\n", "The querying interface is an example of this. 
It provides a wrapper around the q functional select syntax to facilitate the querying of persisted and local data while also allowing Python objects to be used as inputs where it is relevant.\n", "\n", "### help is provided\n", "Users can use the Python `help` function to display the docstring associated with each of the functions within the `query` module" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# help(kx.q.qsql)\n", "# help(kx.q.qsql.select)\n", "# help(kx.q.qsql.exec)\n", "# help(kx.q.qsql.update)\n", "# help(kx.q.qsql.delete)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Select functionality\n", "The select functionality is provided both as an individually callable function or as a method off all tabular data.\n", "\n", "Generate a table and assign the Python object as a named entity within the q memory space." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qtab = kx.q('([]col1:100?`a`b`c;col2:100?1f;col3:100?5)')\n", "kx.q['qtab'] = qtab" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Retrieve the entirety of the table using an empty select" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.select(qtab)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Retrieve the entire table using the module function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.select(qtab)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Retrieve the entire table based on a named reference\n", "\n", "This is important because it provides a method of querying partitioned/splayed tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.select('qtab')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The where keyword**\n", "\n", "Where 
clauses can be provided as a named keyword and are expected to be formatted as an individual string or a list of strings as in the following examples.\n", "\n", "By default no where conditions are applied to a select query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# kx.q.qsql.select(qtab, where='col1=`a')\n", "kx.q.qsql.select(qtab, where=['col3<0.5', 'col2>0.7'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The columns keyword**\n", "\n", "The columns keyword is used to apply analytics to specific columns of the data or to select and rename columns within the dataset.\n", "\n", "By default if a user does not provide this information it is assumed that all columns are to be returned without modification.\n", "\n", "The columns keyword is expected to be a dictionary mapping the name that the new table will display for the column to the logic with which this data is modified." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.select(qtab, columns={'col1': 'col1','newname': 'col2'})" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.select(qtab, columns={'max_col2': 'max col2'}, where='col1=`a')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The by keyword**\n", "\n", "The by keyword is used to apply analytics to data grouped by common characteristics.\n", "\n", "By default if a user does not provide this information it is assumed that no grouping is applied.\n", "\n", "The by keyword is expected to be a dictionary mapping the name to be applied to the by clause of the grouping to the column of the original table which is being used for the grouping." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.select(\n", " qtab,\n", " columns={'minCol2': 'min col2', 'medCol3': 'med col3'},\n", " by={'groupCol1': 'col1'},\n", " where=['col3<0.5', 'col2>0.7']\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Delete functionality\n", "The delete functionality is provided both as an individually callable function and as a method off all tabular data. \n", "\n", "The following provides an outline of how this can be invoked in both cases.\n", "\n", "**Note**: By default the delete functionality **does not** modify the underlying representation of the table. This is possible under limited circumstances, as outlined in a later section below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.delete(qtab)\n", "kx.q.qsql.delete('qtab')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The columns keyword**\n", "\n", "The columns keyword is used to denote the columns that are to be deleted from a table.\n", "\n", "By default if a user does not provide this information it is assumed that all columns are to be deleted.\n", "\n", "The columns keyword is expected to be a string or list of strings denoting the columns to be deleted.\n", "\n", "**Note**: The columns and where clauses cannot be used in the same function call, as this is not supported by the underlying functional delete." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# kx.q.qsql.delete(qtab, columns = 'col3')\n", "kx.q.qsql.delete(qtab, columns = ['col1','col2'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The where keyword**\n", "\n", "The where keyword is used to filter rows of the data to be deleted.\n", "\n", "By default if no where condition is supplied it is assumed that all rows of the dataset are to be deleted.\n", "\n", "When supplied, the where keyword is expected to be a string defining the filter to apply.\n", "\n", "**Note**: The columns and where clauses cannot be used in the same function call, as this is not supported by the underlying functional delete." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.delete(qtab, where='col1 in `a`b')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The modify keyword**\n", "\n", "The modify keyword is used when the user intends for the underlying representation of a named entity within the q memory space to be modified. This is only applicable when calling the function via the `kx.q.qsql.delete` representation of the function.\n", "\n", "By default `modify=False` and the underlying representation is not modified. To change the underlying representation a user must set `modify=True`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q.qsql.delete('qtab', where = 'col1=`c', modify=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kx.q('qtab')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Update and exec functionality\n", "\n", "Both the q functional update and exec functionality are supported by this interface. 
For brevity they are not shown in the same detail as the previous examples." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# kx.q.qsql.exec(qtab, 'col1')\n", "# kx.q.qsql.exec(qtab, columns='col2', by='col1')\n", "kx.q.qsql.exec(qtab, columns={'avgCol3': 'avg col3'}, by='col1')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# kx.q.qsql.update(qtab, {'avg_col2':'avg col2'}, by={'col1': 'col1'})\n", "# kx.q.qsql.update(qtab, {'col3':100}, where='col1=`a')\n", "kx.q.qsql.update('qtab', {'col2': 4.2}, 'col1=`b', modify=True)\n", "kx.q['qtab']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## IPC Communication\n", "\n", "This module also provides users with the ability to retrieve data from remote q processes. This is supported both with and without a valid q license.\n", "\n", "More documentation, including exhaustive lists of the available functionality, can be found in the [`IPC`](../api/ipc.html) documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Establishing a Connection\n", "Connections to external q processes are established using the `pykx.QConnection` class. On initialization, an instance of this class establishes a connection to the specified q process using the provided connection information (e.g. `host`, `port`, `username`, `password`, etc.). 
Refer to the PyKX IPC module documentation for more details about this interface, or run `help(pykx.QConnection)`.\n", "\n", "### IPC Example\n", "The following is a basic example of this functionality; a more complex subscriber/publisher example is provided in `examples/ipc/`.\n", "\n", "This example will work in the presence or absence of a valid q license.\n", "\n", "#### Create the external q process\n", "To run this example, the Python code in the following cell does the equivalent of executing the following in a terminal:\n", "\n", "```\n", "$ q -p 5000\n", "q)tab:([]100?`a`b`c;100?1f;100?0Ng)\n", "q).z.ps:{[x]0N!(`.z.ps;x);value x}\n", "q).z.pg:{[x]0N!(`.z.pg;x);value x}\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess\n", "import time\n", "proc = subprocess.Popen(\n", " ('q', '-p', '5000'),\n", " stdin=subprocess.PIPE,\n", " stdout=subprocess.DEVNULL,\n", " stderr=subprocess.DEVNULL,\n", ")\n", "proc.stdin.write(b'tab:([]100?`a`b`c;100?1f;100?0Ng)\\n')\n", "proc.stdin.write(b'.z.ps:{[x]0N!(`.z.ps;x);value x}\\n')\n", "proc.stdin.write(b'.z.pg:{[x]0N!(`.z.pg;x);value x}\\n')\n", "proc.stdin.flush()\n", "time.sleep(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Open a connection to this process" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Normally a `with` block would be used for proper context management, but for the sake of this example the connection will be accessed and closed directly\n", "conn = kx.QConnection('localhost', 5000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Make a simple synchronous request" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qvec = conn('2+til 2')\n", "qvec" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Make a simple asynchronous request" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn('setVec::10?1f', wait=False)\n", "setVec = conn('setVec')\n", "setVec" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Run a defined function server side with provided arguments" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pytab = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})\n", "conn('{[table;column;rows]rows#column#table}', pytab, ['col1'], 1).pd()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn('{[table;column]newtab::table column}', pytab, 'col1', wait=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn('newtab').np()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Disconnect from the q process" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conn.close()\n", "# This happens automatically when you leave a `with` block that is managing a connection, or when a connection is garbage-collected." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Shutdown the q process we were connected to for the IPC demo\n", "proc.stdin.close()\n", "proc.kill()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] } ], "metadata": { "file_extension": ".py()", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" }, "mimetype": "text/x-python", "name": "python", "npconvert_exporter": "python", "pygments_lexer": "ipython3", "version": 3 }, "nbformat": 4, "nbformat_minor": 2 }