# Interface Overview
The purpose of this notebook is to provide a demonstration of the capabilities of PyKX for users who are familiar with q.

To follow along, download this notebook using this <a href="./interface_overview.ipynb" download>link</a>.

This demonstration will outline the following:

1. [Initializing the library](#initializing-the-library)
2. [Generating q objects](#creating-q-objects-from-python-objects)
3. [Converting q to Python](#converting-q-to-python)
4. [Interacting with q objects](#k-object-properties-and-methods)
5. [Context Interface](#context-interface)
6. [Querying Interface](#querying-interface)
7. [IPC communication](#ipc-communication)


## Initializing the library

### Non-PyKX Requirements

For the purpose of this demonstration, the following Python libraries/modules are required:

In [None]:
import os
import shutil
import sys
from tempfile import mkdtemp

import numpy as np
import pandas as pd
import pyarrow as pa

### Initialization

Once installed via pip, PyKX can be started by importing the module. This will initialize embedded q within the Python process if a valid q license is found (e.g. in `$QHOME` or `$QLIC`), or fall back to the unlicensed version if no such license is found. This notebook will use the licensed version of PyKX. To force the usage of the unlicensed version (and silence the warning that is raised when the fallback to the unlicensed version is employed) you can add `--unlicensed` to the environment variable `$QARGS`. `$QARGS` can be set to a string of arguments which will be used to initialize the embedded q instance, as if you had used those arguments to start q from the command line.
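
As a minimal sketch of the `$QARGS` approach described above, the snippet below appends `--unlicensed` to any arguments already present. The helper name `add_qarg` is hypothetical (it is not part of PyKX), the `-q` value is only an example, and the variable would need to be set before the first `import pykx` for it to take effect.

```python
import os


def add_qarg(flag, env=None):
    """Append `flag` to the QARGS variable in `env` unless it is already present."""
    env = os.environ if env is None else env
    args = env.get('QARGS', '').split()
    if flag not in args:
        args.append(flag)
    env['QARGS'] = ' '.join(args)
    return env['QARGS']


# Demonstrated on a plain dict so the real environment is left untouched.
demo_env = {'QARGS': '-q'}
add_qarg('--unlicensed', demo_env)
print(demo_env['QARGS'])

# In a real session this would run before the first `import pykx`:
# add_qarg('--unlicensed')
# import pykx as kx
```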

In [None]:
import warnings
warnings.filterwarnings('ignore') # Do not copy this line: because symlinking PyKX to QHOME is skipped below, the core insights libraries are not copied over, which raises warnings
os.environ['IGNORE_QHOME'] = '1' # Ignore symlinking PyKX q libraries to QHOME 
os.environ['PYKX_Q_LOADED_MARKER'] = '' # Only used here for running Notebook under mkdocs-jupyter during document generation.
import pykx as kx
kx.q.system.console_size = [10, 80]

### Evaluating q code using embedded q

In [None]:
kx.q('1+1')

In [None]:
kx.q('1 2 3 4f')

In [None]:
kx.q('([]2?1f;2?0Ng;2?0b)')

In [None]:
kx.q('`a`b`c!(til 10;`a`b`c;5?"abc")')

## Creating q objects from Python objects

One of the strengths of the PyKX interface is the flexibility in the representations of objects that can be converted from a native Python representation to a q equivalent.

By default data formatted in Python using the following libraries can be converted to a q equivalent representation.

* Python native types
* NumPy
* pandas
* PyArrow

These conversions are all performed by passing the Python object to `kx.K`, as follows:

#### Atomic Structures

In [None]:
pyAtomic = 1.5
npAtomic = np.float64(1.5)
pdAtomic = pd.Series([1.5])
paAtomic = pa.array([1.5])

print(kx.K(pyAtomic))
# print(kx.K(npAtomic))
# print(kx.K(pdAtomic))
# print(kx.K(paAtomic))

#### Array/Series Structures

In [None]:
pyArray = [1, 2.5, "abc", b'defg']
npArray = np.array([1, 2.5, "abc", b'defg'], dtype = object)
pdSeries = pd.Series([pyArray])
paArray = pa.array([1, 2, 3])

print(kx.K(pyArray))
# print(kx.K(npArray))
# print(kx.K(pdSeries))
# print(kx.K(paArray))

#### Tabular data
Round-trip conversion of tabular data is presently supported for pandas DataFrames and PyArrow tables.

In [None]:
pdtable = pd.DataFrame({'col1': [1, 2],
                        'col2': [2., 3.],
                        'col3': ['Hello', 'World']})
patable = pa.Table.from_pandas(pdtable)

display(kx.K(pdtable))
# display(kx.K(patable))

---

## Converting q to Python
All K objects support one or more of the following methods: `py()`, `np()`, `pd()` or `pa()`

These methods provide an interface to the K object such that they can be converted to an analogous Python, Numpy, Pandas or PyArrow object respectively. 

Whether the result is a copy or a view of the underlying data varies:

1. The `py()` method always provides a copy.
2. The `np()` method does not copy unless the data cannot be interpreted by NumPy properly without changing it. For example, all temporal types in NumPy take 64 bits per item, so the 32-bit q temporal types must be copied to be represented as NumPy `datetime64`/`timedelta64` elements. In cases where copying is unacceptable, the `raw` keyword argument can be set to `True`, as demonstrated below.
3. The `pd()` method leverages the `np()` method to create pandas objects, so the same restrictions apply to it.
4. The `pa()` method leverages the `pd()` method to create PyArrow objects, so the same restrictions apply to it.
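
The copy/no-copy distinction in item 2 can be illustrated without q at all. The sketch below uses only the standard library's `array` and `memoryview` as stand-ins (they are not what PyKX uses internally): a view of the same width shares memory with the source, while widening the items to 64 bits necessarily allocates a new buffer.

```python
from array import array

# Stand-in for a 32-bit q temporal vector (e.g. milliseconds since midnight).
# The 'i' typecode is a C int, which is 32 bits on common platforms.
times_32 = array('i', [1000, 2000, 3000])

# A view of the same width shares the underlying buffer: no copy is made.
shared = memoryview(times_32)

# Widening to 64-bit items requires a new buffer, so the data is copied.
times_64 = array('q', times_32)

times_32[0] = 5000          # mutate the original
print(shared[0])            # the view sees the change
print(times_64[0])          # the widened copy does not
print(times_32.itemsize, times_64.itemsize)
```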

### Atomic Conversions
Define q items for conversion

In [None]:
qbool = kx.q('0b')
qguid = kx.q('"G"$"00000000-0000-0000-0000-000000000001"')
qreal = kx.q('1.5e')
qlong = kx.q('1234')
qsymb = kx.q('`test')
qchar = kx.q('"x"')
qtime = kx.q('00:00:01')
qtstamp = kx.q('rand 0p')

Convert the above items to a variety of Python types. Change the method used to experiment as necessary.

In [None]:
print(qbool.py())
print(qguid.pd())
print(qreal.np())
print(qlong.pa())
print(qsymb.py())
print(qchar.np())
print(qtime.pd())
print(qtstamp.np())

### Vector Conversions
Define q items for conversion

In [None]:
qbool = kx.q('2?0b')
qguid = kx.q('2?0Ng')
qreal = kx.q('2?5e')
qlong = kx.q('2?100')
qsymb = kx.q('2?`4')
qchar = kx.q('"testing"')
qtime = kx.q('2?0t')
qtstamp = kx.q('2?0p')

Convert the above items to a variety of Python types. Change the method used to experiment as necessary.

In [None]:
print(qbool.py())
print(qguid.pd())
print(qreal.np())
print(qlong.pa())
print(qsymb.py())
print(qchar.np())
print(qtime.pd())
print(qtstamp.np())

### Dictionary conversions
Conversions from q dictionaries to Python are only supported by the `py()` method. NumPy, pandas and PyArrow do not have appropriate equivalent representations and as such are not supported.

In [None]:
qdict=kx.q('`x`y`z!(10?10e;10?0Ng;4?`2)')
qdict.py()

### Table conversions
Conversions between q keyed and unkeyed tables to an appropriate Python representation are supported for the `py()`, `np()`, `pd()` and `pa()` methods.

Round-trip conversions `q -> Python -> q` are, however, only supported for pandas and PyArrow. Conversions from NumPy records are still to be completed. The most natural representation of a table in native Python is a dictionary, so converting that representation back to q returns a q dictionary rather than a table.

Define a q table containing all q data types for conversion

In [None]:
kx.q('N:5')
kx.q('gen_data:{@[;0;string]x#/:prd[x]?/:(`6;`6;0Ng;.Q.a),("xpdmnuvtbhijef"$\:0)}') # noqa
kx.q('dset_1D:gen_data[enlist N]')
kx.q('gen_names:{"dset_",/:x,/:string til count y}')

qtab = kx.q('flip (`$gen_names["tab";dset_1D])!N#\'dset_1D') 

Convert the above table to a pandas dataframe and pyarrow table

In [None]:
display(qtab.pd())
display(qtab.pa())

---

## K Object Properties and Methods

### Miscellaneous Methods

All K objects support the following methods/properties: 

| Method/Property | Description |
|:----------------|:------------|
| `t`             | Return the q numeric datatype |
| `is_atom`       | Is the item a q atomic type? |

In [None]:
str(kx.q('([] til 3; `a`b`c)'))

In [None]:
repr(kx.q('"this is a char vector"'))

In [None]:
kx.q('`atom').is_atom

In [None]:
kx.q('`not`atom').is_atom

In [None]:
print(kx.q('([]10?1f;10?1f)').t)
print(kx.q('`a`b`c!1 2 3').t)

In [None]:
# q list
qlist = kx.q('(1 2 3;1;"abc")')
list(qlist)

Note the difference between this and the conversion of the same `qlist` to a true Python representation

In [None]:
qlist.py()

### Numerical comparisons/functions
Various q datatypes (vectors, atoms and tables) can also interact with native Python mathematical comparisons and functions. The following table outlines a subset of the comparisons and functions that are supported:

| Function | Description |
|:---------|:------------|
| `abs`    | Absolute value of a number |
| `<`      | Less than |
| `>=`     | Greater than or equal to |
| `+`      | Addition |
| `-`      | Subtraction |
| `/`      | Division |
| `*`      | Multiplication |
| `**`     | Power |
| `%`      | Modulo | 

#### Define q/Python atoms and lists for comparisons

In [None]:
qlong = kx.q('-5')
pylong = 5
qlist = kx.q('-3+til 5')
pylist = [1, 2, 3, 4, 5]

#### Apply a number of the above comparisons/functions to python/q objects in combination

In [None]:
print(abs(qlong))
print(abs(qlist))

In [None]:
print(qlong>pylong)
print(pylist>qlist)

In [None]:
print(qlong*pylong)
print(pylist*qlist)

### The `raw` q -> Python conversion keyword argument

All of the interfaces to the K objects support the `raw` keyword argument. When the `raw` keyword argument is set to `True` the interface forgoes some of the features when converting the object in exchange for greater efficiency.

In [None]:
tab = kx.q('([]10?1f;10?1f;10?0p;10?0Ng)')

In [None]:
tab.pd()

In [None]:
tab.pd(raw=True)

In [None]:
qvec = kx.q('10?0t')

In [None]:
qvec.np()

In [None]:
qvec.np(raw=True)
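
To show what a skipped conversion step involves, the sketch below turns raw 64-bit integers into Python datetimes by hand. It assumes that `raw=True` leaves timestamps as nanosecond counts since the q epoch of 2000-01-01; the function name `q_timestamp_to_datetime` is hypothetical and not part of PyKX.

```python
from datetime import datetime, timedelta

Q_EPOCH = datetime(2000, 1, 1)  # q temporal values count from this date


def q_timestamp_to_datetime(raw_ns):
    """Convert nanoseconds since the q epoch to a datetime (microsecond precision)."""
    return Q_EPOCH + timedelta(microseconds=raw_ns // 1000)


one_day_ns = 24 * 60 * 60 * 10**9
print(q_timestamp_to_datetime(0))
print(q_timestamp_to_datetime(one_day_ns))
```

Doing this arithmetic for every element is the kind of work that the default, non-raw conversion performs on the user's behalf.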

### Editing K objects
One of the expected aspects of interacting with Python objects natively is being able to index, slice, compare and modify the objects when it is reasonable to do so.

The following sections show the interaction of a user with a q vector and table

#### Vectors

In [None]:
v = kx.q('12?100')
print(v)

Get the element at index 2

In [None]:
v[2]

Retrieve a slice containing elements 3-5

In [None]:
v[3:6]

Compare all vector elements to 50

In [None]:
v < 50

#### Tables

This only applies to in-memory tables

In [None]:
tab = kx.q('([]4?5;4?`2;4?0p;4?0Ng)')
tab.pd()

In [None]:
tab['x1']

In [None]:
tab['x2'].py()

### Splayed and Partitioned Tables

Splayed and Partitioned tables are at present only partially supported. Users will be able to query the data and access information around the columns through the `keys` method but will not be able to retrieve the values contained within the data or convert to an analogous Python representation. These will raise a `NotImplementedError`.

Research on this is still pending, and this section will be updated when changes to support these conversions are made.

#### Splayed Tables

In [None]:
tmp_dir = mkdtemp()
orig_dir = os.getcwd()
os.chdir(tmp_dir)
kx.q('`:db/t/ set ([] a:til 3; b:"xyz"; c:-3?0Ng)')
kx.q(r'\l db')
t_splayed = kx.q('t')

List the columns that are represented in the splayed table

In [None]:
list(t_splayed.keys())

Query the Splayed table

In [None]:
kx.q('?[`t;enlist(=;`a;1);0b;()]')

Attempt to evaluate the values method on the table

In [None]:
try:
    t_splayed.values()
except NotImplementedError:
    print('NotImplementedError was raised', file=sys.stderr)

In [None]:
os.chdir(orig_dir)
shutil.rmtree(tmp_dir)
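
The setup and cleanup cells above can be folded into a single context manager so that the working directory is restored and the temporary directory removed even if a step in between raises. This is a sketch using only the standard library; `temporary_workdir` is a hypothetical helper name.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_workdir():
    """Create a temporary directory, chdir into it, and undo both on exit."""
    orig_dir = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp_dir:
        os.chdir(tmp_dir)
        try:
            yield tmp_dir
        finally:
            # Leave the directory before it is deleted.
            os.chdir(orig_dir)


before = os.getcwd()
with temporary_workdir() as tmp_dir:
    # The q calls that write and load the database would go here.
    inside = os.getcwd()
after = os.getcwd()
print(before == after, os.path.exists(tmp_dir))
```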

#### Partitioned Tables

In [None]:
tmp_dir = mkdtemp()
orig_dir = os.getcwd()
os.chdir(tmp_dir)
kx.q('`:db/2020.01/t/ set ([] a:til 3; b:"xyz"; c:-3?0Ng)')
kx.q('`:db/2020.02/t/ set ([] a:1+til 3; b:"cat"; c:-3?0Ng)')
kx.q('`:db/2020.03/t/ set ([] a:2+til 3; b:"bat"; c:-3?0Ng)')
kx.q(r'\l db')
t_partitioned = kx.q('t')
t_partitioned

List partitioned table columns

In [None]:
list(t_partitioned.keys())

Query partitioned table

In [None]:
kx.q('?[`t;enlist(=;`a;1);0b;enlist[`c]!enlist`c]')

Attempt to convert partitioned table to a pandas dataframe

In [None]:
try:
    t_partitioned.pd()
except NotImplementedError:
    pass

In [None]:
os.chdir(orig_dir)
shutil.rmtree(tmp_dir)

### q Functions

All functions defined in q can be called from PyKX via function objects. These function calls can take Python or q objects as input arguments. It is required that each argument being supplied to the function be convertible to a q representation using `kx.K(arg)`.

Arguments can be provided either positionally, or as keyword arguments when the q function has named parameters.

In [None]:
f = kx.q('{x*y+z}')

In [None]:
f(12, 2, 1)

In [None]:
f(12, 2, 1).py()
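
Because q evaluates expressions right to left with no operator precedence, `{x*y+z}` computes `x*(y+z)` rather than `(x*y)+z`. A plain-Python equivalent (the name `f_py` is purely illustrative) makes the expected result of the call above explicit:

```python
def f_py(x, y, z):
    """Python equivalent of the q lambda {x*y+z}, which q parses as x*(y+z)."""
    return x * (y + z)


print(f_py(12, 2, 1))   # 36, not 25
```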

In [None]:
g = kx.q('{[arg1;arg2] deltas sum each arg1 cross til arg2}')

In [None]:
g(arg2=7, arg1=kx.q('3?45')).np()

In [None]:
tok = kx.q("$'")
print(repr(tok))
print(str(tok))

In [None]:
tok(kx.q('"B"'), kx.q('" ",.Q.an')).np()

---

## Context Interface

The context interface provides a convenient way to interact with q contexts and namespaces using either the embedded q instance `pykx.q` or an IPC connection made with `pykx.QConnection`.

Accessing an attribute which is not defined via the context interface, but which corresponds to a script (i.e. a `.q` or `.k` file), will cause it to be loaded automatically. Scripts are searched for in the following locations:

1. In the same directory as the process running PyKX
2. In `QHOME`

Other paths can be searched for by appending them to `kx.q.paths`. Alternatively, you can manually load a script with `kx.q.ctx._register`.
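
The lookup order described above can be pictured with a small standalone sketch. This is not PyKX's implementation: `find_script` is a hypothetical function that only illustrates a first-match search for a `.q` or `.k` file across an ordered list of directories, and the order in which the two suffixes are tried is an assumption.

```python
import tempfile
from pathlib import Path


def find_script(name, search_paths):
    """Return the first `name.q` or `name.k` found in `search_paths`, else None."""
    for directory in search_paths:
        for suffix in ('.q', '.k'):
            candidate = Path(directory) / f'{name}{suffix}'
            if candidate.is_file():
                return candidate
    return None


# Two throwaway directories stand in for the working directory and QHOME.
with tempfile.TemporaryDirectory() as cwd, tempfile.TemporaryDirectory() as qhome:
    (Path(qhome) / 'demo_extension.q').write_text('N:100\n')
    found = find_script('demo_extension', [cwd, qhome])
    missing = find_script('no_such_script', [cwd, qhome])
    found_name = found.name
print(found_name, missing)
```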

Functions which are registered via the context interface are automatically added as callable members of their `QContext`.

### Builtin namespaces

As a result of the infrastructure outlined above there are a number of namespaces which are automatically added as extensions to the q base class on loading. These include the `.q`, `.z`, `.Q` and `.j` namespaces contained within `kx.q.k`. The following sections provide some example invocations of each.

A number of the functions contained within the .z namespace are not callable, including but not limited to the following:

- .z.ts
- .z.ex
- .z.ey

Run `dir(kx.q.z)` to see what is available in the `.z` namespace.

#### .q functionality
All the functions a user would expect to be exposed from q are callable as Python methods off the q base class. The following cells provide a limited number of example invocations.

In [None]:
print(kx.q.til(10))

In [None]:
print(kx.q.max([100, 2, 3, -4]))

In [None]:
print(kx.q.mavg(4, kx.q.til(10)))

In [None]:
print(kx.q.tables())

In [None]:
s = kx.q('([]a:1 2;b:2 3;c:5 7)')
s

In [None]:
t = kx.q('([]a:1 2 3;b:2 3 7;c:10 20 30;d:"ABC")').pd()
t

In [None]:
kx.q.uj(s,t)

### `.Q` namespace
The functions within the `.Q` namespace are also exposed as an extension.

**Note**: While all functions within the `.Q` namespace are available, they can be more complicated to use within the constraints of the PyKX interface than those in the `.q`/`.z` namespaces. For example, `.Q.dpft` can be used but requires some thought.

In [None]:
kx.q.Q

In [None]:
kx.q.Q.an

In [None]:
kx.q.Q.btoa(b'Hello World!')

In [None]:
t = kx.q('([]a:3 4 5;b:"abc";c:(2;3.4 3.2;"ab"))')
kx.q.each(kx.q.Q.ty, t['a','b','c'])

### `.j` namespace

In [None]:
json = b'{"x":1, "y":"test"}'
qdict = kx.q.j.k(json)
print(qdict)

In [None]:
kx.q.j.j(qdict).py()
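
For comparison, Python's standard `json` module parses the same payload into a native dictionary. One difference to be aware of is that q's `.j.k` returns JSON numbers as floats, whereas `json.loads` keeps `1` as an `int`.

```python
import json

payload = b'{"x":1, "y":"test"}'
parsed = json.loads(payload)      # bytes input is accepted directly
print(parsed)
print(type(parsed['x']).__name__)
```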

### User defined extensions
As alluded to above, users can add their own extension modules to PyKX by placing a relevant `.q`/`.k` file in their `$QHOME`. The following shows the addition of an extension that completes a specific query and sets some data which we would like to be available.

#### Extension Example
In the following example we create (and later delete) the file `$QHOME/demo_extension.q`.

In [None]:
demo_extension_source = r'''
\d .demo_extension
N:100
test_data:([]N?`a`b`c;N?1f;N?10;N?0b)
test_function:{[data]
  analytic_keys :`max_x1`avg_x2`med_x3;
  analytic_calcs:(
    (max;`x1);
    (avg;`x2);
    (med;`x3));
  ?[data;
    ();
    k!k:enlist `x;
    analytic_keys!analytic_calcs
    ]
  }
'''
demo_extension_filename = kx.qhome/'demo_extension.q'
with open(demo_extension_filename, 'w') as f:
    f.write(demo_extension_source)


In [None]:
kx.q.demo_extension.test_data

In [None]:
kx.q.demo_extension.test_function

In [None]:
kx.q.demo_extension.test_function(kx.q.demo_extension.test_data)

In [None]:
os.remove(demo_extension_filename)

--- 

## Querying Interface

One of the core purposes of this module is to provide users who are unfamiliar with q with Pythonic approaches to interacting with q objects.

One of the ways this is intended to be achieved is to provide Pythonic wrappers around common q tasks in a way that feels familiar to a Python developer but is still efficient/flexible.

The querying interface is an example of this. It provides a wrapper around the q functional select syntax to facilitate the querying of persisted and local data while also allowing Python objects to be used as inputs where it is relevant.

### Help is provided
Users can use the Python `help` function to display the docstring associated with each of the functions within the `query` module.

In [None]:
# help(kx.q.qsql)
# help(kx.q.qsql.select)
# help(kx.q.qsql.exec)
# help(kx.q.qsql.update)
# help(kx.q.qsql.delete)

### Select functionality
The select functionality is provided both as an individually callable function and as a method off all tabular data.

Generate a table and assign the Python object as a named entity within the q memory space.

In [None]:
qtab = kx.q('([]col1:100?`a`b`c;col2:100?1f;col3:100?5)')
kx.q['qtab'] = qtab

Retrieve the entirety of the table using an empty select

In [None]:
kx.q.qsql.select(qtab)

Retrieve the entire table using the module function

In [None]:
kx.q.qsql.select(qtab)

Retrieve the entire table based on a named reference

This is important because it provides a method of querying partitioned/splayed tables

In [None]:
kx.q.qsql.select('qtab')

**The where keyword**

Where clauses can be provided as a named keyword and are expected to be formatted as an individual string or a list of strings as in the following examples.

By default no where conditions are applied to a select query

In [None]:
# kx.q.qsql.select(qtab, where='col1=`a')
kx.q.qsql.select(qtab, where=['col3<0.5', 'col2>0.7'])

**The columns keyword**

The columns keyword is used to apply analytics to specific columns of the data or to select and rename columns within the dataset.

By default if a user does not provide this information it is assumed that all columns are to be returned without modification.

The columns keyword is expected to be a dictionary mapping the name that the new table will display for the column to the logic with which this data is modified.

In [None]:
kx.q.qsql.select(qtab, columns={'col1': 'col1','newname': 'col2'})

In [None]:
kx.q.qsql.select(qtab, columns={'max_col2': 'max col2'}, where='col1=`a')

**The by keyword**

The by keyword is used to group data based on common characteristics before analytics are applied.

By default, if a user does not provide this information, it is assumed that no grouping is applied.

The by keyword is expected to be a dictionary mapping the name to be given to each grouping column to the column of the original table which is used for the grouping.

In [None]:
kx.q.qsql.select(
    qtab,
    columns={'minCol2': 'min col2', 'medCol3': 'med col3'},
    by={'groupCol1': 'col1'},
    where=['col3<0.5', 'col2>0.7']
)
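
For readers coming from q, it can help to see the qSQL statement that a given set of keyword arguments corresponds to. The helper below is a hypothetical illustration only: as noted earlier, the querying interface wraps the functional select rather than building a string, and `render_qsql_select` is not part of the PyKX API.

```python
def render_qsql_select(table, columns=None, where=None, by=None):
    """Render select-style keyword arguments as an equivalent qSQL string."""
    def clause(mapping):
        return ', '.join(f'{name}:{expr}' for name, expr in mapping.items())

    parts = ['select']
    if columns:
        parts.append(clause(columns))
    if by:
        parts.append('by ' + clause(by))
    parts.append(f'from {table}')
    if where:
        # Accept either a single string or a list of strings.
        conditions = [where] if isinstance(where, str) else list(where)
        parts.append('where ' + ', '.join(conditions))
    return ' '.join(parts)


print(render_qsql_select(
    'qtab',
    columns={'minCol2': 'min col2', 'medCol3': 'med col3'},
    by={'groupCol1': 'col1'},
    where=['col3<0.5', 'col2>0.7'],
))
```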

### Delete functionality
The delete functionality is provided both as an individually callable function and as a method off all tabular data.

The following provides an outline of how this can be invoked in both cases.

**Note**: By default the delete functionality **does not** modify the underlying representation of the table. Modification is possible under limited circumstances, as outlined in a later section below.

In [None]:
kx.q.qsql.delete(qtab)
kx.q.qsql.delete('qtab')

**The columns keyword**

The columns keyword is used to denote the columns that are to be deleted from a table.

By default if a user does not provide this information it is assumed that all columns are to be deleted.

The columns keyword is expected to be a string or list of strings denoting the columns to be deleted.

**Note**: The columns and where clauses cannot be used in the same function call, as this is not supported by the underlying functional delete.

In [None]:
# kx.q.qsql.delete(qtab, columns = 'col3')
kx.q.qsql.delete(qtab, columns = ['col1','col2'])

**The where keyword**

The where keyword is used to filter rows of the data to be deleted.

By default if no where condition is supplied it is assumed that all rows of the dataset are to be deleted.

When supplied, the where keyword is expected to be a string defining the filter to apply.

**Note**: The columns and where clauses cannot be used in the same function call, as this is not supported by the underlying functional delete.

In [None]:
kx.q.qsql.delete(qtab, where='col1 in `a`b')

**The modify keyword**

The modify keyword is used when the user intends for the underlying representation of a named entity within the q memory space to be modified. This is only applicable when calling the function via the `kx.q.qsql.delete` representation of the function.

By default `modify=False`, so the underlying representation is not modified. To change the underlying representation, a user must set `modify=True`.

In [None]:
kx.q.qsql.delete('qtab', where = 'col1=`c', modify=True)

In [None]:
kx.q('qtab')

### Update and exec functionality

Both the q functional update and exec functionality are supported by this interface. For brevity they are not shown in the same detail as the previous examples

In [None]:
# kx.q.qsql.exec(qtab, 'col1')
# kx.q.qsql.exec(qtab, columns='col2', by='col1')
kx.q.qsql.exec(qtab, columns={'avgCol3': 'avg col3'}, by='col1')

In [None]:
# kx.q.qsql.update(qtab, {'avg_col2':'avg col2'}, by={'col1': 'col1'})
# kx.q.qsql.update(qtab, {'col3':100}, where='col1=`a')
kx.q.qsql.update('qtab', {'col2': 4.2}, 'col1=`b', modify=True)
kx.q['qtab']

---

## IPC Communication

This module also provides users with the ability to retrieve data from remote q processes. This is supported both with and without a valid q license.

More documentation including exhaustive lists of the functionality available can be found in the [`IPC`](../api/ipc.html) documentation.

### Establishing a Connection
Connections to external q processes are established using the `pykx.QConnection` class. On initialization the instance of this class will establish a connection to the specified q process using the provided connection information (e.g. `host`, `port`, `username`, `password`, etc.). Refer to the PyKX IPC module documentation for more details about this interface, or run `help(pykx.QConnection)`.

### IPC Example
The following is a basic example of this functionality. A more complex subscriber/publisher example is provided in `examples/ipc/`.

This example will work in the presence or absence of a valid q license.

#### Create the external q process
To run this example, the Python code in the following cell does the equivalent of executing the following in a terminal:

```
$ q -p 5000
q)tab:([]100?`a`b`c;100?1f;100?0Ng)
q).z.ps:{[x]0N!(`.z.ps;x);value x}
q).z.pg:{[x]0N!(`.z.pg;x);value x}
```

In [None]:
import subprocess
import time
proc = subprocess.Popen(
    ('q', '-p', '5000'),
    stdin=subprocess.PIPE,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
proc.stdin.write(b'tab:([]100?`a`b`c;100?1f;100?0Ng)\n')
proc.stdin.write(b'.z.ps:{[x]0N!(`.z.ps;x);value x}\n')
proc.stdin.write(b'.z.pg:{[x]0N!(`.z.pg;x);value x}\n')
proc.stdin.flush()
time.sleep(2)
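
A fixed `time.sleep(2)` works for a demonstration, but polling the port is usually more robust when startup time varies. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and the timeout values are arbitrary.

```python
import socket
import time


def wait_for_port(host, port, timeout=5.0, interval=0.1):
    """Return True once a TCP connection succeeds, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Self-contained check against a throwaway listener instead of a q process.
listener = socket.socket()
listener.bind(('127.0.0.1', 0))      # port 0 lets the OS pick a free port
listener.listen(1)
free_port = listener.getsockname()[1]
ready = wait_for_port('127.0.0.1', free_port, timeout=2.0)
listener.close()
not_ready = wait_for_port('127.0.0.1', free_port, timeout=0.3)
print(ready, not_ready)
```

With the q process started above, a call such as `wait_for_port('localhost', 5000)` could take the place of the sleep.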

#### Open a connection to this process

In [None]:
# Normally a `with` block would be used for proper context management, but for the sake of this example the connection will be accessed and closed directly
conn = kx.QConnection('localhost', 5000)

#### Make a simple synchronous request

In [None]:
qvec = conn('2+til 2')
qvec

#### Make a simple asynchronous request

In [None]:
conn('setVec::10?1f', wait=False)
setVec = conn('setVec')
setVec

#### Run a defined function server side with provided arguments

In [None]:
pytab = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
conn('{[table;column;rows]rows#column#table}', pytab, ['col1'], 1).pd()

In [None]:
conn('{[table;column]newtab::table column}', pytab, 'col1', wait=False)

In [None]:
conn('newtab').np()

#### Disconnect from the q process

In [None]:
conn.close()
# This happens automatically when you leave a `with` block that is managing a connection, or when a connection is garbage-collected.
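
The `with` pattern mentioned in the comments can be sketched as follows. To keep the example runnable without a q process, `FakeConnection` is a hypothetical stand-in that only mimics being callable and usable as a context manager; with PyKX the factory passed to `run_query` would instead be something like `lambda: kx.QConnection('localhost', 5000)`.

```python
class FakeConnection:
    """Hypothetical stand-in for a connection object; not part of PyKX."""

    def __init__(self):
        self.closed = False

    def __call__(self, query):
        return f'result of {query!r}'

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()      # always runs, even if the body raised
        return False      # do not swallow exceptions


def run_query(connect, query):
    """Open a connection, run one query, and close the connection on exit."""
    with connect() as conn:
        return conn(query)


fake = FakeConnection()
result = run_query(lambda: fake, '2+til 2')
print(result, fake.closed)
```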

In [None]:
# Shutdown the q process we were connected to for the IPC demo
proc.stdin.close()
proc.kill()

---