Charting Data with PyKX
This workbook gives examples of interfacing PyKX with Python charting libraries.
PyKX supports rich datatype mapping, meaning you can convert data from PyKX objects to:
- Python objects using .py()
- NumPy objects using .np()
- Pandas objects using .pd()
- PyArrow objects using .pa()
The full breakdown of how these map is documented here.
These resulting objects will behave as expected with all Python libraries.
For efficiency and exactness the examples below aim to use PyKX objects directly, minimising conversions when possible.
import pykx as kx
tab = kx.Table(data={
    'sym': kx.random.random(1000, ['a', 'b', 'c']),
    'price': kx.random.random(1000, 1.0),
    'size': kx.random.random(1000, 10),
    'quantity': kx.random.random(1000, 100),
    'in_stock': kx.random.random(1000, [True, False])})
tab.head()
| | sym | price | size | quantity | in_stock |
|---|---|---|---|---|---|
| 0 | a | 0.9094126 | 4 | 5 | 1b |
| 1 | a | 0.2988477 | 5 | 18 | 1b |
| 2 | c | 0.454063 | 8 | 11 | 0b |
| 3 | b | 0.156942 | 1 | 36 | 1b |
| 4 | c | 0.04699265 | 4 | 43 | 1b |
Matplotlib
Generate a scatter plot using the price and quantity columns of the table.
The scatter(tab['price'], tab['quantity']) notation is used to access PyKX objects directly.
Referring to columns by name for x and y requires conversion to a dataframe using .pd(), passed through the data= keyword, i.e. scatter(x='price', y='quantity', data=tab.pd()).
scatter fundamentally uses a series of 1D arrays and is therefore one of the few charts where the column values do not first need to be converted to NumPy objects using .np().
import matplotlib.pyplot as plt
plt.scatter(tab['price'], tab['quantity'])
plt.show()
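The column-name form mentioned above can be sketched as follows. To keep the sketch runnable without a q licence, it uses a small hand-built Pandas dataframe as a stand-in for tab.pd(). The random data, the Agg backend line, and the variable names df and points are all assumptions for illustration, not part of the original workbook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for tab.pd(): random values under the same column names (assumed data)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "price": rng.random(100),
    "quantity": rng.integers(0, 100, size=100),
})

# With data=, strings passed as x and y are looked up as column names
fig, ax = plt.subplots()
points = ax.scatter(x="price", y="quantity", data=df)
ax.set_xlabel("price")
ax.set_ylabel("quantity")
```

With a real PyKX table, df would be replaced by tab.pd() and the rest should stay the same.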
For the column values to be compatible with most Matplotlib charts, they must first be converted to NumPy objects using the .np() function.
plt.bar(tab['size'].np(), tab['price'].np())
plt.show()
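Note that with 1000 rows, bar draws one bar per row, so bars sharing the same size value are drawn on top of each other. If one bar per distinct size is wanted, one option is to aggregate before plotting. The sketch below does this in Pandas on a stand-in dataframe (assumed random data in place of tab.pd()); the choice of the mean as the aggregate is also an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for tab.pd() with the two columns used by the bar chart (assumed data)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "size": rng.integers(0, 10, size=1000),
    "price": rng.random(1000),
})

# Collapse to one value per distinct size: the mean price in each group
mean_price = df.groupby("size")["price"].mean()

# Plot one bar per group, passing plain NumPy arrays to Matplotlib
fig, ax = plt.subplots()
bars = ax.bar(mean_price.index.to_numpy(), mean_price.to_numpy())
ax.set_xlabel("size")
ax.set_ylabel("mean price")
```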
Plotly
Plotly allows vector objects to be passed as the color argument. Here the parameter is set using the sym column, resulting in the scatter chart below.
import plotly.express as px
fig = px.scatter(
    x=tab['quantity'],
    y=tab['price'],
    size=tab['size'],
    color=tab['sym'])
fig.show(renderer="png")
Unlike a Pandas dataframe, a PyKX table cannot be passed as the first argument with the remaining arguments given as column names. Each axis must be set explicitly.
To use this feature, first convert the table to Pandas using the .pd() function.
The following density heatmap uses Plotly. This time the table is converted to a Pandas dataframe and the axes are simply assigned the column names as strings.
fig = px.density_heatmap(
    tab.pd(),
    x='price',
    y='size')
fig.show(renderer="png")
Seaborn
Seaborn allows a PyKX table to be passed as data without conversion, with the x and y parameters then set using only the column names of that table.
The bar chart below demonstrates this: data is set to the table object and all other parameters are set using column names, all without conversions.
import seaborn as sns
sns.catplot(
    kind='bar',
    data=tab,
    x='size',
    y='quantity',
    hue='sym'
)
plt.show()
Seaborn supports joint plots, which combine a plot of two variables with the marginal distribution of each, giving the user another layer of visualisation.
sns.jointplot(data=tab, x="quantity", y="price", hue="sym")
plt.show()