Machine learning

Machine-learning capabilities are at the heart of future technology development at KX.

Our libraries are released under the Apache 2 license, and are free for all use cases, including 64-bit and commercial use.

Machine Learning Toolkit

KxSystems/ml

The Machine Learning Toolkit is at the core of our machine-learning functionality. This library contains functions that cover the following areas:

Accuracy metrics to test the performance of constructed machine-learning models.
Pre-processing data prior to the application of machine-learning algorithms.
An implementation of the FRESH algorithm for feature extraction and selection on structured time-series data.
Utility functions that are useful in many machine-learning applications but do not fall within the other sections of the toolkit.
Cross-validation functions, used to verify how robust and stable a machine-learning model is to changes in the data it is given and in the volume of that data.
Clustering algorithms used to group data points and to identify patterns in their distributions. The algorithms make use of a k-dimensional tree to store points and scoring functions to analyze how well they performed.
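
To make two of the areas above concrete (accuracy metrics and cross-validation), here is a minimal, library-independent sketch in Python. It is not the toolkit's q API: the names `accuracy`, `k_fold_indices`, `majority_class` and `cross_validate` are hypothetical and chosen for illustration only, and the "model" is a deliberately trivial baseline.

```python
from typing import Callable, List, Sequence, Tuple


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Accuracy metric: fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set of labels")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds; return (train, test) index pairs."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    base, extra = divmod(n, k)
    splits = []
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional point each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        splits.append((train, test))
        start += size
    return splits


def majority_class(train_x: List, train_y: List[int]) -> Callable:
    """Toy baseline (hypothetical): always predict the most common training label."""
    most_common = max(set(train_y), key=train_y.count)
    return lambda x: most_common


def cross_validate(xs: List, ys: List[int], k: int, fit: Callable) -> List[float]:
    """Fit on each training split and score accuracy on the held-out fold."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        predicted = [model(xs[i]) for i in test]
        scores.append(accuracy(predicted, [ys[i] for i in test]))
    return scores
```

A large spread between the per-fold scores returned by `cross_validate` is the kind of instability that cross-validation is meant to expose.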

Example notebooks

KxSystems/mlnotebooks

Example notebooks show FRESH and various aspects of toolkit functionality.

Natural Language Processing

KxSystems/nlp

The NLP library provides common functions for processing unstructured text, including searching, clustering, keyword extraction and sentiment analysis.

Demonstration notebook

Automated Machine Learning

KxSystems/automl

AutoML is a framework to automate the process of machine learning using kdb+. It is built largely on the Machine Learning Toolkit and handles the following aspects of a traditional machine-learning pipeline:

Data preprocessing
Feature engineering and feature selection
Model selection
Hyperparameter tuning
Report generation and model persistence
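
As a rough illustration of two of these stages (data preprocessing and hyperparameter tuning), the following Python sketch standardizes a feature column and runs an exhaustive grid search. It is a generic sketch under stated assumptions, not the AutoML framework's interface or its actual search strategy: `standardize`, `grid_search` and the parameter names in the usage note are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def standardize(column: List[float]) -> List[float]:
    """Preprocessing step: rescale a column to zero mean and unit variance."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = var ** 0.5 or 1.0  # constant column: fall back to 1 to avoid dividing by zero
    return [(v - mean) / std for v in column]


def grid_search(
    score: Callable[[Dict[str, float]], float],
    grid: Dict[str, Sequence[float]],
) -> Tuple[Dict[str, float], float]:
    """Hyperparameter tuning by exhaustive search.

    Evaluates `score` on every combination of values in `grid` and
    returns the best-scoring parameters together with their score.
    """
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score
```

In practice `score` would train a candidate model with the given parameters and return a validation metric; for example, `grid_search(my_score, {"depth": [1, 3, 5], "rate": [0.1, 0.5]})` evaluates all six combinations, where `my_score`, `depth` and `rate` are placeholders.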

Demonstration notebook

embedPy

KxSystems/embedPy

embedPy loads Python into kdb+/q, allowing access to a rich ecosystem of libraries such as scikit-learn, TensorFlow and PyTorch.

Python variables and objects become q variables – and either language can act upon them.
Python code and files can be embedded within q code.
Python functions can be called as q functions.

Example notebooks using embedPy

JupyterQ

KxSystems/JupyterQ

JupyterQ supports Jupyter notebooks for q, providing:

Syntax highlighting, code completion and help
Multiline input (script-like execution)
Inline display of charts

Technical papers

NASA FDL: Analyzing social media data for disaster management
Conor McCarthy, 2019.10
NASA FDL: Predicting floods with q and machine learning
Diane O’Donoghue, 2019.10
An introduction to neural networks with kdb+
James Neill, 2019.07
NASA FDL: Exoplanets Challenge
Esperanza López Aguilera, 2018.12
NASA FDL: Space Weather Challenge
Deanna Morgan, 2018.11
Using embedPy to apply LASSO regression
Samantha Gallagher, 2018.10
K-Nearest Neighbor classification and pattern recognition with q
Emanuele Melis, 2017.07

The KX machine-learning libraries are:

well documented, with understandable and useful examples
maintained and supported by KX on a best-efforts basis, at no cost to customers
released under the Apache 2 license
free for all use cases, including 64-bit and commercial use

Commercial support is available if required: please email sales@kx.com.