Machine-learning toolkit
The toolkit contains libraries and scripts that provide kdb+ users with general-use functions and procedures for machine-learning tasks on a wide variety of datasets.
The toolkit contains:
- Utility functions covering important aspects of machine learning, including data preprocessing and statistical testing, along with functions useful in many machine-learning applications.
- An implementation of the FRESH (FeatuRe Extraction and Scalable Hypothesis testing) algorithm in q. This lets a q/kdb+ user perform feature extraction and feature-significance tests on structured time-series data for forecasting, regression and classification.
Over time, the machine-learning functionality in this library will be extended to include:
- q-specific implementations of machine-learning algorithms
- broader functionality
Requirements
The requirements below cover everything needed to run the libraries in the current build of the toolkit.
A number of Python dependencies are also needed to run embedPy functions within both the machine-learning utilities and FRESH libraries. These can be installed as outlined at KxSystems/ml, using pip
pip install -r requirements.txt
or via conda
conda install --file requirements.txt
Running notebooks
Running the notebooks in the Utilities and FRESH sections requires JupyterQ as well as embedPy. However, JupyterQ is not a requirement for the toolkit itself.
Installation
Copy the library, or a link to it, into $QHOME to install it. Then load all libraries using
q)\l ml/ml.q
q).ml.loadfile`:init.q
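The link option of the installation step could be sketched in shell as follows. This is a minimal illustration, not part of the toolkit: the helper name link_ml and its two arguments are hypothetical, and it assumes a local copy of the toolkit already exists and that the q home directory is the one $QHOME points to. The link is named ml so that the path ml/ml.q used in the load command resolves.

```shell
# Sketch only: link a local copy of the toolkit into the q home directory as "ml".
# link_ml and both of its arguments are hypothetical names used for illustration.
link_ml() {
  src=$1      # absolute path to the local copy of the toolkit
  qhome=$2    # q home directory, normally the value of $QHOME

  # Refuse to create a dangling link or to write into a missing directory.
  if [ ! -d "$src" ] || [ ! -d "$qhome" ]; then
    echo "link_ml: both paths must be existing directories" >&2
    return 1
  fi

  # -s: symbolic link, -f: replace an existing link,
  # -n: treat an existing link named ml as a file rather than following it
  ln -sfn "$src" "$qhome/ml"
}

# Example call, assuming the toolkit was downloaded to ~/ml:
# link_ml "$HOME/ml" "$QHOME"
```

Linking rather than copying means later updates to the local copy are picked up without reinstalling. An absolute source path is assumed, since a relative link target would be resolved against the q home directory.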