Saving

Persist a variety of versioned entities to disk and cloud storage applications

.ml.registry.
  new.registry     generate a new registry
  new.experiment   generate a new experiment within an existing registry
  set.model        add a new model
  set.parameters   generate a JSON file containing parameters
  log.metric       log metric values associated with a model

.ml.registry.log.metric

Log metric values associated with a model

.ml.registry.log.metric
  [folderPath;experimentName;modelName;version;metricName;metricValue]

Where

argument | type | description
---|---|---
folderPath | string, :: | folder path of the registry; if null, current directory
experimentName | string, :: | name of an experiment from which to retrieve a model; if modelName null, the newest model within this experiment; if both modelName and experimentName null, the newest model in the unnamedExperiments section
modelName | string, :: | name of the model to be retrieved; if null, the newest model associated with the experiment
version | long[], :: | specific version of a named model to retrieve, major and minor versions as a pair of longs; if null, the newest model
metricName | string | name of the metric to be persisted
metricValue | float | value of the metric to be persisted

logs metric values associated with the model, and returns a generic null.

When logging metrics, a persisted binary table is generated within the model registry containing the following information

  • time the metric value was added
  • name of the persisted metric
  • value of the persisted metric
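
As a rough illustration of the three fields listed above, the following sketch builds an in-memory q table with the same shape. It does not call the registry: the column names (timestamp, metricName, metricValue) and the helper logMetric are assumptions made for this example, not names confirmed by the text.

```q
// Hypothetical in-memory stand-in for the persisted metric table.
// Column names are assumptions chosen for illustration.
metrics:([]timestamp:`timestamp$();metricName:`symbol$();metricValue:`float$())

// Hypothetical helper: append one row per logged value (time, name, value)
logMetric:{[name;val] `metrics insert (.z.p;name;"f"$val);}

logMetric[`func1;2.4]
logMetric[`func1;3]
logMetric[`func2;10.2]

// Most recent value logged for each metric name: func1 -> 3, func2 -> 10.2
select last metricValue by metricName from metrics
```

Because every call appends a row rather than overwriting one, logging the same metric name repeatedly keeps a history of its values over time.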
In the cloud

When logging metrics in a cloud vendor context, the folderPath argument is unused and the registry location is assumed to be the storage location provided on initialization.

Create a model within the registry:

.ml.registry.set.model[::;{x+1};"metricModel";"q";::]

Log metric values associated with various metric names:

.ml.registry.log.metric[::;::;"metricModel";1 0;`func1;2.4]
.ml.registry.log.metric[::;::;"metricModel";1 0;`func1;3]
.ml.registry.log.metric[::;::;"metricModel";1 0;`func2;10.2]
.ml.registry.log.metric[::;::;"metricModel";1 0;`func3;9]
.ml.registry.log.metric[::;::;"metricModel";1 0;`func3;11.2]

.ml.registry.new.experiment

Generate a new experiment within an existing registry

.ml.registry.new.experiment[folderPath;experimentName;config]

Where

argument | type | content
---|---|---
folderPath | string, :: | location of the registry; if null, current directory
experimentName | string | name of the experiment; it is located under the namedExperiments folder and populated by new models associated with the experiment
config | dictionary, :: | any additional configuration needed for initializing the experiment (currently unused)

returns an updated config dictionary containing relevant registry paths. If the registry does not exist it will be created.

In the cloud

When generating a new experiment in a cloud vendor context, the folderPath argument is unused and the registry location is assumed to be the storage location provided on initialization.

// Create an experiment 'test' in a registry location in 'pwd'
q).ml.registry.new.experiment[::;"test";::];

// Create an experiment 'new_test' in a registry located at a different location
q)system"mkdir -p test/folder/location"
q).ml.registry.new.experiment["test/folder/location";"new_test";::];

.ml.registry.new.registry

Generate a new registry

.ml.registry.new.registry[folderPath;config]

Where:

  • folderPath (string) indicates the location for the registry; or generic null (::) for the current directory
  • config is any additional configuration needed for initializing the registry (dictionary or generic null) – currently unused

returns a config dictionary containing relevant registry paths.

In the cloud

When generating a new registry within the context of cloud vendor interactions the folderPath variable is unused and a new registry is created at the storage location indicated.

// Generate a registry in 'pwd'
.ml.registry.new.registry[::;::]

// Create a folder and generate a registry there
system"mkdir -p test/folder/location"
.ml.registry.new.registry["test/folder/location";::]

.ml.registry.set.model

Add a new model to the ML Registry

.ml.registry.set.model[folderPath;model;modelName;modelType;config]

Where

argument | type | content
---|---|---
folderPath | string, :: | folder path where the registry is to be located; if null, the current directory
model | embedPy, dictionary, function, projection, symbol, string | the model to be saved to the registry
modelName | string | name to be associated with the model
modelType | string | type of model being saved, one of q, graph, sklearn, keras, python, torch or theano
config | dictionary | any additional configuration needed for initializing the model

adds the model to the registry and returns a generic null. If the registry does not exist it will be created.

model

The model argument defines the item to be saved to the registry and used as the model when retrieved. This can be an embedPy object defining an underlying Python model, a q function, projection, or dictionary; or a symbol pointing to a model saved to disk.

Models can be added under the following qualifying conditions:

model type | saved file type | qualifying conditions
---|---|---
q | q-binary | The model must be a q projection or function, or a dictionary with a predict key.
Python | pickled file | The model must be saved using joblib.dump.
Sklearn | pickled file | The model must be saved using joblib.dump and contain a predict method, i.e. it is a fitted scikit-learn model.
Keras | HDF5 file | The model must be saved using the save method provided by Keras and contain a predict method, i.e. it is a fitted Keras model.
PyTorch | pickled file/jit | The model must be saved using the torch.save functionality.
Theano | pickled file | The model must be saved using joblib.dump. Users should also save any files used in the generation of the model using the code option within config, described below, to ensure all methods required for the model are provided.

When a model is added from disk, the registry validates that it can be loaded into the current q process. This ensures the model is usable from q and is not being added in a way that will corrupt the registry.

Further conditions:

  • A q function or projection must take a single argument, the data to be used as a prediction entity
  • A q dictionary
    • must have a predict key whose value is a q function or projection as above
    • may have an update key whose value is a q function or projection taking feature and target data used to update the model (retrieval of the update functions can be configured for use in supervised and unsupervised use-cases)
  • In Python, Sklearn, Keras, PyTorch or Theano models, functions used for prediction must accept one parameter: the data to be passed to the model as a matrix to perform a prediction.
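
The conditions on q dictionary models can be sketched as follows. This is a minimal illustration only: the weight value, the least-squares refit and the names predictFn, updateFn and mdl are hypothetical, and what the registry expects an update function to return is not stated in the text above.

```q
// predict: a projection over a hypothetical weight, leaving a single
// argument, the data to predict on
predictFn:{[w;data] w*data}[2f]

// update: takes feature and target data; here it refits a single weight
// by least squares (illustrative only)
updateFn:{[X;y] (sum X*y)%sum X*X}

// Dictionary model with the required predict key and optional update key
mdl:`predict`update!(predictFn;updateFn)

mdl[`predict] 1 2 3f          / 2 4 6f
mdl[`update][1 2 3f;2 4 6f]   / 2f

// With the registry loaded, such a dictionary would be added as
// .ml.registry.set.model[::;mdl;"dictModel";"q";::]
```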

Scikit-learn models are also supported for use as update models. When retrieved using .ml.registry.get.update, a model can be used in this way provided it has been fit and contains the partial_fit method, for example sklearn.linear_model.SGDClassifier.

config

The config argument of .ml.registry.set.model enables advanced options within the registry. The following keys in particular are supported.

key | type | description
---|---|---
experimentName | string | The experiment with which the model being added is to be associated. If this key is not provided, the model is set within the unnamedExperiments section of the registry.
data | any | If provided, adding the model to the registry also attempts to parse out relevant statistical information associated with the data for use in deployment of the model.
requirements | boolean, string[][], symbol | Python requirements to associate with the model: the boolean 1b to use pip freeze, a symbol indicating the path to a requirements.txt file, or a list of strings defining the requirements.
major | boolean | Whether the version increment is 'major', i.e. whether the model moves from 1 0 to 2 0 rather than from 1 0 to 1 1, which is the default.
majorVersion | long | The major version to be incremented. By default major versions are incremented from the maximal version within the registry; this option lets users choose the major version instead.
code | symbol, symbol[] | Location of any *.py, *.p, *.k or *.q files. These files are loaded automatically on retrieval of the model using the *.get.* functionality.
axis | boolean | Whether data passed to the model is 'vertical' or 'horizontal', i.e. whether the data is retrieved from a table in flip value flip (0b) or value flip (1b) format. This allows flexibility in model design.
supervise | string[] | List of metrics to be used for supervised monitoring of the model.
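
To make the two formats named for the axis key concrete, the sketch below applies both expressions to a small table. The table t and the names rows and cols are hypothetical; only the two expressions themselves come from the description above.

```q
// Hypothetical feature table with two columns
t:([]x1:1 2 3;x2:4 5 6)

// flip value flip (axis 0b): one list per row of the table
rows:flip value flip t   / (1 4;2 5;3 6)

// value flip (axis 1b): one list per column of the table
cols:value flip t        / (1 2 3;4 5 6)
```

A model written to score one record at a time would suit the first layout, while one that operates on whole feature vectors would suit the second.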

Add a vanilla model to a registry in pwd:

.ml.registry.set.model[::;{x};"model";"q";::]

Add an Sklearn model to a registry:

skldata:.p.import`sklearn.datasets
blobs:skldata[`:make_blobs;<]
dset:blobs[`n_samples pykw 1000;`centers pykw 2;`random_state pykw 500]
skmdl :.p.import[`sklearn.cluster][`:AffinityPropagation]
  [`damping pykw 0.8][`:fit]dset 0
.ml.registry.set.model[::;skmdl;"skmodel";"sklearn";::]

Generate a major version of the model within the registry:

.ml.registry.set.model[::;{x+1};"model";"q";enlist[`major]!enlist 1b]
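
The effect of the major key on version numbers can be sketched in plain q. The function nextVersion is a hypothetical illustration of the increment rule described for the major key, not the registry's implementation, and it ignores the majorVersion option.

```q
// Hypothetical helper: compute the next (major;minor) version pair
// v is the current version, major is a boolean flag
nextVersion:{[v;major] $[major;(1+v 0;0);(v 0;1+v 1)]}

nextVersion[1 0;0b]   / 1 1  default: minor increment
nextVersion[1 0;1b]   / 2 0  major increment
```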

Associate some Python requirements with the next version of the Sklearn model:

requirements:enlist[`requirements]!enlist ("scikit-learn";"numpy")
.ml.registry.set.model[::;skmdl;"skmodel";"sklearn";requirements]
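
The requirements key accepts three forms according to the config description. The sketch below builds one config dictionary for each; the names cfgFreeze, cfgFile and cfgList are hypothetical, and writing the path as a file handle (`:requirements.txt) rather than a plain symbol is an assumption.

```q
// 1b: use the output of pip freeze
cfgFreeze:enlist[`requirements]!enlist 1b

// Symbol: path to a requirements.txt file (file-handle form is an assumption)
cfgFile:enlist[`requirements]!enlist`:requirements.txt

// List of strings: explicit requirements
cfgList:enlist[`requirements]!enlist("scikit-learn";"numpy")

// The three forms are distinguishable by type
type cfgFreeze`requirements   / -1h  boolean atom
type cfgFile`requirements     / -11h symbol atom
type cfgList`requirements     / 0h   list of strings
```

Any of these dictionaries can be passed as the final argument of .ml.registry.set.model in place of requirements in the example above.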

Add a q model saved to disk (assume running from the root of the registry repo):

.ml.registry.set.model[::;`:examples/models/qModel;"qModel";"q";::]

.ml.registry.set.parameters

Generate a JSON file containing parameters to be associated with a model

These parameters define any information that a user believes to be important to the model's generation. They may include hyperparameter sets used when fitting, or information about training.

.ml.registry.set.parameters
  [folderPath;experimentName;modelName;version;paramName;params]

Where

argument | type | content
---|---|---
folderPath | string, :: | folder path of the registry; if null, current directory
experimentName | string, :: | name of an experiment from which to retrieve a model; if modelName null, the newest model within this experiment; if modelName and experimentName both null, the newest model within the unnamedExperiments section
modelName | string, :: | name of the model to be retrieved; if null, the newest model associated with the experiment
version | long[], :: | specific version of a named model: major and minor version numbers as a pair of longs; if null, the newest model
paramName | string, symbol | name of the parameter to save
params | dictionary, table, string | parameters to save to file

generates a JSON file containing parameters to be associated with the model, and returns a null.

In the cloud

When saving parameters in a cloud vendor context, the folderPath argument is unused and the registry location is assumed to be the storage location provided on initialization.

Add a model to the registry:

.ml.registry.set.model[::;{x+2};"mymodel";"q";::]

Save a dictionary parameter associated with a model mymodel:

.ml.registry.set.parameters[::;::;"mymodel";1 0;"paramFile";`param1`param2!1 2]

Save a list of strings as parameters associated with a model mymodel:

.ml.registry.set.parameters[::;::;"mymodel";1 0;"paramFile2";("value1";"value2")]
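
To give a feel for the JSON these parameters correspond to, the sketch below serializes the same values with q's built-in .j.j. Whether the registry uses .j.j internally, and the exact layout of the files it writes, are assumptions; the names dictJson and listJson are hypothetical.

```q
// JSON text for the dictionary parameter used above
dictJson:.j.j `param1`param2!1 2
dictJson   / "{\"param1\":1,\"param2\":2}"

// JSON text for the list-of-strings parameter used above
listJson:.j.j ("value1";"value2")
listJson   / "[\"value1\",\"value2\"]"

// Parsing JSON back with .j.k yields floats for numeric values
.j.k dictJson   / `param1`param2!1 2f
```

Note the last line: numbers that pass through JSON come back as floats, so integer-valued parameters may need casting after they are read back.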