UDA Configuration

This page explains how to create and configure a user-defined analytic (UDA) that users can query.

Refer to the Overview for details on why you would create a UDA, and to the Best practices for coding guidelines.

A UDA consists of the following:

Query function - reads data from a Data Access Process (DAP) and performs transformations on the data.
Aggregation function - combines the results of executing the query function on each DAP, known as partials, to produce the final result that is returned to the user.
Metadata - describes the UDA's purpose, parameters, and return values so that users can retrieve information about the UDA using getMeta.
Registration function - pulls together the query function, aggregation function, and metadata to allow the system to identify the UDA.
Package entrypoints and the KXI_PACKAGES environment variable - determine whether a particular process loads the UDA on startup.

Refer to the testing page for further details on testing and debugging your code.

The remainder of this page outlines best practices for developing and deploying a UDA.

1. Prerequisites

The following prerequisites must be met before creating a UDA:

You have access to a kdb Insights instance.
The kxi CLI is installed and configured on your system.
You are familiar with how to create and edit packages using the kdb Insights CLI. A solid grasp of packaging is essential to following how UDAs are packaged and deployed. Refer to the Packaging Overview documentation for more information.

2. Develop

When developing a UDA you must ensure:

Your code is modular, efficient, and well documented, in line with the Best practices. Validate it against different datasets to confirm it performs as expected in various scenarios, and use the test results to refine the UDA for performance and accuracy.
The database you wish to query is accessible.
You can connect to a DAP process to develop your code:

You can use the kdb VS Code Extension as follows:
1. Add a Connection to a DAP process on your kdb Insights deployment.
  1. Select the CONNECTIONS panel and click either Add Connection or the +.
    Note
    
    To add a connection you need a folder open in VS Code.
  2. Add a My q connection
2. Right click on the new connection in the CONNECTIONS panel and select Connect server.
3. Create a new file of type Source file or KX Notebook
  1. Select the appropriate connection from the Connection dropdown
To improve code separation and ensure each process loads only the functions it needs, you can store the query and aggregation functions in different files.
Important

You must ensure:
- The metadata and the registerAPI function are added to both files, as they ensure the process registers its function for use by the system.
- The data-access and aggregator entrypoints in the package manifest file reference the appropriate file.
When querying a UDA, the Resource Coordinator routes only to DAPs and Aggregators that have the UDA registered.
Use distinguished parameters such as table, startTS, and endTS to route to the appropriate partitions. You can set these as mandatory fields in the metadata to enforce them when querying.

Recommendations:

You should comment your code, especially any function definitions, ensuring you describe:
- Its purpose
- Expected values and forms of arguments
- Any return value(s)
Keeping your code files in one or more subfolders of the package simplifies the package layout.
Define the UDA and its functions in a namespace. This keeps the code clean and functions can be shared within the namespace.
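
As a rough illustration of the recommendations above, the following sketch shows a commented function defined in a namespace. The namespace .exampleuda, the function name, and its parameters are hypothetical, and the body is plain q rather than anything specific to kdb Insights.

```q
// @desc    Scales every price in a table by a constant factor.
// @param   t       {table}   Table with a numeric price column.
// @param   factor  {float}   Multiplier applied to each price.
// @return          {table}   Input table with the price column rescaled.
.exampleuda.scalePrice:{[t;factor]
    update price:price*factor from t
    }

// Usage on a small in-memory table (hypothetical data).
.exampleuda.scalePrice[([] sym:`a`b; price:1 2f);2f]
```

Because the function lives in .exampleuda, other functions in the same namespace can call it without polluting the root namespace.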

Additional information:

You can use the same query or aggregation function in multiple UDAs.

You can load other files with the kxi.packages commands.

.kxi.packages.load["nested-package"] //Change KX_PACKAGE_HOME for relative file load
.kxi.packages.file.load["nested.q"]

Refer to the testing page for further details on testing and debugging your code.

Refer to the Quickstart and the UDA examples for examples of UDAs.

Query function

The query function reads the raw data from a single DAP and transforms it into a set of results. The results from each DAP targeted by the query, known as partials, are passed to the aggregator to be combined and returned to the user.

When developing a query function you must ensure:

The function arguments are a list of values or a dictionary of keys named args. We recommend you provide a list of values as this allows users to interrogate the parameter details.
- If you do use a dictionary of keys you must name the dictionary of keys args. Any other name causes the DAP to treat the function as a one parameter function, which ignores the auto-casting of REST parameters.
- If you need more than 8 arguments, you must use a dictionary when defining the function.
  
  Note
  
  There is no such limitation for the metadata, therefore we recommend that each parameter in the dictionary is defined in the metadata. The system casts the parameters to the correct type, allowing users to clearly identify their data types and how to set the values.
With memory mapped table types such as basic or splayed, the query function must include table as a parameter. This ensures correct routing to a single DAP for data requests and prevents duplicated results in the response.
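
The two argument styles described above can be sketched as follows. This is a minimal illustration in plain q, not a complete query function: the .exampleuda namespace, the function names, and the extra key are hypothetical, and each function only echoes its inputs so that the difference in calling convention is visible.

```q
// Style 1 (recommended above): separate parameters, at most 8.
.exampleuda.byList:{[table;startTS;endTS]
    // Return the inputs as a dictionary so the call can be inspected.
    `table`startTS`endTS!(table;startTS;endTS)
    }

// Style 2: a single dictionary, which must be named args.
.exampleuda.byDict:{[args]
    // Keep only the keys this function needs.
    `table`startTS`endTS#args
    }

// Hypothetical calls with the same values in both styles.
.exampleuda.byList[`trade;2024.01.01D;2024.01.02D]
.exampleuda.byDict `table`startTS`endTS`extra!(`trade;2024.01.01D;2024.01.02D;1b)
```

The dictionary form accepts any number of keys, which is why it is the only option once more than 8 parameters are needed.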

Recommendations:

Perform as much aggregation as possible in the DAPs to help reduce the amount of memory used and data transferred to the aggregators.
Utilize the helper function .kxi.selectTable to collect the required data from the specified table and time range.

Note

We recommend this method when querying the data rather than qSQL as this handles late data.
Wrap the response with the helper function .kxi.response.ok to indicate successful execution. This must include the data and any parameters that need to be passed to the aggregation function. For examples of the required response shape, refer to the generating a response header section.

Additional information:

If an aggregation function is not defined, the query function must return a table. This ensures the default aggregation operator, raze, can successfully combine the results from each of the DAPs, referred to as partials.
If you have between 1 and 8 parameters, adding each parameter separately, rather than using a dictionary, improves code quality.
Once the query function definition is complete, you can deploy it without the aggregation function if you wish. The raze operator is used until an aggregation function is defined.
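
To illustrate performing aggregation in the DAP and returning a table, here is a minimal sketch of the query-side logic only. It assumes a hypothetical in-memory table standing in for the rows one DAP holds. In a deployed UDA the rows would come from .kxi.selectTable and the result would be wrapped with .kxi.response.ok; both are left out so that the sketch runs in a plain q session.

```q
// Hypothetical stand-in for the rows that one DAP would return for the request.
trade:([] sym:`a`b`a; price:10 20 30f)

// Query-side logic: reduce the rows to one partial row per sym.
// Returning sums and counts (not averages) keeps the partials combinable.
.exampleuda.partial:{[t]
    // 0! removes the key so the result is a simple table that raze can join.
    0!select total:sum price, cnt:count i by sym from t
    }

.exampleuda.partial trade
```

Sending one row per sym instead of every raw row reduces the data transferred to the aggregator, and the unkeyed table shape means the default raze can still combine the partials.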

When building your own query functions you can test and iterate by referring to the testing details.

Refer to the Query function section of the quickstart for an example of how to build the query function.

Aggregation function

The aggregation function runs on the Aggregator and combines the partials, obtained by executing the query function on each DAP.

Important

You do not need to specify an aggregation function in your UDA if you only require the raze operator to combine the partials.

When developing an aggregation function you must ensure:

The aggregation function has exactly one argument: the list of results from the query function on each DAP.

Recommendations:

You can use the raze operator in the aggregation to combine the partials into one table of data, before doing further bespoke aggregation.
Wrap the response with the helper function .kxi.response.ok to indicate successful execution and return the results.
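
A minimal sketch of the raze-then-aggregate pattern described above, in plain q. The partials p1 and p2 and the function name are hypothetical, shaped as if two DAPs had each returned per-sym sums and counts. Wrapping the result with .kxi.response.ok is omitted so that the sketch runs outside kdb Insights.

```q
// Hypothetical partials from two DAPs: per-sym sums and counts.
p1:([] sym:`a`b; total:40 20f; cnt:2 1)
p2:([] sym:`a`c; total:5 7f; cnt:1 1)

// Aggregation logic: a single argument holding the list of partials.
.exampleuda.aggFn:{[partials]
    // raze joins the list of tables into one table.
    combined:raze partials;
    // Re-aggregate: summed totals over summed counts give the overall average.
    select avgPrice:sum[total]%sum cnt by sym from combined
    }

.exampleuda.aggFn (p1;p2)
```

Averaging the per-DAP averages would weight each DAP equally regardless of row count, which is why the sketch carries sums and counts through to the aggregator instead.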

Additional information:

You can use the qSQL API to test your aggregation function, as this allows you to execute the query function against all the DAPs and override the aggregation with your own code.
- Assign the query function and its execution to a string variable to be passed as the query parameter.
- You cannot use .kxi.response.ok in the query function passed to qSQL as it does not support response headers.
See this example outline:
```
// Define a string holding the query function definition and its execution
query: ".exampleuda.queryFn:{[params]

    // query code

    };
    .exampleuda.queryFn[`params-value]";

// Define the aggregation function
agg:{[partials]

    // agg code

    };

// Run a distributed qSQL API call on a specific database with the query and aggregation functions
.com_kx_edi.qsql[`query`agg!(query;agg)]
```
Note

If you have already deployed the query function, you don't need to define it in the query parameter.
If your aggregation function fails you can have the aggregator return the partial results of a failed aggregation back to the caller. Refer to Aggregation testing for more details.

Refer to the Aggregation function section of the quickstart for an example of how to build the aggregation function.

Registration

Each DAP and Aggregator that loads the UDA needs to register its function with the system to ensure the Service Gateway and Resource Coordinators know it is available for use.

Important

The registration details defined below need to be part of every file that includes a UDA query or aggregation function as the DAPs and the Aggregators both need to register their functions with the system.

Metadata

Define metadata for the UDA to describe its purpose, parameters, and return values. This is used by getMeta once the UDA is registered in the next step.

Metadata consists of:
- A description using .kxi.metaDescription.
- Return details using .kxi.metaReturn.
- Details of each parameter using .kxi.metaParam.
  - You can use the param.isReq argument of .kxi.metaParam to set whether the parameter is mandatory (1b) or optional (0b). For optional parameters you can set a default value by adding the default argument.
  - The param.type argument of .kxi.metaParam can accept multiple values. For example, you can set a parameter to a symbol or list of symbols, allowing one or more string values to be provided in a comma separated list.
To ensure the correct query routing, set distinguished parameters such as table, startTS, and endTS as mandatory fields in the metadata. This ensures the query goes only to the processes that have the UDA loaded.

Refer to Metadata builders for more details of the metadata definition.

Refer to the Metadata section of the quickstart for an example of how to build the metadata.
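
As a hedged sketch of how the metadata pieces above might fit together, the following builds metadata for a hypothetical UDA. It runs only in a process where the kdb Insights .kxi metadata builders are loaded. The isReq, type, and default arguments are those named above. The name and description keys, the type and description keys passed to .kxi.metaReturn, and joining the builders with a comma are assumptions that should be checked against the Metadata builders reference.

```q
// Hypothetical metadata for a UDA that averages price by sym.
// Type shorts: -11h symbol atom, 11h symbol list, -12h timestamp, 98h table.
metadata:.kxi.metaDescription["Average price by sym over a time range."],
    .kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table to query.")],
    .kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time.")],
    .kxi.metaParam[`name`type`isReq`description!(`endTS;-12h;1b;"End time.")],
    .kxi.metaParam[`name`type`isReq`default`description!(`syms;-11 11h;0b;`symbol$();"Sym or list of syms to include.")],
    .kxi.metaReturn[`type`description!(98h;"Average price per sym.")]
```

Here table, startTS, and endTS are mandatory, while syms is optional, accepts either a single symbol or a list, and defaults to an empty symbol list.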

Registration function

The .kxi.registerUDA function registers the UDA and provides the metadata, so that the UDA is included in any calls to getMeta and can be queried through the Service Gateway. It also tells the Resource Coordinator which DAPs and Aggregators have the UDA loaded, so that queries to the UDA are routed only to the DAPs and Aggregators that have it registered.

.kxi.registerUDA `name`query`aggregation`metadata!(`.namespace.udaname;`.namespace.queryname;`.namespace.aggname;metadata);

If your UDA does not include an aggregation function or the aggregation function is in another file, the aggregation parameter can be omitted as follows:
```
.kxi.registerUDA `name`query`metadata!(`.namespace.udaname;`.namespace.queryname;metadata);
```

Refer to UDA registration for more details on the .kxi.registerUDA function.

Refer to the Registration function section of the quickstart for an example of how to build the registration function.

3. Adding to a package

Once the UDA is defined, it must be added to a package to facilitate deployment. Additionally, you must set configuration settings, such as environment variables, to ensure the UDA is loaded into the appropriate processes.

Considerations when adding a UDA definition to a package:

How do I add the UDA to my package?

You can include the whole UDA definition in one file, in which case the DAPs and Aggregators load all the code. However, we recommend you separate the query and aggregation functions into two files. This enables the DAPs and Aggregators to load only the code that is relevant to them. For example, the aggregator is unlikely to need the query function.

To register a UDA that is split across two files, each file must include the metadata and the register function. This ensures both processes register their functions with the corresponding Resource Coordinator and Service Gateway for use by the system.

Package Entrypoints

Once the UDA definition is included in a package, entrypoints must be added to ensure that when loading the package the files containing the UDA are loaded into the appropriate processes.

This is done by adding files to the data-access and aggregator entrypoints in the package containing the UDA.

The kxi package add command is used to add entrypoints. You can include multiple files by setting the path to a comma separated list of files.

The following commands add one file to the aggregator entrypoint and two files to the data-access entrypoint:

kxi package add --to uda-package entrypoint --name aggregator --path agg-file1.q
kxi package add --to uda-package entrypoint --name data-access --path dap-file1.q,dap-file2.q

The result of calling these commands is that the manifest.yaml file is updated as follows:

entrypoints:
    default: init.q
    data-access:
        - dap-file1.q
        - dap-file2.q
    aggregator: agg-file1.q

When a package is created, the default entrypoint references init.q. If that file exists, it is loaded into any process that does not have an entrypoint explicitly defined in the manifest.yaml file.

Refer to the Entrypoints function section of the quickstart for an example of how to define the entrypoints.

Package versioning

It is good practice to increase the version number after making updates to a package.

kxi package checkpoint insights-demo --bump patch

4. Test deploy

We recommend deploying the UDA to a staging environment to confirm it performs as expected before deploying to production systems.

Ensure the package is loaded

To complete the deployment of the UDA, you must load it into the processes that utilize it. This is done by setting environment variables.

Set the necessary environment variables for each component to locate and load the package:

Each component loading custom code from a package must have the KXI_PACKAGES and KX_PACKAGE_PATH environment variables set.

env:
  - name: KXI_PACKAGES
    value: "uda-package"
  - name: KX_PACKAGE_PATH
    value: "/opt/kx/packages"

Mount the package

Mount the package as a volume to the folder specified in KX_PACKAGE_PATH.

Docker

Use -v to supply a volume:

docker run -e KX_PACKAGE_PATH=/opt/kx/packages \
  -e KXI_PACKAGES="uda-package" \
  -v /path/to/package:/opt/kx/packages \
  dap

Docker Compose

Set volumes and environment:

services:
  rdb:
    image: dap
    command: -p 5000
    environment:
      - KXI_PACKAGES=uda-package:1.0.0
      - KX_PACKAGE_PATH=/opt/kx/packages/
    volumes:
      - /path/to/package:/opt/kx/packages

Kubernetes

Mount a volume under the container:

hostPath is used as an example. This may be a persistent volume of any type.

spec:
  spec:
    containers:
      - name: dap
        image: dap
        env:
          - name: KXI_PACKAGES
            value: "uda-package:1.0.0"
          - name: KX_PACKAGE_PATH
            value: "/opt/kx/packages"
        volumeMounts:
        - mountPath: /opt/kx/packages
          name: uda-package-mount
    volumes:
      - name: uda-package-mount
        hostPath:
          path: /opt/kx/packages

Hot reload

You can load a UDA without restarting core components like Data Access Processes (DAPs) and Aggregators (AGGs). This is referred to as a Hot Reload and allows for faster development and testing.

To load a UDA, follow the instructions below:

Expose the DAPs and Aggregator ports in compose.yaml. For example:
```
  kxi-da:
    ports:
      - 5081:5081
      - 5082:5082
      - 5083:5083
  kxi-agg:
    ports:
      - 5060:5060
```
Ensure these ports are exposed to the client through port-forwarding before running the curl commands below.

Call the package load endpoint on each DAP and Aggregator. For example:

export PKG=uda-package
export DAP=data-access
export AGG=aggregator
curl -X POST "http://localhost:5081/packages/post/load?package=$PKG&entry=$DAP"
curl -X POST "http://localhost:5082/packages/post/load?package=$PKG&entry=$DAP"
curl -X POST "http://localhost:5083/packages/post/load?package=$PKG&entry=$DAP"
curl -X POST "http://localhost:5060/packages/post/load?package=$PKG&entry=$AGG"

Call getMeta to confirm the UDA is available

curl -X POST --header "Content-Type: application/json"\
    --header "Accepted: application/struct-text"\
    "https://${INSIGHTS_HOSTNAME}/servicegateway/kxi/getMeta"

5. Query

Once the UDA is deployed to your staging environment, query it with different parameters to confirm that it behaves as expected.

Choose appropriate values for the parameters based on the data in your database. For example:

parameter1="a"
parameter2="b"
curl -X POST --header "Content-Type: application/json"\
    --header "Accepted: application/struct-text"\
    --data "{\"parameter1\": \"$parameter1\", \"parameter2\": \"$parameter2\"}"\
    "https://${INSIGHTS_HOSTNAME}/servicegateway/namespace/uda"

Refer to the example UDAs documentation for more examples of queries.

6. Deploy to Production

Once you are happy with the UDA you can deploy the updated package to production. Ensure you follow the steps in the Test deploy section to deploy the UDA to the production system.

Refer to the Authentication documentation for instructions on granting users permissions to query the database using UDAs.

If entitlements are enforced refer to the Entitlements documentation for instructions on giving users entitlements to query the insights-demo data.

Next steps

Follow the Quickstart guide for a worked example of a User Defined Analytic.
Refer to UDA examples for more examples of User Defined Analytics.
For more information on packaging and deploying UDA packages in kdb Insights, refer to the kdb Insights Package deployment documentation.