UDA Configuration
This page explains how to create and configure a user-defined analytic (UDA) that users can query.
Refer to the Overview for details on why you would create a UDA and the Best practices for coding guidelines.
A UDA consists of the following:
- Query function - reads data from a Data Access Process (DAP) and performs transformations on the data.
- Aggregation function - combines the results, known as partials, from executing the query function on each DAP to produce the final result, which is then returned to the user.
- Metadata - describes the UDA's purpose, parameters, and return values so that users can retrieve information about the UDA using getMeta.
- Registration function - pulls together the query function, aggregation function, and metadata to allow the system to identify the UDA.
- Package entrypoints and the KXI_PACKAGES environment variable - determine whether a particular process loads the UDA on startup.
Refer to the testing page for further details on testing and debugging your code.
The remainder of this page outlines best practices for developing and deploying a UDA.
1. Prerequisites
The following prerequisites must be met before creating a UDA:
- You have access to a kdb Insights instance.
- The kxi CLI is installed and configured on your system.
- You are familiar with how to create and edit packages using the kdb Insights CLI. A solid understanding of packaging is essential for understanding how UDAs are packaged and deployed. Refer to the Packaging Overview documentation for more information.
2. Develop
When developing a UDA you must ensure:
- Your code is modular, efficient, and well-documented, aligning with the Best practices. Validate it with different datasets to ensure it performs as expected in various scenarios, using the test results to refine and optimize the UDA for performance and accuracy.
- The database you wish to query is accessible.
- You can connect to a DAP process to develop your code. You can use the kdb VS Code Extension as follows:
  - Add a Connection to a DAP process on your kdb Insights deployment:
    - Select the CONNECTIONS panel and click Add Connection or the +.
      Note
      To add a connection you need a folder open in VS Code.
    - Add a My q connection.
  - Right click on the new connection in the CONNECTIONS panel and select Connect server.
  - Create a new file of type Source file or KX Notebook, then select the appropriate connection from the Connection dropdown.
- To improve code separation and ensure each process loads only the functions it needs, you can store the query and aggregation functions in different files.
  Important
  You must ensure:
  - The metadata and the registerAPI function are added to both files, as they ensure the process registers its function for use by the system.
  - The data-access and aggregator entrypoints in the package manifest file reference the appropriate file.
- When querying a UDA, the Resource Coordinator routes only to DAPs and Aggregators that have the UDA registered.
- Use distinguished parameters such as table, startTS, and endTS to route to the appropriate partitions. You can set these as mandatory fields in the metadata to enforce them when querying.
Recommendations:
- Comment your code, especially any function definitions, ensuring you describe:
  - Its purpose
  - Expected values and forms of arguments
  - Any return value(s)
- Keeping your code files in one or more subfolders of the package simplifies the package.
- Define the UDA and its functions in a namespace. This keeps the code clean and allows functions to be shared within the namespace.
Additional information:
- You can use the same query or aggregation function in multiple UDAs.
- You can load other files with the kxi.packages commands:
  .kxi.packages.load["nested-package"]
  //Change KX_PACKAGE_HOME for relative file load
  .kxi.packages.file.load["nested.q"]
Refer to the testing page for further details on testing and debugging your code.
Refer to the Quickstart and the UDA examples for examples of UDAs.
Query function
The query function reads the raw data from a single DAP and transforms it into a set of results. The results from each DAP targeted by the query, known as partials, are passed to the aggregator to be combined and returned to the user.
When developing a query function you must ensure:
- The function arguments are a list of values or a dictionary of keys named args. We recommend you provide a list of values as this allows users to interrogate the parameter details.
  - If you do use a dictionary of keys, you must name it args. Any other name causes the DAP to treat the function as a one-parameter function, so REST parameters are not auto-cast.
  - If you need more than 8 arguments, you must use a dictionary when defining the function.
    Note
    There is no such limitation for the metadata, therefore we recommend that each parameter in the dictionary is defined in the metadata. The system casts the parameters to the correct type, allowing users to clearly identify their data types and how to set the values.
- With memory mapped table types such as basic or splayed, the query function must include table as a parameter. This ensures correct routing to a single DAP for data requests and prevents duplicated results in the response.
Recommendations:
- Perform as much aggregation as possible in the DAPs to help reduce the amount of memory used and data transferred to the aggregators.
- Utilize the helper function .kxi.selectTable to collect the required data from the specified table and time range.
  Note
  We recommend this method when querying the data rather than qSQL as this handles late data.
- Wrap the response with the helper function .kxi.response.ok to indicate successful execution. This must include the data and any parameters that need to be passed to the aggregation function. For examples of the required response shape, refer to the generating a response header section.
Additional information:
- If an aggregation function is not defined, the query function must return a table. This ensures the default query operator, raze, can successfully combine the results, referred to as partials, from each of the DAPs.
- If you have between 1 and 8 parameters, adding each parameter separately, rather than using a dictionary, improves code quality.
- Once the query function definition is complete you can deploy it without the aggregation function if you wish. The raze operator is used until an aggregation function is defined.
When building your own query functions you can test and iterate by referring to the testing details.
Refer to the Query function section of the quickstart for an example of how to build the query function.
Aggregation function
The aggregation function runs on the Aggregator and combines the partials, obtained by executing the query function on each DAP.
Important
You do not need to specify an aggregation function in your UDA if you only require the raze operator to combine the partials.
When developing an aggregation function you must ensure:
- The aggregation function has a single argument, which is a list of the results from the query function on each DAP.
Recommendations:
- You can use the raze operator in the aggregation to combine the partials into one table of data, before doing further bespoke aggregation.
- Wrap the response with the helper function .kxi.response.ok to indicate successful execution and return the results.
Additional information:
- You can use the qSQL API to test your aggregation function as this allows you to execute the query function against all the DAPs and override the aggregation with your own code.
  - Assign the query function and its execution to a string variable to be passed as the query parameter.
  - You cannot use .kxi.response.ok in the query function passed to qSQL as it does not support response headers.
  See this example outline:
  ///Define a string and set it to the query function and its execution
  query: ".exampleuda.queryFn:{[params]
      ///query code
      }; .exampleuda.queryFn[`params-value]";
  ///Define the aggregation function
  agg:{[partials]
      ///agg code
      };
  // run a distributed qSQL API call on a specific database with the DAP and AGG function
  .com_kx_edi.qsql[`query`agg!(query;agg)]
  Note
  If you have already deployed the query function you don't need to define it in the query parameter.
- If your aggregation function fails, you can have the aggregator return the partial results of the failed aggregation back to the caller. Refer to Aggregation testing for more details.
Refer to the Aggregation function section of the quickstart for an example of how to build the aggregation function.
Registration
Each DAP and Aggregator that loads the UDA needs to register its function with the system to ensure the Service Gateway and Resource Coordinators know it is available for use.
Important
The registration details defined below need to be part of every file that includes a UDA query or aggregation function as the DAPs and the Aggregators both need to register their functions with the system.
Metadata
Define metadata for the UDA to describe its purpose, parameters, and return values. This is used by getMeta once the UDA is registered in the next step.
- Metadata consists of:
  - A description using .kxi.metaDescription.
  - Return details using .kxi.metaReturn.
  - Details of each parameter using .kxi.metaParam.
    - You can use the param.isReq argument of .kxi.metaParam to choose whether the parameter is mandatory (1b) or optional (0b). For optional parameters you can set a default value by adding the default argument.
    - The param.type argument of .kxi.metaParam can accept multiple values. For example, you can set a parameter to a symbol or list of symbols, allowing one or more string values to be provided in a comma separated list.
- To ensure the correct query routing, set distinguished parameters such as table, startTS, and endTS as mandatory fields in the metadata. This ensures the query goes only to the processes that have the UDA loaded.
Refer to Metadata builders for more details of the metadata definition.
Refer to the Metadata section of the quickstart for an example of how to build the metadata.
Registration function
The .kxi.registerUDA function registers the UDA and provides the metadata, so that the UDA is included in any calls to getMeta and can be queried through the Service Gateway. It also tells the Resource Coordinator which DAPs and Aggregators have the UDA loaded, so that a query for the UDA is routed only to processes that have it registered.
.kxi.registerUDA `name`query`aggregation`metadata!(`.namespace.udaname;`.namespace.queryname;`.namespace.aggname;metadata);
- If your UDA does not include an aggregation function, or the aggregation function is in another file, the aggregation parameter can be omitted as follows:
.kxi.registerUDA `name`query`metadata!(`.namespace.udaname;`.namespace.queryname;metadata);
Refer to UDA registration for more details on the .kxi.registerUDA function.
Refer to the Registration function section of the quickstart for an example of how to build the registration function.
Adding to a package
Once the UDA is defined, it must be added to a package to facilitate deployment. Additionally, you must set configuration settings, such as environment variables, to ensure the UDA is loaded into the appropriate processes.
Considerations when adding a UDA definition to a package:
- How do I add the UDA to my package?
  You can include the whole UDA definition in one file and the DAPs and Aggregators will load all the code. However, we recommend you separate the query and aggregation functions into two files. This enables the DAPs and Aggregators to load only the code that is relevant to them. For example, the aggregator is unlikely to need the query function.
To register a UDA that is split across two files, each file must include the metadata and the register function. This ensures both processes register their functions with the corresponding Resource Coordinator and Service Gateway for use by the system.
Package Entrypoints
Once the UDA definition is included in a package, entrypoints must be added to ensure that when loading the package the files containing the UDA are loaded into the appropriate processes.
This is done by adding files to the data-access and aggregator entrypoints in the package containing the UDA.
The kxi package add command is used to add entrypoints. You can include multiple files by setting the path to a comma separated list of files.
The following code adds two entrypoints to the DAPs and one to the Aggregators:
kxi package add --to uda-package entrypoint --name aggregator --path agg-file1.q
kxi package add --to uda-package entrypoint --name data-access --path dap-file1.q,dap-file2.q
The result of calling these commands is that the manifest.yaml file is updated as follows:
entrypoints:
  default: init.q
  data-access:
    - dap-file1.q
    - dap-file2.q
  aggregator: agg-file1.q
- When a package is created, the default entrypoint includes a reference to init.q. If the file defined under the default entrypoint exists, it is loaded into all processes that do not have an entrypoint explicitly defined in the manifest.yaml file.
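The comma separated --path value used by the commands above can be assembled from a list of files. The following is a minimal sketch, not part of the kxi CLI: the helper names join_paths and entrypoint_cmd are hypothetical, and the script only prints the kxi package add commands shown earlier rather than running them.

```shell
#!/usr/bin/env bash
# Sketch: print the entrypoint commands for a package (dry run only).
PACKAGE=uda-package   # package name taken from the example above

# Join the arguments with commas, as expected by --path.
join_paths() {  # usage: join_paths FILE...
  ( IFS=,; printf '%s' "$*" )
}

# Print the kxi command that adds FILE... to the named entrypoint.
entrypoint_cmd() {  # usage: entrypoint_cmd NAME FILE...
  name="$1"; shift
  echo "kxi package add --to $PACKAGE entrypoint --name $name --path $(join_paths "$@")"
}

entrypoint_cmd aggregator agg-file1.q
entrypoint_cmd data-access dap-file1.q dap-file2.q
```

Printing the commands first makes it easy to review them before they change manifest.yaml. Once they look right, they can be run by hand or piped to a shell.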
Refer to the Entrypoints function section of the quickstart for an example of how to define the entrypoints.
Package versioning
It is good practice to increase the version number after making updates to a package.
kxi package checkpoint insights-demo --bump patch
4. Test deploy
We recommend deploying the UDA to a staging environment to confirm it performs as expected before deploying to production systems.
Ensure the package is loaded
To complete the deployment of the UDA, you must load it into the processes that utilize it. This is done by setting environment variables.
Set the necessary environment variables for each component to locate and load the package:
Each component loading custom code from a package must have the KXI_PACKAGES and KX_PACKAGE_PATH environment variables set.
env:
  - name: KXI_PACKAGES
    value: "uda-package"
  - name: KX_PACKAGE_PATH
    value: "/opt/kx/packages"
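Before starting a process, it can help to confirm that both variables are set and that the package path exists. The sketch below is illustrative only: the helper names pkg_name, pkg_version, and check_env are hypothetical, and it assumes KXI_PACKAGES holds a single reference of the form name or name:version, as in the examples on this page.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the package environment variables.
# Assumption: KXI_PACKAGES holds one reference, either name or name:version.

pkg_name() {     # usage: pkg_name REF -> text before the first ':'
  printf '%s' "${1%%:*}"
}

pkg_version() {  # usage: pkg_version REF -> text after the first ':', or nothing
  case "$1" in
    *:*) printf '%s' "${1#*:}" ;;
    *)   printf '' ;;
  esac
}

check_env() {
  if [ -z "${KXI_PACKAGES:-}" ]; then echo "KXI_PACKAGES is not set"; return 1; fi
  if [ -z "${KX_PACKAGE_PATH:-}" ]; then echo "KX_PACKAGE_PATH is not set"; return 1; fi
  if [ ! -d "$KX_PACKAGE_PATH" ]; then
    echo "KX_PACKAGE_PATH is not a directory: $KX_PACKAGE_PATH"; return 1
  fi
  version="$(pkg_version "$KXI_PACKAGES")"
  echo "ok: package=$(pkg_name "$KXI_PACKAGES") version=${version:-unspecified}"
}

check_env || echo "environment is not ready for the package"
```

Run inside the container, this reports the package name and version the process is configured to load. It does not check that the package itself is present under KX_PACKAGE_PATH, because the on-disk layout is not described here.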
Mount the package
Mount the package as a volume to the folder specified in KX_PACKAGE_PATH.
Use -v to supply a volume:
docker run -e KX_PACKAGE_PATH=/opt/kx/packages \
    -e KXI_PACKAGES="uda-package" \
    -v /path/to/package:/opt/kx/packages
Set volumes and environment:
services:
  rdb:
    image: dap
    command: -p 5000
    environment:
      - KXI_PACKAGES=uda-package:1.0.0
      - KX_PACKAGE_PATH=/opt/kx/packages/
    volumes:
      - /path/to/package:/opt/kx/packages
Mount a volume under the container:
hostPath is used as an example. This may be a persistent volume of any type.
spec:
  spec:
    containers:
      - name: dap
        image: dap
        env:
          - name: KXI_PACKAGES
            value: "uda-package:1.0.0"
          - name: KX_PACKAGE_PATH
            value: "/opt/kx/packages"
        volumeMounts:
          - mountPath: /opt/kx/packages
            name: uda-package-mount
    volumes:
      - name: uda-package-mount
        hostPath:
          path: /opt/kx/packages
Hot reload
You can load a UDA without restarting core components like Data Access Processes (DAPs) and Aggregators (AGGs). This is referred to as a Hot Reload and allows for faster development and testing.
To load a UDA, follow the instructions below:
- Expose the DAP and Aggregator ports in compose.yaml. For example:
  kxi-da:
    ports:
      - 5081:5081
      - 5082:5082
      - 5083:5083
  kxi-agg:
    ports:
      - 5060:5060
  Ensure these ports are exposed to the client through port-forwarding before running the curl commands.
- Define the curl endpoints. For example:
  export PKG=uda-package
  export DAP=data-access
  export AGG=aggregator
  curl -X POST "http://localhost:5081/packages/post/load?package=$PKG&entry=$DAP"
  curl -X POST "http://localhost:5082/packages/post/load?package=$PKG&entry=$DAP"
  curl -X POST "http://localhost:5083/packages/post/load?package=$PKG&entry=$DAP"
  curl -X POST "http://localhost:5060/packages/post/load?package=$PKG&entry=$AGG"
- Call getMeta to confirm the UDA is available:
  curl -X POST --header "Content-Type: application/json" \
      --header "Accepted: application/struct-text" \
      "https://${INSIGHTS_HOSTNAME}/servicegateway/kxi/getMeta"
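When there are several DAPs, the per-port load calls above can be driven from a list of ports. The following is a sketch under stated assumptions: the ports and package name are taken from the example above, the helper names load_url and reload_all are hypothetical, and DRY_RUN is an illustrative switch (not a kdb Insights setting) that defaults to printing the commands instead of sending them.

```shell
#!/usr/bin/env bash
# Sketch: issue the hot-reload request to each DAP and Aggregator port.
PKG=uda-package
DAP_PORTS="5081 5082 5083"   # DAP ports from the compose.yaml example
AGG_PORTS="5060"             # Aggregator port from the compose.yaml example
DRY_RUN="${DRY_RUN:-1}"      # hypothetical switch: 1 prints commands, 0 runs them

# Build the load URL for one process.
load_url() {  # usage: load_url PORT ENTRYPOINT
  printf 'http://localhost:%s/packages/post/load?package=%s&entry=%s' "$1" "$PKG" "$2"
}

# Print or send the POST for every port given after the entrypoint name.
reload_all() {  # usage: reload_all ENTRYPOINT PORT...
  entry="$1"; shift
  for port in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "curl -X POST \"$(load_url "$port" "$entry")\""
    else
      curl --fail -X POST "$(load_url "$port" "$entry")" || return 1
    fi
  done
}

# The port lists are left unquoted on purpose so they split into words.
reload_all data-access $DAP_PORTS
reload_all aggregator $AGG_PORTS
```

Setting DRY_RUN=0 sends the requests. Stopping at the first failed call avoids leaving the reader unsure which processes loaded the package.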
5. Query
Once deployed to your staging environment you can query the UDA with different parameters to confirm that it behaves as expected.
Once the UDA is ready, you can query it as follows:
- Choose the appropriate values for the parameters based on the data in your database:
  parameter1="a"
  parameter2="b"
  curl -X POST --header "Content-Type: application/json" \
      --header "Accepted: application/struct-text" \
      --data "{\"parameter1\": \"$parameter1\", \"parameter2\": \"$parameter2\"}" \
      "https://${INSIGHTS_HOSTNAME}/servicegateway/namespace/uda"
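The escaped quotes in the --data argument above are easy to get wrong when parameters are added. One way to keep the request body readable is to build it separately, as in this minimal sketch. The helper name build_payload is hypothetical, and the parameter names are the placeholders from the example above. Values are inserted verbatim, so this only suits values that contain no double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: build the JSON request body for a UDA query.
# Values are not JSON-escaped; avoid quotes and backslashes in them.
build_payload() {  # usage: build_payload VALUE1 VALUE2
  printf '{"parameter1": "%s", "parameter2": "%s"}' "$1" "$2"
}

payload="$(build_payload "a" "b")"
echo "$payload"

# Uncomment to send the request. INSIGHTS_HOSTNAME must be set and the
# final path segment must match your UDA name.
# curl -X POST --header "Content-Type: application/json" \
#     --header "Accepted: application/struct-text" \
#     --data "$payload" \
#     "https://${INSIGHTS_HOSTNAME}/servicegateway/namespace/uda"
```

Echoing the payload before sending it makes it simple to confirm the body is valid JSON when testing different parameter values.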
Refer to the example UDAs documentation for more examples of queries.
6. Deploy to Production
Once you are happy with the UDA you can deploy the updated package to production. Ensure you follow the steps in the Test deploy section to deploy the UDA to the production system.
Refer to the Authentication documentation for instructions on granting users permissions to query the database using UDAs.
If entitlements are enforced refer to the Entitlements documentation for instructions on giving users entitlements to query the insights-demo data.
Next steps
- Follow the Quickstart guide for a worked example of a User Defined Analytic.
- Refer to UDA examples for more examples of User Defined Analytics.
- For more information on packaging and deploying UDA packages in kdb Insights, refer to the kdb Insights Package deployment documentation.