UDA Creation Quickstart
This page provides a quickstart guide to creating a UDA that counts the number of records in a table within a specified timestamp range for each unique combination of values across a given set of columns. To help validate the UDA, you can replicate the query using the getData API.
The UDA creation process consists of the following steps:
-
Prerequisites: Ensure you meet all the necessary prerequisites.
-
Develop the UDA: Develop the UDA ensuring your code is modular, efficient, and well-documented. Validate it with different datasets to ensure it performs as expected in various scenarios, using the test results to refine and optimize the UDA for performance and accuracy.
-
Add the UDA to a package: Arrange all necessary files, including the UDA code, dependencies, metadata, and registration in a structured manner in a package and specify required configuration settings, such as environment variables, to ensure the UDA functions correctly after deployment.
-
Test on a staging environment: Deploy the UDA in a staging environment to test it integrates well with other components.
-
Deploy to Production: Deploy the package to your production environment.
Refer to the UDA Configuration guide for more details.
By the end of this quickstart the UDA described here will be queryable on your database
The UDA counts the number of records in a table within a timestamp for each unique combination of values in a specified set of columns.
// Query function.
// Custom count by UDA.
// @desc Counts number of entries by specified columns.
// @param table {symbol} Table name.
// @param startTS {timestamp} Start time (inclusive).
// @param endTS {timestamp} End time (exclusive).
// @param byCols {symbol|symbol[]} Column(s) to count by.
// @return {dict|table} The `byCols` dictionary and table payload.
.exampleuda.countByQuery:{[table;startTS;endTS;byCols]
bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table
agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records
res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg)); / Select data
.kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation
}
// Aggregation function.
// @desc Takes partial results and sums the counts by the groupby dictionary returned from DAPs.
// @param res {list[][]} Partial result from all DAPs. This includes `byCols` and payload.
.exampleuda.countByAgg:{[res]
bc:first first res; / Get the groupBy columns
t:raze last each res; / Get table data from each DAP and convert to one table
res:?[t;();bc;enlist[`cnt]!enl(sum;`x)] / Functional form of 'select sum x by {byCols} from t'
.kxi.response.ok res / Return result with success codes
}
// Metadata.
metadata:.kxi.metaDescription["Custom UDA - does a count by."],
.kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table name.")],
.kxi.metaParam[`name`type`isReq`description!(`byCols;11 -11h;1b;"Column(s) to count by.")],
.kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time (inclusive).")],
.kxi.metaParam[`name`type`isReq`description!(`endTS;-12h;1b;"End time (exclusive).")],
.kxi.metaReturn`type`description!(98h;"Count by specified columns.");
// Registration.
.kxi.registerUDA `name`query`aggregation`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;`.exampleuda.countByAgg;metadata);
1. Prerequisites
Before trying to create this UDA ensure the following prerequisites are met:
-
You have access to a kdb Insights Enterprise instance.
-
The
kxiCLI is installed and configured on your system. -
You are familiar with how packaging works in kdb Insights Enterprise. A solid understanding of packaging is essential for understanding how UDAs are packaged and deployed. Refer to the Packaging Overview documentation for more information.
2. Develop
In this section, you define the UDA code, test it using the subway table with the direction_id and route_long_name columns, and run it against the current date.
-
Ensure a database is deployed.
-
Open a code editor that can access a specific DAP process while testing your UDA:
-
Ensure a connection to your DAP process is connected. Refer to My q Connections
-
Create a new q file
-
Select the appropriate connection from the Connection dropdown
-
Set the Target to a DAP process that is expected to contain data
-
-
Define the query function:
-
Start by reading the raw data from a single DAP and aggregating it into the count of rows per unique combination of selected columns:
byCols: `column1`column2; / set the columns to groupby bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records .kxi.selectTable`table`startTS`endTS`groupBy`agg!(`table1;.z.D;.z.P;bc;agg) / Select data from the subway table with today's date- The code utilizes the helper function
.kxi.selectTableto collect the required data from the specified table and time range. We recommend this method when querying the data rather thanqSQLas this handles late data and supports Row-level Entitlements.
- The code utilizes the helper function
-
Execute the query
-
Check the code returns the correct number of rows per unique combination of values in the selected columns and table.
-
Test the code with different columns.
-
-
Convert the query code into a function and call the function to test that it returns the same data as in the previous step:
.exampleuda.countByQuery:{[table;startTS;endTS;byCols] bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); / Select data .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation }; //execute the query function .exampleuda.countByQuery[`table1;.z.D;.z.P;`column1`column2]Points of interest
-
The previously hard coded values have been added as parameters in the function call.
-
The response line utilizes
.kxi.response.okhas been addedwhich includes the data and any parameters that need to be passed to the aggregation function.
-
Execute the query.
This returns the data in the standard API response format including a header and results. This is helpful if there are errors as it provides the response codes.
-
Once you have confirmed there are not errors you can update the last line temporarily to only display the data, as follows:
.exampleuda.countByQuery[`subway;.z.D;.z.P;`direction_id`route_long_name][1][1]
-
When building your own query functions refer to the testing page for further details on testing and debugging your code.
-
Define the aggregation:
You can use the
qSQLAPI against the Scratchpad to test your aggregation function as this allows you to execute the query function against all the DAPs and override the aggregation with your own code.-
Switch to a terminal to execute a REST request:
-
Type the following code into the console:
START=$(date "+%Y.%m.%dD00:00:00.000000000") END=$(date "+%Y.%m.%dD23:59:59.999999999") QUERY=' .exampleuda.countByQuery:{[table;startTS;endTS;byCols] bc:bc!bc:(),byCols; agg:enlist[`x]!enlist(count;`i); res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); (byCols;res) }; .exampleuda.countByQuery[`subway;.z.D-1;.z.P;`direction_id`route_long_name] ' AGG=' {[partials] bc:first first partials; t:raze last each partials; ?[t;();bc!bc;enlist[`cnt]!enlist(sum;`x)] } ' DATA=`curl -s -X POST "${INSIGHTS_HOSTNAME}/servicegateway/kxi/qsql" \ -H "Content-Type: application/json" \ -H "Accept: application/json" \ -d "$(jq --arg QUERY "$QUERY" --arg AGG "$AGG" -n \ '{ query : $QUERY, agg : $AGG, }' | jq -cr .)"` echo $DATA | jq [.payload]The the query function defined in the previous step has been modified as follows:
-
The query function and its execution have been assigned to a string variable to be passed to the
qSQLcall -
The
.kxi.response.okline has been commented out asqSQLdoes not support response headers.
Aggregation details
-
The group by column information is extracted from the first
partial -
The
razeoperator is used to combine thepartialsinto one table of data, before the group by is applied again to aggregate thepartialsinto one set of results. -
The last line sums the counts from each partial together
-
There is a single parameter defined for the aggregation function which is a set of
partialsreturned from the query function
-
-
Expand the time range, if necessary, to ensure more than one tier is targeted with the query function
-
Click Run.
-
Check the code returns the correct data.
Info
In this quickstart we are testing both the query and aggregation functions before we deploy the package. When developing your aggregation function, if you prefer, you can deploy the query function and then execute it directly, rather than passing the function definition into the
qSQLcall.-
Check this code works as expected for different combinations of columns in different tables before proceeding.
To assist with validation of this quickstart there is a
getDataequivalent to this UDACreate a new q file and run this q code:
args:`table`startTS`endTS`groupBy`agg!(`table`;.z.P-10000D;.z.P;enlist `column1,`column2;enlist `cnt`count`column1) .com_kx_edi.uda[args;"kxi/getData"]
-
-
Convert the aggregation code into a function:
///Define the aggregation .exampleuda.countByAgg:{[partials] bc:first first partials; / Get the groupBy columns t:raze last each partials; / Get table data from each DAP and convert to one table ?[t;();bc!bc;enlist[`cnt]!enlist(sum;`x)]; / Functional form of parse'select sum x by {byCols} from t' .kxi.response.ok res / Return result with success codes };Points of interest
-
Added a namespace to the function name
-
Added the response line which utilizes
.kxi.response.okreturning a table of data.
-
If you wish to add log entries to your UDA refer to Adding logging to UDAs for more information.
3. Add to a package
Now that you have defined the functions, you need to add them to the package along with all the necessary information to register the UDA and ensure it can be queried.
Note
This quickstart adds the UDA directly to the insights-demo package. If you wish your UDA to be stored in a code-only package, refer to Loading code-only packages into a database for more details.
-
Pull the
insights-demofrom kdb Insights Enterprise to ensure it is accessible to your kdb Insights CLI.Pull
insights-demousing thekxi pm pullcommand:kxi pm pull insights-demoExport
insights-demolocally:-
Export the package as a
.kxifile. -
Use the
kxi package unpackcommand to extract the.kxifile into a package folder structure.kxi package unpack insights-demo-1.0.0.kxi
The code above assumes the package version is
1.0.0The package folder structure looks like this:
└──insights-demo ├── databases │ └── insights-demo │ ├── shards │ │ └── insights-demo-shard.yaml │ └── tables ├── init.q ├── manifest.yaml -
-
Create a new folder
srcin the package folder.
Add the query function
-
Add a
da.qfile in thesrcfolder to hold the query function definition.The package folder structure now looks like this:
└──insights-demo ├── databases │ └── insights-demo │ ├── shards │ │ └── insights-demo-shard.yaml │ └── tables ├── init.q ├── manifest.yaml └── src └── da.q -
Add the following code to the
da.qfile:// @desc Counts number of entries by specified columns. // @param table {symbol} Table name. // @param startTS {timestamp} Start time (inclusive). // @param endTS {timestamp} End time (exclusive). // @param byCols {symbol|symbol[]} Column(s) to count by. // @return {dict|table} The `byCols` dictionary and table payload. .exampleuda.countByQuery:{[table;startTS;endTS;byCols] bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); / Select data .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation }
Metadata
Define metadata for the UDA to describe its purpose, parameters, and return values. This is used by getMeta once the UDA is registered in the next step.
Add the metadata to the da.q file:
// Define metadata.
metadata:.kxi.metaDescription["Custom UDA - does a count by."],
.kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table name.")],
.kxi.metaParam[`name`type`isReq`description!(`byCols;11 -11h;1b;"Column(s) to count by.")],
.kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time (inclusive).")],
.kxi.metaParam[`name`type`isReq`description`default!(`endTS;-12h;0b;"End time (exclusive).";.z.P)],
.kxi.metaReturn`type`description!(98h;"Count by specified columns.");
-
Each parameter is defined using
.kxi.metaParam. Adding each parameter to the metadata is preferable to using a dictionary as it ensures those querying the UDA can identify the parameters required. -
endTShas been set as an optional parameter and defaulted to the current time. This is done by settingparam.isReqto0band adding thedefaultargument which is set to the current timestamp. -
The data returned from this UDA is a table therefore the return type is set to
98h -
In this example
byColscan be a symbol or list of symbols, showing thattypecan accept multiple values. -
The Resource Coordinator routes only to DAPs and Aggregators that have the UDA registered.
Note
When dedicator aggregators are used, you no longer need to add
scopeas a mandatory field and set theassemblykey to the package name. -
To ensure the routing to the appropriate partitions
tableandstartTSdistinguished parameters have been set as mandatory fields in the metadata.
Refer to Metadata builders for more details.
Registration function
Register the UDA with the appropriate names for query and aggregation functions, and provide the metadata to ensure the UDA is included in any calls to getMeta and can be queried through the Service Gateway.
Add the registration to the da.q file:
.kxi.registerUDA `name`query`aggregation`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;`.exampleuda.countByAgg;metadata);
-
This registration is done using the
.kxi.registerUDAfunction. -
The query and aggregation functions along with the UDA itself have been defined as part of a namespace as this is best practice.
-
If your UDA does not include an aggregation function and you chose to use the
razeoperator, the aggregation parameter can be omitted as follows:.kxi.registerUDA `name`query`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;metadata);
Add the aggregation function
-
Add an
agg.qfile in thesrcfolder to hold the aggregation function definition.The package folder structure now looks like this:
└──insights-demo ├── databases │ └── insights-demo │ ├── shards │ │ └── insights-demo-shard.yaml │ └── tables ├── init.q ├── manifest.yaml └── src ├── da.q └── agg.q -
Add the following code to the
agg.qfile:// @desc Counts number of entries by specified columns. // @param table {symbol} Table name. // @param startTS {timestamp} Start time (inclusive). // @param endTS {timestamp} End time (exclusive). // @param byCols {symbol|symbol[]} Column(s) to count by. // @return {dict|table} The `byCols` dictionary and table payload. .exampleuda.countByQuery:{[table;startTS;endTS;byCols] bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); / Select data .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation } // Define metadata. metadata:.kxi.metaDescription["Custom UDA - does a count by."], .kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table name.")], .kxi.metaParam[`name`type`isReq`description!(`byCols;11 -11h;1b;"Column(s) to count by.")], .kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time (inclusive).")], .kxi.metaParam[`name`type`isReq`description`default!(`endTS;-12h;0b;"End time (exclusive).";.z.P)], .kxi.metaReturn`type`description!(98h;"Count by specified columns."); .kxi.registerUDA `name`query`aggregation`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;`.exampleuda.countByAgg;metadata);
Points of interest
-
The
metadatadeclaration andregisterAPIfunction must be included in both files as they ensure the process registers its function with the corresponding Resource Coordinator for use by the system. -
Having the query and aggregation functions in different files ensures each process only loads the function is needs to register.
Refer to UDA registration for more details.
Update the package
-
Update the package to load the UDA, by including entrypoints:
To ensure that the UDA is loaded by the Data Access and Aggregator processes specify entrypoints labeled
data-accessandaggregatorin the package containing the UDA using thekxi package addcommand.kxi package add --to insights-demo entrypoint --name aggregator --path src/uda.q kxi package add --to insights-demo entrypoint --name data-access --path src/uda.qOnce the commands are executed the following
entrypointssection is included in the package manifest file:This updates the
entrypointssection in themanifest.yamlfile of the package as follows:entrypoints: default: init.q data-access: src/da.q aggregator: src/agg.q
Note
The entrypoints only reference the files that include code a specific process needs to load.
4. Test on a staging environment
Deploy the package on a staging environment to ensure you the UDA performs as expected.
Ensure the package is loaded
To complete the deployment of the UDA, you must load it into the processes that utilize it. This is done by setting environment variables.
Set the necessary environment variables for each component to locate and load the package:
Each component loading custom code from a package must have the KXI_PACKAGES and KX_PACKAGE_PATH environment variables set.
env:
- name: KXI_PACKAGES
value: "uda-package"
- name: KX_PACKAGE_PATH
value: "/opt/kx/packages"
Mount the package
Mount the package as a volume to the folder specified in KX_PACKAGE_PATH.
Use -v to supply a volume:
docker run -e KX_PACKAGE_PATH=/opt/kx/packages\
-e KXI_PACKAGES="uda-package"\
-v /path/to/package:/opt/kx/packages
Set volumes and environment:
services:
rdb:
image: dap
command: -p 5000
environment:
- KXI_PACKAGES=uda-package:1.0.0
- KX_PACKAGE_PATH=/opt/kx/packages/
volumes:
- /path/to/package:/opt/kx/packages
Mount a volume under the container:
hostPath is used an example. This may be a persistent volume of any type.
spec:
spec:
containers:
- name: dap
image: dap
env:
- name: KXI_PACKAGES
value: "uda-package:1.0.0"
- name: KX_PACKAGE_PATH
value: "/opt/kx/packages"
volumeMounts:
- mountPath: /opt/kx/packages
name: uda-package-mount
volumes:
- name: uda-package-mount
hostPath:
path: /opt/kx/packages
Load the UDA
You can load a UDA without restarting core components like Data Access Processes (DAPs) and Aggregators (AGGs). This allows for faster development and testing.
To load a UDA, follow the instructions below:
-
Expose the DAPs and Aggregator ports in
compose.yaml. For example:kxi-da: ports: - 5081:5081 - 5082:5082 - 5083:5083 kxi-agg: ports: - 5060:5060Ensure these ports are exposed to the client through port-forwarding before running these
curlcommands -
Define the curl endpoints. For example:
export PKG=uda-package export DAP=data-access export AGG=aggregator curl -X POST "http://localhost:5081/packages/post/load?package=$PKG&entry=$DAP" curl -X POST "http://localhost:5082/packages/post/load?package=$PKG&entry=$DAP" curl -X POST "http://localhost:5083/packages/post/load?package=$PKG&entry=$DAP" curl -X POST "http://localhost:5060/packages/post/load?package=$PKG&entry=$AGG" -
Confirm the UDA is available using one of the methods below:
It may take several minutes for the updates to become available.
Call the following command:
curl -X POST --header "Content-Type: application/json"\ --header "Accepted: application/struct-text"\ "https://${INSIGHTS_HOSTNAME}/servicegateway/kxi/getMeta"You should see
exampleuda.countByin the data returned.If the UDA is not available after several minutes
-
If the deployment fails, review error messages and logs to identify the cause. Common issues include missing dependencies, incorrect configurations, or permission errors.
-
If necessary, roll back the deployment to the last known stable state to avoid disruptions in production.
-
Identify and troubleshoot issues, correct any errors, and attempt the deployment again. Refer to the Troubleshooting guide for more details.
-
-
Once the UDA available, query it to test that it produces the expected results:
-
Choose the appropriate parameter values based on what you wish to return
startTS=$(date -u '+%Y.%m.%dD%H:00:00') endTS=$(date -u '+%Y.%m.%dD%H:%M%:%S') table="table1" columns="column1,column2" curl -X POST --header "Content-Type: application/json"\ --header "Accepted: application/struct-text"\ --data "{\"startTS\": \"$startTS\", \"endTS\": \"$endTS\", \"table\": \"$table\", \"byCols\": \"$columns\"}"\ "https://${INSIGHTS_HOSTNAME}/servicegateway/exampleuda/countBy"Refer to REST for more details from using the UDAs through REST.
Refer to the example UDAs documentation for more RESTful UDA query examples.
-
If there are any issues with calling the UDA refer to the Troubleshooting guide for more details.
5. Deploy to production
Once you are happy with the UDA you can deploy the updated package to production as above.
Next steps
- Refer to UDA configuration for more details on UDAs.
- Refer to UDA examples for more examples of User Defined Analytics.