UDA Creation Quickstart

This page provides a quickstart guide to creating a UDA that counts the number of records in a table within a specified timestamp range for each unique combination of values across a given set of columns. To help validate the UDA, you can replicate the query using the getData API.

The UDA creation process consists of the following steps:

Prerequisites: Ensure you meet all the necessary prerequisites.
Develop the UDA: Develop the UDA ensuring your code is modular, efficient, and well-documented. Validate it with different datasets to ensure it performs as expected in various scenarios, using the test results to refine and optimize the UDA for performance and accuracy.
Add the UDA to a package: Arrange all necessary files, including the UDA code, dependencies, metadata, and registration in a structured manner in a package and specify required configuration settings, such as environment variables, to ensure the UDA functions correctly after deployment.
Test on a staging environment: Deploy the UDA in a staging environment to test it integrates well with other components.
Deploy to Production: Deploy the package to your production environment.

Refer to the UDA Configuration guide for more details.

By the end of this quickstart the UDA described here will be queryable on your database

The UDA counts the number of records in a table within a timestamp for each unique combination of values in a specified set of columns.

// Query function.
// Custom count by UDA.

// @desc Counts number of entries by specified columns.
// @param table     {symbol}            Table name.
// @param startTS   {timestamp}         Start time (inclusive).
// @param endTS     {timestamp}         End time (exclusive).
// @param byCols    {symbol|symbol[]}   Column(s) to count by.
// @return          {dict|table}        The `byCols` dictionary and table payload. 
.exampleuda.countByQuery:{[table;startTS;endTS;byCols]
    bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table
    agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records
    res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg)); / Select data
    .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation
    }

// Aggregation function.
// @desc Takes partial results and sums the counts by the groupby dictionary returned from DAPs.
// @param res   {list[][]}  Partial result from all DAPs. This includes `byCols` and payload.
.exampleuda.countByAgg:{[res]
    bc:first first res; / Get the groupBy columns
    t:raze last each res; / Get table data from each DAP and convert to one table
    res:?[t;();bc;enlist[`cnt]!enl(sum;`x)] / Functional form of 'select sum x by {byCols} from t'
    .kxi.response.ok res / Return result with success codes
    }


// Metadata.
metadata:.kxi.metaDescription["Custom UDA - does a count by."],
    .kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table name.")],
    .kxi.metaParam[`name`type`isReq`description!(`byCols;11 -11h;1b;"Column(s) to count by.")],
    .kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time (inclusive).")],
    .kxi.metaParam[`name`type`isReq`description!(`endTS;-12h;1b;"End time (exclusive).")],
    .kxi.metaReturn`type`description!(98h;"Count by specified columns.");

// Registration.
.kxi.registerUDA `name`query`aggregation`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;`.exampleuda.countByAgg;metadata);

1. Prerequisites

Before trying to create this UDA ensure the following prerequisites are met:

You have access to a kdb Insights Database instance.
The kxi CLI is installed and configured on your system.
You are familiar with how packaging works. A solid understanding of packaging is essential for understanding how UDAs are packaged. Refer to the Packaging Overview documentation for more information.
You have a database that contains a subway partitioned table with the direction_id and route_long_name columns populated with data.

2. Develop

In this section, you define the UDA code, test it using the subway table with the direction_id and route_long_name columns, and run it against the current date.

Ensure a database is deployed.
Open a code editor that can access a specific DAP process while testing your UDA:
1. Ensure a connection to your DAP process is connected. Refer to My q Connections
2. Create a new q file
3. Select the appropriate connection from the Connection dropdown
4. Set the Target to a DAP process that is expected to contain data
Define the query function:
1. Start by reading the raw data from a single DAP and aggregating it into the count of rows per unique combination of selected columns:
```
byCols: `column1`column2; / set the columns to groupby
bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table
agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records
.kxi.selectTable`table`startTS`endTS`groupBy`agg!(`table1;.z.D;.z.P;bc;agg) / Select data from the subway table with today's date
```
  - The code utilizes the helper function .kxi.selectTable to collect the required data from the specified table and time range. We recommend this method when querying the data rather than qSQL as this handles late data and supports Row-level Entitlements.
2. Execute the query
3. Check the code returns the correct number of rows per unique combination of values in the selected columns and table.
4. Test the code with different columns.
Convert the query code into a function and call the function to test that it returns the same data as in the previous step:
```
.exampleuda.countByQuery:{[table;startTS;endTS;byCols]
    bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table
    agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records
    res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); / Select data
    .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation
    };

//execute the query function
.exampleuda.countByQuery[`table1;.z.D;.z.P;`column1`column2]
```
Points of interest
- The previously hard coded values have been added as parameters in the function call.
- The response line utilizes .kxi.response.ok has been addedwhich includes the data and any parameters that need to be passed to the aggregation function.
1. Execute the query.
  
  This returns the data in the standard API response format including a header and results. This is helpful if there are errors as it provides the response codes.
2. Once you have confirmed there are no errors you can update the last line temporarily to only display the data, as follows:
```
.exampleuda.countByQuery[`subway;.z.D;.z.P;`direction_id`route_long_name][1][1]
```

When building your own query functions refer to the testing page for further details on testing and debugging your code.

Define the aggregation:

You can use the qSQL API to test your aggregation function as this allows you to execute the query function against all the DAPs and override the aggregation with your own code.
1. Switch to a terminal to execute a REST request:
2. Type the following code into the console:
```
START=$(date "+%Y.%m.%dD00:00:00.000000000")
END=$(date "+%Y.%m.%dD23:59:59.999999999")
QUERY='
.exampleuda.countByQuery:{[table;startTS;endTS;byCols]
  bc:bc!bc:(),byCols;
  agg:enlist[`x]!enlist(count;`i);
  res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg);
  (byCols;res)
};
.exampleuda.countByQuery[`subway;.z.D-1;.z.P;`direction_id`route_long_name]
'

AGG='
{[partials]
  bc:first first partials;
  t:raze last each partials;
  ?[t;();bc!bc;enlist[`cnt]!enlist(sum;`x)]
}
'

DATA=`curl -s -X POST "${INSIGHTS_HOSTNAME}/servicegateway/kxi/qsql" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -d "$(jq --arg QUERY "$QUERY" --arg AGG "$AGG" -n \
        '{
            query       : $QUERY,
            agg         : $AGG,
        }' | jq -cr .)"`

echo $DATA | jq [.payload]
```
The query function defined in the previous step has been modified as follows:
- The query function and its execution have been assigned to a string variable to be passed to the qSQL call
- The .kxi.response.ok line has been commented out as qSQL does not support response headers.
Aggregation details
- The group by column information is extracted from the first partial
- The raze operator is used to combine the partials into one table of data, before the group by is applied again to aggregate the partials into one set of results.
- The last line sums the counts from each partial together
- There is a single parameter defined for the aggregation function which is a set of partials returned from the query function
1. Expand the time range, if necessary, to ensure more than one tier is targeted with the query function
2. Click Run.
3. Check the code returns the correct data.
Info

In this quickstart we are testing both the query and aggregation functions before we deploy the package. When developing your aggregation function, if you prefer, you can deploy the query function and then execute it directly, rather than passing the function definition into the qSQL call.
1. Check this code works as expected for different combinations of columns in different tables before proceeding.
  To assist with validation of this quickstart there is a getData equivalent to this UDA
  VS Code
  Create a new q file and run this q code:
  
  args:`table`startTS`endTS`groupBy`agg!(`table`;.z.P-10000D;.z.P;enlist `column1,`column2;enlist `cnt`count`column1) .com_kx_edi.uda[args;"kxi/getData"]

Convert the aggregation code into a function:

///Define the aggregation
.exampleuda.countByAgg:{[partials]
bc:first first partials; / Get the groupBy columns
t:raze last each partials; / Get table data from each DAP and convert to one table
?[t;();bc!bc;enlist[`cnt]!enlist(sum;`x)]; / Functional form of parse'select sum x by {byCols} from t'
.kxi.response.ok res / Return result with success codes
};

Points of interest

Added a namespace to the function name
Added the response line which utilizes .kxi.response.ok returning a table of data.

If you wish to add log entries to your UDA refer to Adding logging to UDAs for more information.

3. Add to a package

Now that you have defined the functions, you need to add them to the package along with all the necessary information to register the UDA and ensure it can be queried.

Note

This quickstart adds the UDA directly to an insights-demo package.

Create a new package called insights-demo .

```bash
kxi pm init insights-demo
```

The package folder structure looks like this:

└──insights-demo
    ├── init.q
    └── manifest.yaml

Create a new folder src in the package folder.

Add the query function

Add a da.q file in the src folder to hold the query function definition.

The package folder structure now looks like this:

└──insights-demo
    ├── init.q
    ├── manifest.yaml
    └── src
        └── da.q

Add the following code to the da.q file:

// @desc Counts number of entries by specified columns.
// @param table     {symbol}            Table name.
// @param startTS   {timestamp}         Start time (inclusive).
// @param endTS     {timestamp}         End time (exclusive).
// @param byCols    {symbol|symbol[]}   Column(s) to count by.
// @return          {dict|table}        The `byCols` dictionary and table payload. 
.exampleuda.countByQuery:{[table;startTS;endTS;byCols]
    bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table
    agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records
    res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); / Select data
    .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation
    }

Metadata

Define metadata for the UDA to describe its purpose, parameters, and return values. This is used by getMeta once the UDA is registered in the next step.

Add the metadata to the da.q file:

// Define metadata.
metadata:.kxi.metaDescription["Custom UDA - does a count by."],
    .kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table name.")],
    .kxi.metaParam[`name`type`isReq`description!(`byCols;11 -11h;1b;"Column(s) to count by.")],
    .kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time (inclusive).")],
    .kxi.metaParam[`name`type`isReq`description`default!(`endTS;-12h;0b;"End time (exclusive).";.z.P)],
    .kxi.metaReturn`type`description!(98h;"Count by specified columns.");

Each parameter is defined using .kxi.metaParam. Adding each parameter to the metadata is preferable to using a dictionary as it ensures those querying the UDA can identify the parameters required.
endTS has been set as an optional parameter and defaulted to the current time. This is done by setting param.isReq to 0b and adding the default argument which is set to the current timestamp.
The data returned from this UDA is a table therefore the return type is set to 98h
In this example byCols can be a symbol or list of symbols, showing that type can accept multiple values.
The Resource Coordinator routes only to DAPs and Aggregators that have the UDA registered.

Note

When dedicator aggregators are used, you no longer need to add scope as a mandatory field and set the assembly key to the package name.
To ensure the routing to the appropriate partitions table and startTS distinguished parameters have been set as mandatory fields in the metadata.

Refer to Metadata builders for more details.

Registration function

Register the UDA with the appropriate names for query and aggregation functions, and provide the metadata to ensure the UDA is included in any calls to getMeta and can be queried through the Service Gateway.

Add the registration to the da.q file:

.kxi.registerUDA `name`query`aggregation`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;`.exampleuda.countByAgg;metadata);

This registration is done using the .kxi.registerUDA function.
The query and aggregation functions along with the UDA itself have been defined as part of a namespace as this is best practice.
If your UDA does not include an aggregation function and you chose to use the raze operator, the aggregation parameter can be omitted as follows:
```
.kxi.registerUDA `name`query`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;metadata);
```

Add the aggregation function

Add an agg.q file in the src folder to hold the aggregation function definition.

The package folder structure now looks like this:

└──insights-demo
    ├── init.q
    ├── manifest.yaml
    └── src
        ├── da.q
        └── agg.q

Add the following code to the agg.q file:

// @desc Counts number of entries by specified columns.
// @param table     {symbol}            Table name.
// @param startTS   {timestamp}         Start time (inclusive).
// @param endTS     {timestamp}         End time (exclusive).
// @param byCols    {symbol|symbol[]}   Column(s) to count by.
// @return          {dict|table}        The `byCols` dictionary and table payload. 
.exampleuda.countByQuery:{[table;startTS;endTS;byCols]
    bc:bc!bc:(),byCols; / Convert byCols into dictionary for select table
    agg:enlist[`x]!enlist(count;`i); / Create aggregation to get count of records
    res:.kxi.selectTable`table`startTS`endTS`groupBy`agg!(table;startTS;endTS;bc;agg); / Select data
    .kxi.response.ok(byCols;res) / Return both the table result and the by clause for easy aggregation
    }

// Define metadata.
metadata:.kxi.metaDescription["Custom UDA - does a count by."],
    .kxi.metaParam[`name`type`isReq`description!(`table;-11h;1b;"Table name.")],
    .kxi.metaParam[`name`type`isReq`description!(`byCols;11 -11h;1b;"Column(s) to count by.")],
    .kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start time (inclusive).")],
    .kxi.metaParam[`name`type`isReq`description`default!(`endTS;-12h;0b;"End time (exclusive).";.z.P)],
    .kxi.metaReturn`type`description!(98h;"Count by specified columns.");

.kxi.registerUDA `name`query`aggregation`metadata!(`.exampleuda.countBy;`.exampleuda.countByQuery;`.exampleuda.countByAgg;metadata);

Points of interest

The metadata declaration and registerAPI function must be included in both files as they ensure the process registers its function with the corresponding Resource Coordinator for use by the system.
Having the query and aggregation functions in different files ensures each process only loads the function is needs to register.

Refer to UDA registration for more details.

Update the package

Update the package to load the UDA, by including entrypoints:

To ensure that the UDA is loaded by the Data Access and Aggregator processes specify entrypoints labeled data-access and aggregator in the package containing the UDA using the kxi package add command.
```
kxi package add --to insights-demo entrypoint --name aggregator --path src/uda.q
kxi package add --to insights-demo entrypoint --name data-access --path src/uda.q
```
Once the commands are executed the following entrypoints section is included in the package manifest file:

This updates the entrypoints section in the manifest.yaml file of the package as follows:
```
entrypoints:
    default: init.q
    data-access: src/da.q
    aggregator: src/agg.q
```

Note

The entrypoints only reference the files that include code a specific process needs to load.

4. Test on a staging environment

Deploy the package on a staging environment to ensure you the UDA performs as expected.

Ensure the package is loaded

To complete the deployment of the UDA, you must load it into the processes that utilize it. This is done by setting environment variables.

Set the necessary environment variables for each component to locate and load the package:

Each component loading custom code from a package must have the KXI_PACKAGES and KX_PACKAGE_PATH environment variables set.

env:
  - name: KXI_PACKAGES
    value: "uda-package"
  - name: KX_PACKAGE_PATH
    value: "/opt/kx/packages"

Mount the package

Mount the package as a volume to the folder specified in KX_PACKAGE_PATH.

DockerDocker ComposeKubernetes

Use -v to supply a volume:

docker run -e KX_PACKAGE_PATH=/opt/kx/packages\
  -e KXI_PACKAGES="uda-package"\
  -v /path/to/package:/opt/kx/packages

Set volumes and environment:

services:
  rdb:
    image: dap
    command: -p 5000
    environment:
      - KXI_PACKAGES=uda-package:1.0.0
      - KX_PACKAGE_PATH=/opt/kx/packages/
    volumes:
      - /path/to/package:/opt/kx/packages

Mount a volume under the container:

hostPath is used an example. This may be a persistent volume of any type.

spec:
  spec:
    containers:
      - name: dap
        image: dap
        env:
          - name: KXI_PACKAGES
            value: "uda-package:1.0.0"
          - name: KX_PACKAGE_PATH
            value: "/opt/kx/packages"
        volumeMounts:
        - mountPath: /opt/kx/packages
          name: uda-package-mount
    volumes:
      - name: uda-package-mount
        hostPath:
          path: /opt/kx/packages

Load the UDA

You can load a UDA without restarting core components like Data Access Processes (DAPs) and Aggregators (AGGs). This allows for faster development and testing.

To load a UDA, follow the instructions below:

Expose the DAPs and Aggregator ports in compose.yaml. For example:
```
  kxi-da:
    ports:
      - 5081:5081
      - 5082:5082
      - 5083:5083
  kxi-agg:
    ports:
      - 5060:5060
```
Ensure these ports are exposed to the client through port-forwarding before running these curl commands

Define the curl endpoints. For example:

export PKG=uda-package
export DAP=data-access
export AGG=aggregator
curl -X POST "http://localhost:5081/packages/post/load?package=$PKG&entry=$DAP"
curl -X POST "http://localhost:5082/packages/post/load?package=$PKG&entry=$DAP"
curl -X POST "http://localhost:5083/packages/post/load?package=$PKG&entry=$DAP"
curl -X POST "http://localhost:5060/packages/post/load?package=$PKG&entry=$AGG"

Confirm the UDA is available using one of the methods below:

It may take several minutes for the updates to become available.

Call the following command:
```
curl -X POST --header "Content-Type: application/json"\
    --header "Accepted: application/struct-text"\
    "https://${INSIGHTS_HOSTNAME}/servicegateway/kxi/getMeta"
```
You should see exampleuda.countBy in the data returned.
If the UDA is not available after several minutes
- If the deployment fails, review error messages and logs to identify the cause. Common issues include missing dependencies, incorrect configurations, or permission errors.
- If necessary, roll back the deployment to the last known stable state to avoid disruptions in production.
- Identify and troubleshoot issues, correct any errors, and attempt the deployment again. Refer to the Troubleshooting guide for more details.

Once the UDA available, query it to test that it produces the expected results:

Choose the appropriate parameter values based on what you wish to return

startTS=$(date -u '+%Y.%m.%dD%H:00:00')
endTS=$(date -u '+%Y.%m.%dD%H:%M%:%S')
table="table1"
columns="column1,column2"
curl -X POST --header "Content-Type: application/json"\
    --header "Accepted: application/struct-text"\
    --data "{\"startTS\": \"$startTS\", \"endTS\": \"$endTS\", \"table\": \"$table\", \"byCols\": \"$columns\"}"\
    "https://${INSIGHTS_HOSTNAME}/servicegateway/exampleuda/countBy"

Refer to REST for more details from using the UDAs through REST.

Refer to the example UDAs documentation for more RESTful UDA query examples.

If there are any issues with calling the UDA refer to the Troubleshooting guide for more details.

5. Deploy to production

Once you are happy with the UDA you can deploy the updated package to production as above.

Next steps

Refer to UDA configuration for more details on UDAs.
Refer to UDA examples for more examples of User Defined Analytics.