!
Creating User Defined Analytics (UDAs)
User Defined Analytics (UDAs) can be used to define new APIs callable through the Service Gateway (SG). These analytics can be loaded into the Data Access Processes (DAPs) and the Aggregators using folders referred to as a package
. Use the kdb Insights Package CLI documentation to initialize a new package
.
Use these UDAs to augment the existing APIs of the kdb Insights system with application logic specific to your business needs. The following documentation provides instructions on organizing UDAs in files, registering UDAs to make them callable using the kdb Insights Service Gateway, and debugging potential issues.
A UDA consists of two functions, a query
function and an aggregation
function.
Query
Query function overview
The query
function runs on DAPs to retrieve data. This data may be distributed across multiple DAPs based on temporal or lexical purviews. Optionally, the query
function can modify the retrieved data before sending the results to the Aggregator process.
Result handling
The result of the query
function is referred to as a partial
. In cases where a UDA spans multiple DAPs, the result from each DAP is a partial result of the total output returned to the caller of the UDA. For example, you can execute a query that retrieves data from each DAP and only returns the first 100 rows.
Arguments
You can define the arguments for the query
function as:
- A list of values
- A dictionary of keys named args
Response wrapping
To indicate that the query
executed successfully, wrap the response to the UDA in the helper function .kxi.response.ok
.
Example - Arguments as values
```q
.example.queryFn:{[table;columns]
columns:$[-11h = type columns;enlist columns;columns];
filter:enlist (<;`i;100); // Note for partitioned tables, will return first 100 per date
.kxi.response.ok ?[table;filter;0b;columns!columns]
};
```
Example 2 - Arguments as a dictionary
```q
.example.queryFn:{[args]
columns:$[-11h = type columns:args`columns;enlist columns;columns];
filter:enlist (<;`i;100); // Note for partitioned tables, will return first 100 per date
.kxi.response.ok ?[args`table;filter;0b;columns!columns]
};
```
Aggregation
Aggregation function overview
The aggregation
function combines partials
received from the DAPs to form the final result, which is then returned to you.
Default aggregation
By default, if no specific aggregation function is defined, the aggregation
function uses the kdb+ raze
operator to combine the partials
.
Custom aggregation
The aggregation
function can be customized based on the specific needs of the UDA. This customization allows the function to perform additional or reduced processing, depending on the requirements.
Arguments
The aggregator
argument is a list of partials
from the DAPs. The datatype of this argument depends on the output returned from the query
function within the UDA.
Example - Aggregation function implementation
The following example demonstrates how to define an aggregation function that returns 100 random records from the list of partials
returned by the DAPs. The variable res
represents the list of partials
.
```q
.example.aggFn:{[res]
.kxi.response.ok 100?raze res
};
```
UDA Registration using .kxi.registerUDA
Before calling a UDA from the Service Gateway, you must register it with the Service Gateway. This registration is done in the code loaded into the DAPs or Aggs using the .kxi.registerUDA
function.
Parameters
The .kxi.registerUDA
function accepts a dictionary with the following keys:
| key name | required | type | description |
|--------------|----------|---------|-------------------------------------------------------------------------------------------------------------------------------------------|
| name
| yes | symbol | The name of the UDA when called using the Service Gateway. |
| query
| yes | symbol | The name of function that runs on the DAPs and retrieves the partials
for aggregation. |
| aggregation
| yes | symbol | The name of function that runs on the aggregator and combines the partial results from the DAPs. |
| metadata
| no | list | Metadata about the properties of the UDA. It's recommended to use the metadata builders described in the Metadata builders section. |
Metadata builders
The .kxi
namespace includes a set of meta-building APIs for defining metadata. The following APIs simplify the process of defining metadata entries for UDAs and their functions. For examples of how to use these APIs, refer to the Examples section.
Meta-Building APIs
.kxi.metaDescription
Creates a description entry for a UDA's metadata.
- Parameters:
descr
: string - The description of the UDA.
.kxi.metaParam
Creates a parameter entry for a UDA's metadata. These parameters should match the inputs of the query
function.
- Parameters:
param
: dictionary - Dictionary containing any subset of the following keys:name
: symbol - The name of the parameter.type
: short|short[] - The possible types for the parameter.isReq
: Boolean - Indicates whether the parameter is required.default
: any - The default value of the parameter if it is not required.description
: string - Plain text description of the parameter.
.kxi.metaReturn
Creates a return entry for a UDA's metadata. This entry should match the return type of the aggregation
function.
- Parameters:
return
: dictionary - Dictionary containing any subset of the following keys:type
: short|short[] - The possible types for the return value.description
: string - Plain text description of the return value.
.kxi.metaMisc
Creates a miscellaneous metadata entry.
- Parameters:
misc
: dictionary - Dictionary containing any subset of the supported miscellaneous fields:safe
: Boolean - Indicates whether the UDA can be safely retried in the event of a failure.
Putting it all together
This section combines the previous examples into a single file that can be loaded into the DAPs and Aggregators to run a custom UDA. It is recommended to have the function definitions and the registration in the same file to ensure consistent definitions across DAPs and Aggregators. In the following example, the table readings
contains information about sensor usage and the Postal Code of the sensor.
Example schema:
q)readings
sensorID readTS val postal
-------------------------------------------------
0 2024.08.12D17:32:37.645609220 100 P0P1K0
1 2024.08.12D17:32:37.645609220 200 K7K2C7
q)meta readings
c | t f a
--------| -----
sensorID| j
readTS | p
val | f
postal | s
Using a table with the schema above, here's an example UDA that returns the statistics about readings usage per postal code area:
//
// @desc Computes the count of records as well as the sum of `val` column by first 3 digits of postal code. .
//
// @param startTS {timestamp} Starting timestamp.
// @param endTS {timestamp} Starting timestamp.
// @paream filter {list} Filter to apply to data.
//
// @return {table} Average count by sym
//
.example.usageQuery:{[startTS;endTS;filter]
args:`table`startTS`endTS`columns`filter`groupBy!(`readings;startTS;endTS;(`sumVal`sum`val;`cntVal`cnt`val);filter;`postal); / Build arguments
res:.kxi.selectTable args; / Make selectTable call
res:0!select sumVal:sum sumVal,cntVal:sum cntVal by postal:`$3#/:string postal from res; / Anonymize data, and do further crunchdown
.kxi.response.ok res / Return with OK
};
//
// @desc Agg function that consolidates partial results from daps and outputs the statistics total usage, the count of records, and the average per postal code.
//
// @param tbls {table[]} List of tables with postal, sumVal, cntVal columns.
//
// @return {table} Table with sum, count, average by first 3 digits of postal code.
//
.example.usageAgg:{[tbls]
.kxi.response.ok 0!update avgVal:sumVal%cntVal from select sumVal:sum sumVal, cntVal:sum cntVal by postal from raze tbls
};
metadata:.kxi.metaDescription["Returns the average by sym across first 100 records of each DAP"], / Description of the UDA
.kxi.metaParam[`name`type`isReq`description!(`startTS;-12h;1b;"Start timestamp of query")], / Describe `startTS` parameter
.kxi.metaParam[`name`type`isReq`description!(`endTS;-12h;1b;"End timestamp of query")], / Describe `startTS` parameter
.kxi.metaParam[`name`type`isReq`default`description!(`filter;0h;0b;();"Filter to apply to readings table")], / Describe `filter` parameter
.kxi.metaReturn`type`description!(98h;"The summary of total and average usage by first 3 digits of postal code"), / Describe the return of the UDA
.kxi.metaMisc[enlist[`safe]!enlist 1b]; / States that this UDA is safe to be retried in the event of an error
.kxi.registerUDA `name`query`aggregation`metadata!(`.example.usageByPostalCode;`.example.usageQuery;`.example.usageAgg;metadata) / Register
!!! note .kxi.response.ok
A custom aggregation API must return a response of (rcHeader; payload)
. Use .kxi.response.ok
to return this response shape.
Troubleshooting registration
When registering a UDA, you may encounter various errors. The following sections describe some common errors and how to debug or fix them.
Missing name
argument in UDA registration
This error occurs when the name
key of the .kxi.registerUDA
function is not specified. When this happens, the details dictionary supplied to the function is output to the log, helping you determine which UDA is failing.
For example:
q)args:`query`aggregation`metadata!(`myQuery;`myAggregation;()!()) / Build registration arguments, with `name` missing
q).kxi.registerUDA args
Outputs the following error message:
Missing name argument in UDA registration, det=[query=`myQuery;aggregation=`myAgg;metadata=()!()]
Name
argument in UDA registration is not a symbol
This error occurs when the name
is specified in the registration dictionary but has an incorrect datatype. Only symbols are supported. For example:
q)args:`name`query`aggregation`metadata!("some name";myQuery;`myAggregation;()!())
q).kxi.registerUDA args
Outputs the following error message:
Name argument in UDA registration is not a symbol, type=10 det=[name="some name";query=`myQuery;aggregation=`myAgg;metadata=()!()]
The error message outputs all the details passed to the .kxi.registerUDA
function, helping you identify where in your code the issue is occurring. To fix this issue, ensure the name
is of type symbol. Using the example above, the correct arguments would be:
q)args:`name`query`aggregation`metadata!(`$"some name";myQuery;`myAggregation;()!())
Query/Aggregation function not loaded into the process
This error can occur in two variations: the query variation and the aggregation variation. The error occurs when either the query or aggregation function being called using .kxi.registerUDA
is not loaded or defined in the current process.
The error message displayed starts with either "Query" or "Aggregation," indicating which function is not defined. To resolve this issue, ensure that the function is loaded into the process before calling .kxi.registerUDA
.
Troubleshooting UDAs
The following section lists some helpful tips and ways to debug issues that may occur when writing either the query or aggregation functions as part of the creation of a User Defined Analytic.
!! Info: Information on general kdb+ errors can be found here.
Testing Query functions
When creating a query, you may encounter bugs that require investigation and debugging. The easiest way to debug an issue is to attach to a DAP and run the code directly, with error trapping enabled. For more information on error trapping, refer to the error trap clients section.
If you run the query through the framework, you might encounter a situation where the query function encounters an error, but the framework catches and moves past the error. Consider the following example:
Example query debugging
Consider the following query function, which causes a type error:
.custom.countBy:{[table;startTS;endTS;byCols]
1+`
}
Although the error is evident in the code, use this function for demonstration purposes only. After calling this API, the DAP logs provide the following output:
2024-08-12 15:45:30.146 [hdb] DEBUG KXCTX {23329698-9642-4654-b65e-81bfefe16b05} Starting context, corr=23329698-9642-4654-b65e-81bfefe16b05
2024-08-12 15:45:30.146 [hdb] DEBUG KXCTX {23329698-9642-4654-b65e-81bfefe16b05} Setting context to no-save, reason='requests not saved in DAP'
2024-08-12 15:45:30.146 [hdb] DEBUG DA {23329698-9642-4654-b65e-81bfefe16b05} Executing .custom.countBy
2024-08-12 15:45:30.146 [hdb] ERROR SAPI {23329698-9642-4654-b65e-81bfefe16b05} Error (type) encountered executing .custom.countBy, rc=6 ac=11 ai=Unexpected error (type) encountered executing .custom.countBy
2024-08-12 15:45:30.146 [hdb] DEBUG DA {23329698-9642-4654-b65e-81bfefe16b05} Completed .custom.countBy, rc=6 ac=11 ai=Unexpected error (type) encountered executing .custom.countBy
2024-08-12 15:45:30.146 [hdb] DEBUG DA {23329698-9642-4654-b65e-81bfefe16b05} Sending response to aggregator, agg=:10.244.0.9:5070
2024-08-12 15:45:30.147 [hdb] DEBUG KXCTX {23329698-9642-4654-b65e-81bfefe16b05} Ending context, no-save
The log indicates a type
error, which helps identify that the issue is related to data types. To pinpoint the exact location of the failure, you can enhance debugging by adding log statements to the code. Alternatively, if you attach to the process, kdb can show where the process is failing.
To attach to the process and enable more detailed debugging, run the following commands:
q).kxi.pe.disable[] / This turns off protected evaluation
q)\e 1 / Set kdb to catch errors
After running the same query while connected to the process, the following detailed debugging output is displayed:
q)2024-08-12 15:51:51.154 [hdb] DEBUG KXCTX {ba62e0b9-f290-4abd-be3d-dcc91cb6116d} Starting context, corr=ba62e0b9-f290-4abd-be3d-dcc91cb6116d
2024-08-12 15:51:51.154 [hdb] DEBUG KXCTX {ba62e0b9-f290-4abd-be3d-dcc91cb6116d} Setting context to no-save, reason='requests not saved in DAP'
2024-08-12 15:51:51.154 [hdb] DEBUG DA {ba62e0b9-f290-4abd-be3d-dcc91cb6116d} Executing .custom.countBy
type error
.custom.countBy[0] {[table;startTS;endTS;byCols]1+`}
^
q))table / See what current table argument is
`trade
q))startTS / See what current startTS argument is
-0Wp
Based on the output you can see specifically where the function is failing. Additionally, you can inspect the current values of the arguments. For larger functions, you can also run through the code line by line to identify the underlying issue.
Testing Aggregation functions
When creating a UDA, one may need to create an aggregation function to consolidate the partial results from multiple DAPs. If there is a bug in the aggregation logic, it can be tricky to determine what the underlying issue is if an aggregation fails. To assist in this debugging, there are options to have the aggregator return the partial results of a failed aggregation back to the caller so they can debug the issue within their own session. When the partial results are returned there will be an application code (ac
) in the header describing the type of aggregation failure, as well as an rc
of 100h
(PARTIALS). If the partialsSent
field is present and true, then the payload is just the partial results from each DAP involved in the request. By default the partial results are not sent on a failed aggregation. There are two ways to get the aggregator to send back partial results on a failed request.
- Set the
sendPartials
value in the header of the request.
To do this, one can add the sendPartials
field to their request to the gateway in the options.
Example
q)GATEWAY:10i / Handle to gateway
q)args:enlist[`table]!enlist `trade
q)GATEWAY (`.kxi.getData; args;`ignored;enlist[`sendPartials]!enlist 1b)
- Configure the Aggregator to always send partial results on a failed request.
If the environment variable KXI_SEND_PARTIALS
is set to true
on the aggregator, then every request that fails at the aggregation stage will send back the partial results in the payload.
Response
If sendPartials
is enabled (either via KXI_SEND_PARTIALS
or in the opts
of a request), and the request errors in the Aggregator. The response will be in one of the following formats:
Return Code | Payload Type | Payload Description |
---|---|---|
PARTIALS (100) |
list |
List of partial responses from each participating DAP. |
PARTIALS_SUB (101) |
table |
Table consisting the header , payload of each failed sub-request (see here). |
PARTIALS_RESUME (102) |
dictionary |
Dictionary consisting of partials , context , callback (see here). |
Return Code = PARTIALS
If the request fails at the aggregation step, the payload is the list of partials returned from each participating DAP, i.e. the inputs to the aggregation function that caused the failure.
Example
q)`rc`ac`ai#response 0
rc| 100h
ac| 30h
ai| "Unexpected error (mismatch) encountered aggregating myAPI"
q)response 1
+(,`x)!,1 2
+(,`y)!,3 4
Return Code = PARTIALS_SUB
If one or more sub-requests fails, the payload is a table consisting of the response header and payload of each failed failed sub-request.