Skip to content

Query Data

This section covers how to perform queries against tables in the KDB.AI database.

Selecting the table to query

Each table in KDB.AI has an associated name. To perform a query, specify the name of the table in which the relevant data is stored. Using Python client you can create a table object from the session. The REST client is more direct with the table name supplied as a field in the JSON payload.

documents = session.table("documents")

Queries

Obtain the data from the entire table using the following command:

documents.query()
curl -s -H "Content-Type: application/json" localhost:8082/data -d '{"table":"documents"}'

Filters

To select a subset of the data, you can apply filters as documented here. These serve as where clauses.

    import datetime
    start_time=datetime.datetime(2024,1,10)
    end_time=datetime.datetime(2024,10,10)
    results = table.query(filter=[("within","createdDate",[start_time, end_time]),("<=", "length", 100)])
    print(f'Results: {results}')
curl -s -H "Content-Type: application/json" localhost:8082/data \
-d '{"table":"documents","filter":[["within","createdDate",["2020.07.10D15:00:00.0000", "2021.07.10D15:00:00.0000"]],["<=", "length", 100]]}'

Processing results

You can return a subset of the columns in the table, reducing the amount of data sent back to the client.

documents.query(aggs=["author","content"])
curl -s -H "Content-Type: application/json" localhost:8082/data \
-d '{"table":"documents",agg=["author","content"]}'

In addition to returning a subset of the columns, you can return aggregated results, group by categorical variables, and sort based on a column name.

    results = table.query(aggs=[('sumLength','sum','length')], group_by=['author'], sort_by=['sumLength'])
    print(f'Aggregate data results: {results}')
curl -s -H "Content-Type: application/json" localhost:8082/data \
-d '{"table":"documents", 
agg=[["sumLength","sum","length"]],
groupBy=["author"],sortCols=["sumLength"]}'

Supported aggregations

The table below lists all supported aggregation functions.

Function Description
all Returns the logical 'and' of all values in a set.
any Returns the logical 'or' of all values in a set.
avg Calculates the mean value across the set of matching records.
count Returns the number of records in the current selection.
dev Calculates the standard deviation of a column.
distinct Returns the distinct values from a column.
first Returns the first occurrence of a value. This is useful when performing a group by aggregation.
last Returns the last occurrence of a value. This is useful when performing a group by aggregation.
max Takes the maximum value of a set of records.
min Takes the minimum value of a set of records.
prd Calculates the product of matching records.
sdev Calculates the sample deviation of matching records.
scov Calculates the sample covariance between matching records.
sum Calculates the sum of matching records.
svar Calculates the sample variance of matching records.
var Calculates the variance of matching records.