KDB.AI Python API

This page contains the references for KDB.AI's Python API. For example usage, see the Quickstart Guide.

Note

Before you start, ensure you have the following installed on your machine:

Note: Index Options

The index_option argument of search() carries the index-specific options for a similarity search. For example, efSearch can be specified for HNSW indexes, while clusters can be specified for IVF/IVFPQ indexes.

For details of the usage of index_option, see the How to use an Index in KDB.AI page.

Session

Session represents a connection to a KDB.AI instance. To interact with KDB.AI Cloud or Server, you first need to create a session. This section summarizes how to create and close a session.

Create session

kdbai_client.Session

Session represents a connection to a KDB.AI instance.

Input parameters:

Name Type Description Required Default
api_key str API Key to be used for authentication. No None
endpoint str Server endpoint to connect to. No 'http://localhost:8081'
host str Hostname of the KDB.AI server. No None
port int Port number on the server. No 8081 if mode='rest'; 8082 if mode='qipc'
mode str Implementation method used for the session. Possible values: rest and qipc No None

Important

  1. If you don't provide the mode parameter:

    • A REST-based session is created if the endpoint starts with https://cloud.kdb.ai.
    • Otherwise, a qIPC-based session is created.
  2. Note that the REST-based implementation:

    • has worse performance due to payload serialization and deserialization.
    • has a 10MB limit on payload size for the train and insert methods.
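
The default-mode rule above can be sketched in a few lines. This is only an illustration of the documented behavior, not the client's actual code; the helper names infer_mode and default_port and the sample cloud URL are hypothetical.

```python
from typing import Optional

CLOUD_PREFIX = "https://cloud.kdb.ai"  # prefix named in the rule above


def infer_mode(endpoint: str, mode: Optional[str] = None) -> str:
    """Return the session mode the documented rule would select."""
    if mode is not None:
        if mode not in ("rest", "qipc"):
            raise ValueError(f"mode must be 'rest' or 'qipc', got {mode!r}")
        return mode
    # No explicit mode: cloud endpoints get REST, everything else qIPC.
    return "rest" if endpoint.startswith(CLOUD_PREFIX) else "qipc"


def default_port(mode: str) -> int:
    """Default port per the parameter table: 8081 for REST, 8082 for qIPC."""
    return {"rest": 8081, "qipc": 8082}[mode]


print(infer_mode("http://localhost:8082"))  # qipc
print(infer_mode("https://cloud.kdb.ai/instance/example"))  # rest (hypothetical URL)
```

Passing mode explicitly, as the examples below do, avoids relying on this inference.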

Example:

import kdbai_client as kdbai
### local server
session = kdbai.Session(endpoint='http://localhost:8082')
session = kdbai.Session(endpoint='http://localhost:8082', mode='qipc')
### local server using REST
session = kdbai.Session(endpoint='http://localhost:8081', mode='rest')
### local server using TLS
session = kdbai.Session(endpoint='http://localhost:8082', options={'tls': True})
### cloud instance
session = kdbai.Session(api_key="abc", endpoint="https://...", mode="rest")

Error handling:

Description Message Troubleshooting
Success: Session is created and KDB.AI instance can be interacted with. True N/A
Fail: Incorrect API Key is provided when attempting to connect to KDB.AI Cloud. KDBAIException with appropriate error message:
- qIPC: "Error during creating connection, make sure KDB.AI server is running and accepts qIPC connection on port {port}: {e}", where e is the original underlying error.
- REST: "Failed to open a session on {self.endpoint} using API key with prefix {tmp}. Please double check your endpoint and api_key."
Check endpoint (host/port), credentials, and mode parameter. Check port forwarding in your environment and what port rules are allowed/denied.
Fail: No API Key is provided when attempting to connect to KDB.AI Cloud. KDBAIException with appropriate error message:
- qIPC: "Error during creating connection, make sure KDB.AI server is running and accepts qIPC connection on port {port}: {e}", where e is the original underlying error.
- REST: "Failed to open a session on {self.endpoint} using API key with prefix {tmp}. Please double check your endpoint and api_key."
Check endpoint (host/port), credentials, and mode parameter. Check port forwarding in your environment and what port rules are allowed/denied.
Fail: Server and client versions are incompatible. Your KDB.AI server is not compatible with this client (kdbai_client=={version}). Use kdbai_client >={versions['clientMinVersion']} and <={versions['clientMaxVersion']} Upgrade/downgrade either Server or client.
Error: Session cannot be created because KDB.AI is not available. RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Close session

session.close()

You cannot execute any client-server interaction after this call.

Example:

session.close()

Error handling:

Description Message Troubleshooting
Success: Session is closed and KDB.AI instance can no longer be interacted with. True N/A
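
Because a session only needs its close() method called when you are done, the standard library's contextlib.closing can guarantee that call even when an exception interrupts the work. The sketch below uses a hypothetical FakeSession stand-in so that it runs without a server; with a real session you would pass the object returned by kdbai.Session(...) instead.

```python
from contextlib import closing


class FakeSession:
    """Hypothetical stand-in for kdbai.Session so the sketch runs offline."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


session = FakeSession()
try:
    with closing(session):
        # Any client-server work would go here.
        raise RuntimeError("simulated failure mid-work")
except RuntimeError:
    pass

print(session.closed)  # True: close() ran despite the exception
```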

Get version

session.version()

Retrieve version info from server and compatible client min/max version.

Example:

session.version()

Error handling:

Description Message Troubleshooting
Success: Version info is returned. {'serverVersion': '1.4.0', 'clientMinVersion': '1.4.0', 'clientMaxVersion': 'latest'} N/A
Error: KDBAI is not available. RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.
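
The dictionary returned by session.version() can be used to check a client version before doing any work. The following is a minimal sketch with hypothetical helper names; it assumes purely numeric dotted versions and treats 'latest' (as seen in the sample response) as "no upper bound", which is an interpretation rather than documented behavior.

```python
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def client_is_compatible(client_version: str, info: Dict[str, str]) -> bool:
    """Check a client version against clientMinVersion/clientMaxVersion."""
    client = parse_version(client_version)
    if client < parse_version(info["clientMinVersion"]):
        return False
    max_version = info["clientMaxVersion"]
    # Assumption: 'latest' means there is no upper bound.
    return max_version == "latest" or client <= parse_version(max_version)


# Sample response copied from the table above.
info = {"serverVersion": "1.4.0", "clientMinVersion": "1.4.0", "clientMaxVersion": "latest"}
print(client_is_compatible("1.4.2", info))  # True
print(client_is_compatible("1.3.9", info))  # False
```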

Database

Create, delete, and retrieve databases.

In KDB.AI, a database is a collection of tables which store related data.

Key principles for database management

To simplify database design/management and prevent naming conflicts, follow the principles below:

  • Unique database names: Each database must have a unique name and can contain multiple tables.
  • Unique table names within a database: Tables within a database must have unique names, but different databases can contain tables with the same name. This is similar to the concept of namespaces.
  • Cascade deletion: When deleting a database, all child entities (tables) will also be deleted.
  • Default database: You don't need to create a database to create tables. If you create a table without specifying a database, it will be placed in a default, undeletable database.

Create database

session.create_database

Input parameters:

Name Type Description Required Default
database str Name of the database to create. Yes None

Database name rules

  • Max length is 128 characters
  • Must contain only alphanumeric characters and underscore
  • Must start with an alpha character

Example:

session.create_database("myDatabase")

Error handling:

Description Message Troubleshooting
Success: Database is created and returned database instance N/A
Fail: Database name is not unique Raise exception A database with the given name already exists. Create a database with another name.
Fail: Database name is not a valid name Raise exception Provide a valid str for the database name.
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Get database

session.database

Retrieve database with a given name.

Input parameters:

Name Type Description Required
database str Name of the database to be retrieved Yes

Example:

session.database("myDatabase")

Error handling:

Description Message Troubleshooting
Success: Database with given name is found Database instance. N/A
Fail: Database with given name is not found KDB.AI Exception: database {name} does not exist Check the name of the database you are searching for as it does not seem to exist.
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Refresh database

database.refresh()

This method ensures that the list of tables associated with the loaded database is current. If the list is not up-to-date, it updates it. This is particularly useful if tables have been added to the database after it was retrieved with session.database().

Example:

database.refresh()

Error handling:

Description Message Troubleshooting
Success: Database is refreshed None N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

List databases

session.databases

Retrieve list of databases in ascending order.

Example:

session.databases()

Error handling:

Description Message Troubleshooting
Success: Returns the list of database names, including the default database list of database names N/A
Error: Databases cannot be listed because KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Delete database

database.drop

Delete database with a given name and all associated tables.

Input parameters:

Name Type Description Required
database str Name of the database to be deleted. Yes

Example:

db=session.database("myDatabase")
db.drop()

Error handling:

Description Message Troubleshooting
Success: Database with given name has been deleted N/A N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Table

Create, delete, update, and retrieve tables.

Create table

database.create_table

Input parameters:

Name Type Description Required
database instance name Name of the database. Yes
table str Name of the table to create. Yes
external_data_references dict Should contain the keys path (path to the existing kdb+ table mounted in our Docker container) and provider (set to kx). WARNING: the name of the table must match the name of the target table in the existing kdb+ database. No
schema list of dict Schema details for the table, one dict per column. Yes - if external_data_references is not specified.
indexes list of dict List of index definitions No
partition_column str Column name to partition on No
embeddingConfigurations dict Should be keyed by embedding column name No

Table name rules

  • Max length is 128 characters
  • Must contain only alphanumeric characters and underscore
  • Must start with an alpha character

Example:

schema = [{'name': 'id', 'type': 'int16'},
{'name': 'tag', 'type': 'bool'},
{'name': 'author', 'type': 'str'},
{'name': 'length', 'type': 'int32'},
{'name': 'content', 'type': 'str'},
{'name': 'createdDate', 'type': 'datetime64[D]'},
{'name': 'embeddings', 'type': 'float64s'}]
indexes = [
{'type': 'flat', 'name': 'flat', 'column': 'embeddings',  'params': {'dims': 1536}},
{'type': 'hnsw', 'name': 'fast_hnsw', 'column': 'embeddings', 'params': {'dims': 1536,'M': 8, 'efConstruction': 8}},
{'type': 'hnsw', 'name': 'accurate_hnsw','column': 'embeddings', 'params': {'dims': 1536,'M': 64, 'efConstruction':256}} 
]
db = session.database("default")
db.create_table(table="myTable", schema=schema, indexes=indexes)
# create partitioned table
db.create_table(table="myPartitionedTable", schema=schema, indexes=indexes, partition_column='createdDate')

schema

Attributes:

Name Type Description Required
name str Column name Yes
type str Column type Yes

Example:

schema = [ { 'name': 'id', 'type': 'int32'}, { 'name': 'isValid', 'type': 'bool'},
{ 'name': 'embeddings', 'type': 'float32s' }, { 'name': 'sparse_col', 'type': 'general' } ]

indexes

Attributes:

Name Type Description Required
name str Index name Yes
type str Index type, for example: flat, qFlat, hnsw, ivf, ivfpq, qHnsw Yes
column str kdb+ column name to apply index Yes
params dict Index parameters containing index-specific attributes for Flat, qFlat, HNSW, ivf, ivfpq, qHNSW Yes

Example:

indexes = [
{'type': 'flat', 'name': 'flat', 'column': 'embeddings',  'params': {'dims': 1536}},
{'type': 'hnsw', 'name': 'fast_hnsw', 'column': 'embeddings', 'params': {'dims': 1536, 'M': 8, 'efConstruction': 8}},
{'type': 'hnsw', 'name': 'accurate_hnsw','column': 'embeddings', 'params': {'dims': 1536, 'M': 64, 'efConstruction':256}} 
]
flat

Index-specific attributes (params) for type = flat

Attribute Description Type Required Default
dims Dimension of vector space int Yes N/A
metric Distance metric str No L2
qFlat

Index-specific attributes (params) for type = qFlat

Attribute Description Type Required Default
dims Dimension of vector space int Yes N/A
metric Distance metric str No L2
hnsw

Index-specific attributes (params) for type = hnsw

Attribute Description Type Required Default
dims Dimension of vector space int Yes N/A
M Graph valency int No 8
efConstruction Search depth at construction int No 8
metric Distance metric str No L2
qHnsw

Index-specific attributes (params) for type = qHnsw

Attribute Description Type Required Default
dims Dimension of vector space int Yes N/A
M Graph valency int No 8
efConstruction Search depth at construction int No 8
metric Distance metric str No L2
mmapLevel Level of memory mapping. Accepted values:
- 0 for both vectors and node connections in memory;
- 1 for memory-mapped vectors and in-memory node connections;
- 2 for both vectors and node connections memory-mapped.
int No 1

An index consists of vectors and nodes. Vectors represent the data points in the vector space, while nodes are part of the graph structure used to organize and search through these vectors efficiently. Nodes connect vectors based on their similarity, forming a graph that facilitates fast nearest-neighbor searches.

ivf

Index-specific attributes (params) for type = ivf

Attribute Description Type Required Default
nclusters Number of clusters long No 8
metric Distance metric str No L2
ivfpq

Index-specific attributes (params) for type = ivfpq

Attribute Description Type Required Default
nclusters Number of clusters long No 8
nbits Number of bits to quantize long No 8
nsplits Number of vectors to split long No 8
metric Distance metric str No L2
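
The per-type parameter tables above can be turned into a small client-side check that catches a missing dims or a misspelled parameter before the request is sent. This is a sketch only: make_index and INDEX_PARAMS are hypothetical names, and the required/optional sets simply mirror the tables (which list no required parameters for ivf and ivfpq).

```python
from typing import Any, Dict

# Index type -> (required params, optional params), copied from the tables above.
INDEX_PARAMS = {
    "flat": ({"dims"}, {"metric"}),
    "qFlat": ({"dims"}, {"metric"}),
    "hnsw": ({"dims"}, {"M", "efConstruction", "metric"}),
    "qHnsw": ({"dims"}, {"M", "efConstruction", "metric", "mmapLevel"}),
    "ivf": (set(), {"nclusters", "metric"}),
    "ivfpq": (set(), {"nclusters", "nbits", "nsplits", "metric"}),
}


def make_index(name: str, index_type: str, column: str, **params: Any) -> Dict[str, Any]:
    """Build one index definition dict, validating params for the index type."""
    if index_type not in INDEX_PARAMS:
        raise ValueError(f"unknown index type: {index_type!r}")
    required, optional = INDEX_PARAMS[index_type]
    given = set(params)
    missing = required - given
    unknown = given - required - optional
    if missing:
        raise ValueError(f"missing params for {index_type}: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown params for {index_type}: {sorted(unknown)}")
    return {"name": name, "type": index_type, "column": column, "params": params}


indexes = [
    make_index("flat", "flat", "embeddings", dims=1536),
    make_index("fast_hnsw", "hnsw", "embeddings", dims=1536, M=8, efConstruction=8),
]
print(indexes[0])
# {'name': 'flat', 'type': 'flat', 'column': 'embeddings', 'params': {'dims': 1536}}
```

The resulting list has the same shape as the indexes example above and could be passed as indexes=indexes to create_table.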

external_data_references

Attributes:

Name Type Description Required
path byte str Path to external table, for instance the existing kdb+ table mounted in our Docker container. Yes
provider str Provider of external table, for example kx. Yes

Example:

Launch the KDB.AI Server container with the -v flag to mount an existing kdb+ DB in the container, for example:

# The second -v flag mounts the local ./taq/db under /tmp/kx/remote in the container as read-only.
docker run -it -e NUM_WRK=1                         \
                -e SECONDARIES=16                   \
                -e KDB_LICENSE_B64                  \
                -v $PWD/vecdb/data:/tmp/kx/data/vdb \
                -v $PWD/taq/db:/tmp/kx/remote:ro    \
                -p 8082:8082                        \
                kdbai-db:local
Then:

database.create_table("tq", external_data_references=[{'path': b'/tmp/kx/remote', 'provider': 'kx'}])

The name of the table (tq) should match the name of the target table in the existing kdb+ db.

Error handling:

Description Message Troubleshooting
Success: Table is created and returned `success`result`error!(True;table_dictionary;"") N/A
Fail: Table name is not unique Raise exception Specify a different table name as it appears a table with this name already exists.
Fail: Table name is not valid Raise exception Use a valid string for the table name.
Fail: Any of the input parameters are of wrong type ValueError: "invalid arguments types: " ... Provide the correct type of input parameters required.
Fail: Any of the input parameters are missing ValueError: "missing arguments: " ... Provide required input parameters.
Fail: Any of the input parameters are invalid ValueError: "invalid arguments: " ... Provide known or valid input parameters.
Fail: Schema individual attributes are not valid ValueError: "invalid table attributes: " ... Provide valid attributes in the schema.
Fail: Schema individual types are not valid ValueError: "invalid column types: " ... Provide valid column types in the schema.
Fail: Index individual parameters are not valid ValueError: "invalid index parameters: " ... Double check the parameters of one of the specified indexes.
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Get Table

database.table

Retrieve a table from a database with a given name.

Example:

db=session.database("default")
db.table("myTable")

Error handling:

Description Message Troubleshooting
Success: Table with given name is found Table meta dictionary as Pandas DataFrame N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Refresh table

table.refresh()

This method ensures that the index and schema information associated with the table is current by retrieving the table again from the server.

Example:

table.refresh()

Error handling:

Description Message Troubleshooting
Success: Table is refreshed None N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

List tables

database.tables

Retrieve a list of tables from a database with a given name.

Tables are cached on the database instance. As a result, the data might have changed since the last get or refresh.

Example:

db = session.database("myDatabase")
db.tables

Error handling:

Description Message Troubleshooting
Success: Tables found List of table names N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Delete Table

table.drop()

Delete a table with a given name and all associated indexes.

Example:

db = session.database("default")
table = db.table("myTable")
table.drop()

Error handling:

Description Message Troubleshooting
Success: Table with given name has been deleted N/A N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Index

Retrieve and list indexes.

Get index

table.index

Retrieve an index from a table.

Input parameters:

Name Type Description Required
name str Name of the index to be retrieved Yes

Example:

table.index('trade_flat_index')

Error handling:

Description Message Troubleshooting
Success: Index with given name is found and returned dictionary N/A
Fail: Index name is not valid ValueError: Index name is invalid Provide a valid string for the index name.
Fail: Index with given name is not found ValueError: Index name is not found Provide correct index name.
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

List indexes

table.indexes

List all indexes for a table.

Example:

db = session.database("default")
table = db.table("myTable")
table.indexes

Error handling:

Description Message Troubleshooting
Success: Indexes found and returned list of dictionaries N/A
Error: KDBAI is not available Cannot write to handle ... Check your connection and if your server is running.

Update indexes

table.update_indexes

Build one or more indexes.

Builds the given indexes from scratch. Only supported for kdb+ HDB tables.

Input parameters:

Name Type Description Required
indexes list List of index names to build. Yes
parts list List of partitions to build the indexes on when the table is partitioned. If omitted, the indexes are built on all partitions. No

Example:

db = session.database("default")
table = db.table("SEC")
table.update_indexes(indexes=["flat_index"], parts=[1,2,3]) #assuming we have a partition column with integer type

Error handling:

Description Message Troubleshooting
Success: Index(es) with given name(s) updated successfully None N/A
Fail: Operation called on a table managed by kdbai KDBAIException: feature not supported: build index is only allowed on reference database Use build index only on reference tables.
Fail: Index name is not valid ValueError: Index name is invalid Provide a valid string for the index name
Fail: Index with given name is not found KDBAIException: index not found: invalid Provide correct index name.
Fail: Update operation is not valid ValueError: Update operation is not valid
Error: KDBAI is not available Cannot write to handle ... Check your connection and if your server is running.

Data

Insert, query, and search data.

Insert data

table.insert

Add rows to a table.

Input parameters:

Name Type Description Required
payload dataframe Data to insert. No - not required when using external database.

Example:

db = session.database("default")
table = db.table("myTable")
table.insert(data)

Error handling:

Description Message Troubleshooting
Success: Data inserted successfully. dictionary N/A
Fail: Data table does not match with table schema. KDBAIException: "data has wrong types: cols provided - expecting " Check data schema and expected table schema.
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.
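
When a session uses REST, the 10MB payload limit on insert noted in the Session section means large datasets have to be sent in pieces. The helper below is a minimal sketch of that batching; batch_slices is a hypothetical name, and the batch size that keeps each payload under the limit depends on your row width, so it is left to the caller.

```python
from typing import Iterator


def batch_slices(n_rows: int, batch_size: int) -> Iterator[slice]:
    """Yield consecutive slices that cover n_rows in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_rows, batch_size):
        yield slice(start, min(start + batch_size, n_rows))


rows = list(range(10))  # stand-in for the rows of a DataFrame
batches = [rows[s] for s in batch_slices(len(rows), 4)]
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

With a pandas DataFrame named data, the same slices can be applied as table.insert(data.iloc[s]) for each s in batch_slices(len(data), batch_size).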

Train data

table.train

Train the table's indexes on the given data.

Input parameters:

Name Type Description Required
payload table Data to train on. Yes

Example:

db = session.database("default")
table = db.table("myTable")
table.train(payload=data)

Error handling:

Description Message Troubleshooting
Success: Index(es) with given name(s) updated successfully True N/A
Fail: Index name is not valid ValueError: Index name is invalid Provide a valid string for the index name.
Fail: Index with given name is not found ValueError: Index name is not found Provide correct index name.
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.

Query data

table.query

Query data from a table.

Input parameters:

Name Type Description Required
filter list of tuples List of filter conditions, parse tree style. No
sort_columns list of str The columns by which to sort the results. No
group_by list of str The column values by which to group the results. No
aggs dictionary Aggregation rules. Dictionary structure:
- Key → new column name
- Value → old column name or parse tree style aggregation rule
No
limit int Number of rows to return. No

Example:

db = session.database("default")
table = db.table("myTable")
table.query()  #returns all rows in the table

Error handling:

Description Message Troubleshooting
Success: Successful query Pandas DataFrame N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.
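
The optional query parameters combine as ordinary keyword arguments. The sketch below assembles them into a dictionary and rejects misspelled parameter names; build_query_args is a hypothetical helper, and the operator names ("like", ">") and aggregation rule ("avg") are assumptions about the parse-tree style described above, not confirmed values.

```python
from typing import Any, Dict

# Parameter names from the input parameters table above.
QUERY_PARAMS = {"filter", "sort_columns", "group_by", "aggs", "limit"}


def build_query_args(**kwargs: Any) -> Dict[str, Any]:
    """Reject unknown parameter names and drop unset (None) ones."""
    unknown = set(kwargs) - QUERY_PARAMS
    if unknown:
        raise ValueError(f"unknown query parameters: {sorted(unknown)}")
    return {key: value for key, value in kwargs.items() if value is not None}


query_args = build_query_args(
    # Operator-first tuples; the operator names here are assumptions.
    filter=[("like", "author", "*Smith*"), (">", "length", 100)],
    group_by=["author"],
    # New column name -> aggregation rule; "avg" is an assumed rule name.
    aggs={"avgLength": ("avg", "length")},
    sort_columns=None,
    limit=10,
)
print(sorted(query_args))  # ['aggs', 'filter', 'group_by', 'limit']
```

With a live table, the dictionary would be passed as table.query(**query_args).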

Search data

table.search

Perform a similarity search.

Input parameters:

Name Type Description Required
type str Specify the type of search (tss or otherwise). No
vectors dictionary Indexes to query with query vectors. Yes
n int Number of neighbors to return. No
range float Range within which the nearest neighbors are returned (qFlat only). No
index_params dictionary (key is index name and value is dictionary of parameters for that index) Weights required for multi index search. No
options dictionary Use this dictionary:
- to rename the distance column with distanceColumn=newname
- to not return metadata columns with indexOnly=True
- to return TSS matched patterns with returnMatches=True
- to force a TSS search on partitioned tables with failing partitions with force=True
No
filter list of tuples List of filter conditions, parse tree style. No
searchBy str or list of str (Non Transformed TSS only) Perform a TSS search on each group inferred from the specified columns (not to be confused with group_by, which is used for the final aggregation of the results). No
group_by list of str The column values by which to group the results. No
aggs dictionary Aggregation rules. No
sort_columns list of str The columns by which to sort the results. No

Example:

db = session.database("default")
table = db.table("myTable")
table.search(vectors={"indexName":v},n=10)

# Filter the data using 'range' (only for qFlat indexes)
table.search(vectors={"indexName":v}, range=5.5)

options

Attribute Description Type Required Default
distanceColumn Rename distance column to this. str No None
indexOnly Return only index information bool No None
returnMatches (Non Transformed TSS only) Return the full detected pattern for each match boolean No None
force (Non Transformed TSS only) Force the TSS search even if some searchBy group or table partition fails, for example when a partition has fewer data points than the searched pattern boolean No None

index_params

index_params is a dictionary where each key is an index name and each value is a dictionary with the arguments below.

Attribute Description Type Required Default
weight Weight for each index. float Required for multi index input. None

Important! For multi index searches, you have to allocate a weight to each index. The sum of all weights must be equal to 1.
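
The weight rule can be enforced before calling search(). This is a minimal sketch with a hypothetical helper name; it builds the index_params dictionary in the shape described above and raises if the weights do not sum to 1 (within floating-point tolerance). The index names are taken from the create_table example earlier on this page.

```python
import math
from typing import Dict


def make_index_params(weights: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Map index name -> {'weight': w}, requiring the weights to sum to 1."""
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"index weights must sum to 1, got {total}")
    return {name: {"weight": weight} for name, weight in weights.items()}


index_params = make_index_params({"fast_hnsw": 0.25, "accurate_hnsw": 0.75})
print(index_params)
# {'fast_hnsw': {'weight': 0.25}, 'accurate_hnsw': {'weight': 0.75}}
```

The result would be passed as index_params=index_params alongside a vectors dictionary that has one entry per index.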

Error handling:

Description Message Troubleshooting
Success: Successful query list of Pandas DataFrames N/A
Error: KDBAI is not available RuntimeError('Error during request, make sure KDB.AI server running') Check your connection and if your server is running.