KDB.AI Python API
This page contains the references for KDB.AI's Python API. For example usage, see the Quickstart Guide.
Note
Before you start, ensure you have the following installed on your machine:
- Python (versions 3.8 to 3.11)
- An active KDB.AI Cloud or Server license
- kdbai-client
- PyKX dependencies
Note: Index Options
The index_option argument of the search() function holds the index-specific options for similarity search. For example, efSearch can be specified for HNSW indexes, while clusters can be specified for IVF/IVFPQ indexes.
For details of the usage of index_option, see the How to use an Index in KDB.AI page.
Session
Session represents a connection to a KDB.AI instance. To interact with KDB.AI Cloud or Server, you first need to create a session. This section summarises how to create and close a session.
Create session
kdbai_client.Session
Session represents a connection to a KDB.AI instance.
Input parameters:
Name | Type | Description | Required | Default |
---|---|---|---|---|
api_key | str | API Key to be used for authentication. | No | None |
endpoint | str | Server endpoint to connect to. | No | 'http://localhost:8082' |
host | str | Hostname of the KDB.AI server. | No | None |
port | int | Port number on the server. | No | None |
mode | str | Implementation method used for the session. Possible values: rest and qipc. | No | None |
Important
- If you don't provide the mode parameter:
  - A REST session is created if the endpoint starts with https://cloud.kdb.ai.
  - Otherwise, a QIPC session is created.
- Note that the REST implementation:
  - has worse performance due to payload serialization and deserialization.
  - has a 10MB limit on payload size for the train and insert methods.
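The default rule above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the client's actual code: infer_mode is a hypothetical helper and the cloud URL path is made up.

```python
def infer_mode(endpoint, mode=None):
    """Return the session mode implied by the documented default rule."""
    if mode is not None:
        # An explicit mode always wins.
        return mode
    # Documented rule: cloud endpoints default to REST, everything else to QIPC.
    if endpoint.startswith("https://cloud.kdb.ai"):
        return "rest"
    return "qipc"


print(infer_mode("http://localhost:8082"))                  # qipc
print(infer_mode("https://cloud.kdb.ai/instance/example"))  # rest (hypothetical URL)
```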
Example
import kdbai_client as kdbai
### local server
session = kdbai.Session(endpoint='http://localhost:8082')
session = kdbai.Session(endpoint='http://localhost:8082', mode='qipc')
### local server using REST
session = kdbai.Session(endpoint='http://localhost:8082', mode='rest')
### local server using TLS
session = kdbai.Session(endpoint='http://localhost:8082', options={'tls': True})
### cloud instance
session = kdbai.Session(api_key="abc", endpoint="https://...", mode="rest")
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Session is created and KDB.AI instance can be interacted with. | True | N/A |
Fail: Incorrect API Key is provided when attempting to connect to KDB.AI Cloud. | KDBAIException with appropriate error message: - qIPC: "Error during creating connection, make sure KDB.AI server is running and accepts QIPC connection on port {port}: {e}", where e is the original underlying error. - REST: "Failed to open a session on {self.endpoint} using API key with prefix {tmp}. Please double check your endpoint and api_key." | Check endpoint (host/port), credentials, and mode parameter. Check port forwarding in your environment and what port rules are allowed/denied. |
Fail: No API Key is provided when attempting to connect to KDB.AI Cloud. | KDBAIException with appropriate error message: - qIPC: "Error during creating connection, make sure KDB.AI server is running and accepts QIPC connection on port {port}: {e}", where e is the original underlying error. - REST: "Failed to open a session on {self.endpoint} using API key with prefix {tmp}. Please double check your endpoint and api_key." | Check endpoint (host/port), credentials, and mode parameter. Check port forwarding in your environment and what port rules are allowed/denied. |
Fail: Server and client versions are incompatible. | Your KDB.AI server is not compatible with this client (kdbai_client=={version}). Use kdbai_client >={versions['clientMinVersion']} and <={versions['clientMaxVersion']} | Upgrade/downgrade either Server or client. |
Error: Session cannot be created because KDB.AI is not available. | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Close session
session.close()
You cannot execute any client-server interaction after this call.
Example
session.close()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Session is closed and KDB.AI instance can no longer be interacted with. | True | N/A |
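Because no client-server interaction is possible after close(), it can help to tie the call to a scope. The sketch below uses contextlib.closing from the standard library. run_with_session is a hypothetical helper, not part of kdbai_client, and it relies only on the session having a close() method.

```python
from contextlib import closing


def run_with_session(session, work):
    """Call work(session) and close the session afterwards, even on error."""
    with closing(session) as active:
        return work(active)


# Usage with a real client (commented out because it needs a running server):
# import kdbai_client as kdbai
# info = run_with_session(kdbai.Session(endpoint='http://localhost:8082'),
#                         lambda s: s.version())
```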
Get version
session.version()
Retrieve version info from server and compatible client min/max version.
Example
session.version()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Version info is returned. | {'serverVersion': '1.4.0', 'clientMinVersion': '1.4.0', 'clientMaxVersion': 'latest'} | N/A |
Error: KDBAI is not available. | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
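The returned dictionary can be used to check a client version before doing further work. The following is a minimal sketch under two assumptions that the reference does not state: versions are plain dotted integers, and 'latest' means there is no upper bound. Both helper names are hypothetical.

```python
def parse_version(text):
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def client_is_compatible(client_version, info):
    """Check a client version against the dictionary returned by session.version().

    Assumption: 'latest' as clientMaxVersion means there is no upper bound.
    """
    client = parse_version(client_version)
    if client < parse_version(info["clientMinVersion"]):
        return False
    max_version = info["clientMaxVersion"]
    return max_version == "latest" or client <= parse_version(max_version)


info = {"serverVersion": "1.4.0", "clientMinVersion": "1.4.0", "clientMaxVersion": "latest"}
print(client_is_compatible("1.4.1", info))  # True
print(client_is_compatible("1.3.9", info))  # False
```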
Database
Create, delete, and retrieve databases.
In KDB.AI, a database is a collection of tables which stores related data.
Key principles for database management
To simplify database design/management and prevent naming conflicts, follow the principles below:
- Unique database names: Each database must have a unique name and can contain multiple tables.
- Unique table names within a database: Tables within a database must have unique names, but different databases can contain tables with the same name. This is similar to the concept of namespaces.
- Cascade deletion: When deleting a database, all child entities (tables) will also be deleted.
- Default database: You don't need to create a database to create tables. If you create a table without specifying a database, it will be placed in a default, undeletable database.
Create database
session.create_database
Input parameters:
Name | Type | Description | Required | Default |
---|---|---|---|---|
database | str | Name of the database to create. | Yes | None |
Database name rules
- Max length is 128 characters
- Must contain only alphanumeric characters and underscore
- Must start with an alpha character
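The three rules can be checked locally before calling the server. This is a sketch that assumes "alpha" and "alphanumeric" mean ASCII characters. is_valid_name is a hypothetical helper, and the server remains the authority on what it accepts.

```python
import re

# A letter first, then letters, digits, or underscores; 128 characters at most.
# Assumption: only ASCII letters and digits count.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,127}")


def is_valid_name(name):
    """Return True if name satisfies the documented naming rules."""
    return isinstance(name, str) and NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("myDatabase"))  # True
print(is_valid_name("1database"))   # False: starts with a digit
print(is_valid_name("my-database")) # False: hyphen is not allowed
```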
Example
session.create_database("myDatabase")
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Database is created and returned | database instance | N/A |
Fail: Database name is not unique | Raise exception | A database with the given name already exists. Create a database with another name. |
Fail: Database name is not a valid name | Raise exception | Provide a valid str for the database name. |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Get database
session.database
Retrieve database with a given name.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | Name of the database to be retrieved | Yes |
Example
session.database("myDatabase")
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Database with given name is found | Database instance. | N/A |
Fail: Database with given name is not found | KDB.AI Exception: database {name} does not exist | Check the name of the database you are searching for as it does not seem to exist. |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Refresh database
database.refresh()
This method ensures that the list of tables associated with the database is current. It calls the getDatabase function.
Example
database.refresh()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Database is refreshed | None | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
List databases
session.databases
Retrieve list of databases in ascending order.
Example
session.databases()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Returns the list of database names, including the default database | list of database names | N/A |
Error: Databases cannot be listed because KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Delete database
database.drop
Delete database with a given name and all associated tables.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | Name of the database to be deleted. | Yes |
Example
db=session.database("myDatabase")
db.drop()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Database with given name has been deleted | N/A | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Table
Create, delete, update, and retrieve tables.
Create table
database.create_table
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | instance name | Name of the database. | Yes |
table | str | Name of the table to create. | Yes |
external_data_references | dict | Should contain the keys: - path (path to the existing kdb+ table mounted in our Docker container) - provider (set to kx). WARNING: the name of the table should match the name of the target table in the existing kdb+ database. | No |
schema | dict | Schema details for the table. | Yes - if external_data_references is not specified. |
indexes | list of dict | List of index definitions | No |
embeddingConfigurations | dict | Should be keyed by embedding column name | No |
Table name rules
- Max length is 128 characters
- Must contain only alphanumeric characters and underscore
- Must start with an alpha character
createTable example
schema = [{'name': 'id', 'type': 'int16'},
          {'name': 'tag', 'type': 'bool'},
          {'name': 'author', 'type': 'str'},
          {'name': 'length', 'type': 'int32'},
          {'name': 'content', 'type': 'str'},
          {'name': 'createdDate', 'type': 'datetime64[ns]'},
          {'name': 'embeddings', 'type': 'float64s'}]
indexes = [
{'type': 'flat', 'name': 'flat', 'column': 'embeddings', 'params': {'dims': 1536}},
{'type': 'hnsw', 'name': 'fast_hnsw', 'column': 'embeddings', 'params': {'dims': 1536,'M': 8, 'efConstruction': 8}},
{'type': 'hnsw', 'name': 'accurate_hnsw','column': 'embeddings', 'params': {'dims': 1536,'M': 64, 'efConstruction':256}}
]
db = session.database("default")
db.create_table(table="myTable", schema=schema, indexes=indexes)
# create partitioned table
db.create_table(table="myPartitionedTable", schema=schema, indexes=indexes, partition_column='createdDate')
schema
Attributes:
Name | Type | Description | Required |
---|---|---|---|
name | str | Column name | Yes |
type | str | Column type | Yes |
Schema example
schema = [ { 'name': 'id', 'type': 'int32'}, { 'name': 'isValid', 'type': 'bool'},
{ 'name': 'embeddings', 'type': 'float32s' }, { 'name': 'sparse_col', 'type': 'general' } ]
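Since both name and type are required for every column, a quick local check can catch a malformed schema before create_table is called. This sketch only verifies that the two keys are present and are strings. It does not know which type names the server accepts, and check_schema is a hypothetical helper.

```python
def check_schema(schema):
    """Return a list of problems; an empty list means every column has a str name and type."""
    problems = []
    for position, column in enumerate(schema):
        for key in ("name", "type"):
            if not isinstance(column.get(key), str):
                problems.append(f"column {position}: missing or non-str '{key}'")
    return problems


schema = [{'name': 'id', 'type': 'int32'}, {'name': 'isValid', 'type': 'bool'},
          {'name': 'embeddings', 'type': 'float32s'}, {'name': 'sparse_col', 'type': 'general'}]
print(check_schema(schema))            # []
print(check_schema([{'name': 'id'}]))  # ["column 0: missing or non-str 'type'"]
```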
indexes
Attributes:
Name | Type | Description | Required |
---|---|---|---|
name | str | Index name | Yes |
type | str | Index type, for example: flat, qFlat, hnsw, ivf, ivfpq, qHnsw | Yes |
column | str | kdb+ column name to apply index | Yes |
params | dict | Index parameters containing index-specific attributes for Flat, qFlat, HNSW, ivf, ivfpq, qHNSW | Yes |
Index example
indexes = [
{'type': 'flat', 'name': 'flat', 'column': 'embeddings', 'params': {'dims': 1536}},
{'type': 'hnsw', 'name': 'fast_hnsw', 'column': 'embeddings', 'params': {'dims': 1536, 'M': 8, 'efConstruction': 8}},
{'type': 'hnsw', 'name': 'accurate_hnsw','column': 'embeddings', 'params': {'dims': 1536, 'M': 64, 'efConstruction':256}}
]
flat
Index-specific attributes (params) for type = flat
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
dims | Dimension of vector space | int | Yes | N/A |
metric | Distance metric | str | No | L2 |
qFlat
Index-specific attributes (params) for type = qFlat
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
dims | Dimension of vector space | int | Yes | N/A |
metric | Distance metric | str | No | L2 |
hnsw
Index-specific attributes (params) for type = hnsw
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
dims | Dimension of vector space | int | Yes | N/A |
M | Graph valency | int | No | 8 |
efConstruction | Search depth at construction | int | No | 8 |
metric | Distance metric | str | No | L2 |
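To show how these attributes fit into an index definition, here is a small builder whose defaults mirror the hnsw table above. hnsw_index is a hypothetical helper, and the positive-integer check on dims is an assumption rather than a documented rule.

```python
def hnsw_index(name, column, dims, M=8, efConstruction=8, metric="L2"):
    """Build an HNSW index definition; the defaults mirror the documented ones."""
    # Assumption: a vector dimension must be a positive integer.
    if not isinstance(dims, int) or dims <= 0:
        raise ValueError("dims is required and must be a positive int")
    return {
        "type": "hnsw",
        "name": name,
        "column": column,
        "params": {"dims": dims, "M": M, "efConstruction": efConstruction, "metric": metric},
    }


indexes = [
    hnsw_index("fast_hnsw", "embeddings", 1536),
    hnsw_index("accurate_hnsw", "embeddings", 1536, M=64, efConstruction=256),
]
```

The resulting list has the same shape as the indexes argument in the create_table example.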
qHnsw
Index-specific attributes (params) for type = qHnsw
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
dims | Dimension of vector space | int | Yes | N/A |
M | Graph valency | int | No | 8 |
efConstruction | Search depth at construction | int | No | 8 |
metric | Distance metric | str | No | L2 |
ivf
Index-specific attributes (params) for type = ivf
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
nclusters | Number of clusters | long | No | 8 |
metric | Distance metric | str | No | L2 |
ivfpq
Index-specific attributes (params) for type = ivfpq
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
nclusters | Number of clusters | long | No | 8 |
nbits | Number of bits to quantize | long | No | 8 |
nsplits | Number of vectors to split | long | No | 8 |
metric | Distance metric | str | No | L2 |
external_data_references
Attributes:
Name | Type | Description | Required |
---|---|---|---|
path | byte str | Path to external table, for instance the existing kdb+ table mounted in our Docker container. | Yes |
provider | str | Provider of external table, for example kx. | Yes |
external_data_references example
Launch the KDB.AI Server container with the -v flag to mount an existing kdb+ DB in the container, for example:
# The second -v flag mounts a local ./taq/db under /tmp/kx/remote in the container as read-only
docker run -it -e NUM_WRK=1 \
-e SECONDARIES=16 \
-e KDB_LICENSE_B64 \
-v $PWD/vecdb/data:/tmp/kx/data/vdb \
-v $PWD/taq/db:/tmp/kx/remote:ro \
-p 8082:8082 \
kdbai-db:local
database.create_table("tq", external_data_references=[{'path': b'/tmp/kx/remote', 'provider': 'kx'}])
The name of the table (tq) should match the name of the target table in the existing kdb+ db.
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Table is created and returned | success result`error!True;table_dictionary;" | N/A |
Fail: Table name is not unique | Raise exception | Specify a different table name as it appears a table with this name already exists. |
Fail: Table name is not valid | Raise exception | Use a valid string for the table name. |
Fail: Any of the input parameters are of wrong type | ValueError: "invalid arguments types: " ... | Provide the correct type of input parameters required. |
Fail: Any of the input parameters are missing | ValueError: "missing arguments: " ... | Provide required input parameters. |
Fail: Any of the input parameters are invalid | ValueError: "invalid arguments: " ... | Provide known or valid input parameters. |
Fail: Schema individual attributes are not valid | ValueError: "invalid table attributes: " ... | Provide valid attributes in the schema. |
Fail: Schema individual types are not valid | ValueError: "invalid column types: " ... | Provide valid column types in the schema. |
Fail: Index individual parameters are not valid | ValueError: "invalid index parameters: " ... | Double check the parameters of one of the specified indexes. |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Get table
database.table
Retrieve a table from a database with a given name.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | Name of the database where the table is. | Yes |
table | str | Name of the table to be retrieved | Yes |
Example
db=session.database("default")
db.table("myTable")
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Table with given name is found | Table meta dictionary as Pandas DataFrame | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Refresh table
table.refresh()
This method ensures that the table index and schema information associated with the table is current. It calls the getTable function.
Example
table.refresh()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Table is refreshed | None | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
List tables
database.tables
Retrieve a list of tables from a database with a given name.
Tables are cached on the database instance. As a result, the data might have changed since the last get or refresh.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | instance name | Name of the database where the table is. If you don't provide a database name, the default database is used. | No |
Example
db = session.database("myDatabase")
db.tables
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Tables found | List of table names | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
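Because the table list is cached on the database instance, one way to avoid reading a stale list is to refresh first. The helper below is a hypothetical sketch that relies only on the refresh() method and the tables attribute documented on this page.

```python
def current_tables(db):
    """Refresh the database instance, then return its table list."""
    db.refresh()      # re-syncs the cached table list with the server
    return db.tables  # accessed as an attribute, as in the example above


# With a real session (needs a running server):
# db = session.database("myDatabase")
# print(current_tables(db))
```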
Delete table
table.drop()
Delete a table with a given name and all associated indexes.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | N/A | Name of the database where the table is. | Yes |
table | instance name | Name of the table to be deleted. | Yes |
Example
db = session.database("default")
table = db.table("myTable")
table.drop()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Table with given name has been deleted | N/A | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Index
Retrieve and list indexes.
Get index
table.index
Retrieve an index from a table.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | Name of the database where the table is. If not provided, the default database is used. | Yes |
table | str | Name of the table where the index to be retrieved is. | Yes |
name | str | Name of the index to be retrieved | Yes |
Example
table.index('trade_flat_index')
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Index with given name is found and returned | dictionary | N/A |
Fail: Index name is not valid | ValueError: Index name is invalid | Provide a valid string for the index name. |
Fail: Index with given name is not found | ValueError: Index name is not found | Provide correct index name. |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
List indexes
table.indexes
List all indexes for a table.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | Name of the database where the table is. If not provided, the default database is used. | No |
table | str | Name of the table where the indexes to be retrieved are. | No |
Example
db = session.database("default")
table = db.table("myTable")
table.indexes
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Indexes found and returned | list of dictionaries | N/A |
Error: KDBAI is not available | Cannot write to handle ... | Check your connection and if your server is running. |
Update indexes
table.update_indexes
Build one or more indexes.
Allows you to build indexes from scratch. Only supported for kdb+ HDB tables.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | N/A | The name of the database. If a database name is not provided, the default database is used. | Yes |
table | instance name | The name of the table. | Yes |
indexes | list | List of index names to build. | Yes |
parts | list | List of partitions on which to build the indexes, for a partitioned database. If not given, the indexes are built on all partitions. | Yes |
Example
db = session.database("default")
table = db.table("SEC")
table.update_indexes(indexes=["flat_index"])
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Index(es) with given name(s) updated successfully | None | N/A |
Fail: Operation called on a table managed by kdbai | KDBAIException: feature not supported: build index is only allowed on reference database | Use build index only on reference tables. |
Fail: Index name is not valid | ValueError: Index name is invalid | Provide a valid string for the index name |
Fail: Index with given name is not found | KDBAIException: index not found: invalid | Provide correct index name. |
Fail: Update operation is not valid | ValueError: Update operation is not valid | |
Error: KDBAI is not available | Cannot write to handle ... | Check your connection and if your server is running. |
Data
Insert, query, and search data.
Insert data
table.insert
Add rows to a table.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | The name of the database where the table is located. If a database name is not provided, the default database is used. | Yes |
table | str | The name of the table where the data will be inserted. | Yes |
payload | dataframe | Data to insert. | No - not required when using external database. |
Example
db = session.database("default")
table = db.table("myTable")
table.insert(data)
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Data inserted successfully. | dictionary | N/A |
Fail: Data table does not match with table schema. | KDBAIException: "data has wrong types: cols provided | Check data schema and expected table schema. |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
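The REST implementation has a 10MB payload limit for insert, so large DataFrames may need to be sent in slices. The sketch below only computes row ranges. batch_bounds is a hypothetical helper, and the batch size that keeps a slice under the limit depends on your data, so the 1000 in the usage comment is an arbitrary placeholder.

```python
def batch_bounds(n_rows, batch_size):
    """Return (start, stop) row ranges that cover n_rows in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [(start, min(start + batch_size, n_rows))
            for start in range(0, n_rows, batch_size)]


print(batch_bounds(5, 2))  # [(0, 2), (2, 4), (4, 5)]

# With a pandas DataFrame `data` and a table from db.table("myTable"):
# for start, stop in batch_bounds(len(data), 1000):
#     table.insert(data.iloc[start:stop])
```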
Train data
table.train
Train data.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | The name of the database where the table is located. | Yes |
table | str | The name of the table. | Yes |
payload | table | Data to insert. | Yes |
Example
db = session.database("default")
table = db.table("myTable")
table.train(payload=data)
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Index(es) with given name(s) updated successfully | True | N/A |
Fail: Index name is not valid | ValueError: Index name is invalid | Provide a valid string for the index name. |
Fail: Index with given name is not found | ValueError: Index name is not found | Provide correct index name. |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
Query data
table.query
Query data from a table.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | The name of the database where the table you are querying data from is located. | Yes |
table | str | The name of the table you are querying data from. | Yes |
filter | list of tuples | List of filter conditions, parse tree style. | No |
sort_columns | list of str | The columns by which to sort the results. | No |
group_by | list of str | The column values by which to group the results. | No |
aggs | dictionary | Aggregation rules. Dictionary structure: - Key → new column name - Value → old column name or parse tree style aggregation rule | No |
limit | int | Number of rows to return. | No |
Example
db = session.database("default")
table = db.table("myTable")
table.query()
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Successful query | Pandas DataFrame | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |
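As an illustration of how the optional parameters combine, the sketch below assembles keyword arguments for table.query. It assumes that a "parse tree style" condition is an operator-first tuple of the form (operator, column, value). That form is not spelled out on this page, so treat it as an assumption and check the filter documentation. length_below is a hypothetical helper, and the column names come from the create_table example.

```python
def length_below(max_length):
    """One filter condition, assumed operator-first: (operator, column, value)."""
    return ("<", "length", max_length)


query_args = {
    "filter": [length_below(1000)],   # list of tuples, as in the parameter table
    "sort_columns": ["createdDate"],  # list of str
    "limit": 10,                      # int
}
print(query_args)

# With a table from db.table("myTable") (needs a running server):
# df = table.query(**query_args)
```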
Search data
table.search
Perform a similarity search.
Input parameters:
Name | Type | Description | Required |
---|---|---|---|
database | str | Database name | Yes |
table | str | Table name. | Yes |
type | str | Specify the type of search (tss or otherwise). | No |
vectors | dictionary | Indexes to query with query vectors. | Yes |
n | int | Number of neighbors to return. | Yes |
index_params | dictionary (key is index name and value is dictionary of parameters for that index) | Weights required for multi index search. | No |
options | dictionary | Use this dictionary: - to rename the distance column with distanceColumn=newname - to not return metadata columns with indexOnly=True - to return TSS matched patterns with returnMatches=True - to force a TSS search on partitioned tables with failing partitions with force=True | Yes |
filter | list of tuples | List of filter conditions, parse tree style. | No |
sort_columns | list of str | The columns by which to sort the results. | No |
group_by | list of str | The column values by which to group the results. | No |
aggs | dictionary | Aggregation rules. | No |
Example
db = session.database("default")
table = db.table("myTable")
table.search(vectors={"indexName":v},n=10)
options
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
distanceColumn | Rename distance column to this. | str | No | None |
indexOnly | Return only index information | bool | No | None |
index_params
index_params is a dictionary where the key is the index name and the value is a dictionary with the arguments below.
Attribute | Description | Type | Required | Default |
---|---|---|---|---|
weight | Weight for each index. | float | Required for multi index input. | None |
Important! For multi index searches, you have to allocate a weight to each index. The sum of all weights must be equal to 1.
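The weight rule can be enforced locally before calling search. This is a minimal sketch: weighted_index_params is a hypothetical helper, the floating-point tolerance is an assumption, and the index names are taken from the create_table example purely for illustration.

```python
import math


def weighted_index_params(weights):
    """Turn {index_name: weight} into index_params, checking the weights sum to 1."""
    total = sum(weights.values())
    # Compare with a small tolerance because float sums are rarely exact.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"index weights must sum to 1, got {total}")
    return {name: {"weight": weight} for name, weight in weights.items()}


params = weighted_index_params({"fast_hnsw": 0.3, "accurate_hnsw": 0.7})

# With query vectors v1 and v2 (hypothetical) and a running server:
# table.search(vectors={"fast_hnsw": v1, "accurate_hnsw": v2}, n=10, index_params=params)
```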
Error handling:
Description | Message | Troubleshooting |
---|---|---|
Success: Successful query | list of Pandas DataFrames | N/A |
Error: KDBAI is not available | RuntimeError('Error during request, make sure KDB.AI server running') | Check your connection and if your server is running. |