Python API Client

This section contains the API reference for the KDB.AI Python client. For example usage, see the Quickstart Guide.

Session

Session represents a connection to a KDB.AI instance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| api_key | str | API key used for authentication. | None |
| endpoint | str | Server endpoint to connect to. | 'http://localhost:8082' |

Example

Open a session on KDB.AI Cloud with an API key:

import kdbai_client as kdbai

session = kdbai.Session(endpoint='YOUR_INSTANCE_ENDPOINT', api_key='YOUR_API_KEY')

Open a session on a custom KDB.AI instance on http://localhost:8082:

session = kdbai.Session(endpoint='http://localhost:8082')
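
Hard-coding credentials in a script is easy to get wrong, so a common pattern is to read them from the environment. The sketch below only builds the keyword arguments for Session. The environment variable names KDBAI_ENDPOINT and KDBAI_API_KEY and the helper name session_kwargs are assumptions made for this example; the client itself does not define them. The fallback endpoint mirrors the default listed above.

```python
import os


def session_kwargs(env=None):
    """Build keyword arguments for kdbai.Session from environment variables.

    KDBAI_ENDPOINT and KDBAI_API_KEY are hypothetical names chosen for this
    sketch; the client does not read them on its own.
    """
    env = os.environ if env is None else env
    kwargs = {"endpoint": env.get("KDBAI_ENDPOINT", "http://localhost:8082")}
    api_key = env.get("KDBAI_API_KEY")
    if api_key:  # only pass api_key when one is configured
        kwargs["api_key"] = api_key
    return kwargs


# Usage (requires the client and a reachable instance):
#   import kdbai_client as kdbai
#   session = kdbai.Session(**session_kwargs())
```

Passing a plain dict as env makes the helper easy to exercise without touching the real environment.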

list

Retrieve the list of tables.

Returns:

| Type | Description |
| --- | --- |
| List[str] | A list of strings with the names of the existing tables. |

Example
session.list()
["trade", "quote"]

table

Retrieve an existing table, for example one that was created in a previous session.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Name of the table to retrieve. | required |

Returns:

| Type | Description |
| --- | --- |
| Table | A Table object representing the KDB.AI table. |

Example

Retrieve the trade1 table from a new session:

session1 = kdbai.Session(endpoint='http://localhost:8082') # Previous session
table1 = session1.create_table('trade1', schema)           # Create table 'trade1'

session2 = kdbai.Session(endpoint='http://localhost:8082') # Current session
table2 = session2.table("trade1")                          # Retrieve table 'trade1'

create_table

Create a table with a schema.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Name of the table to create. | required |
| schema | dict | Schema of the table to create. The schema must contain a list of columns. Every column must specify either a pytype or a qtype, except the column of vectors. One column of vector embeddings may also have a vectorIndex attribute holding the configuration of the index for similarity search; this column is implicitly an array of float32. | required |

Returns:

| Type | Description |
| --- | --- |
| Table | A newly created Table object based on the schema. |

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during the creation of the table. |

Example Flat Index

schema = {'columns': [{'name': 'id', 'pytype': 'str'},
                      {'name': 'tag', 'pytype': 'str'},
                      {'name': 'text', 'pytype': 'bytes'},
                      {'name': 'embeddings',
                       'vectorIndex': {'dims': 1536, 'metric': 'L2', 'type': 'flat'}}]}
table = session.create_table('documents', schema)

Example IVF Index

schema = {'columns': [{'name': 'id', 'pytype': 'str'},
                      {'name': 'tag', 'pytype': 'str'},
                      {'name': 'text', 'pytype': 'bytes'},
                      {'name': 'embeddings',
                       'vectorIndex': {'trainingVectors': 1000,
                                       'metric': 'CS',
                                       'type': 'ivf',
                                       'nclusters': 10}}]}
table = session.create_table('documents', schema)

Example IVFPQ Index

schema = {'columns': [{'name': 'id', 'pytype': 'str'},
                      {'name': 'tag', 'pytype': 'str'},
                      {'name': 'text', 'pytype': 'bytes'},
                      {'name': 'embeddings',
                       'vectorIndex': {'trainingVectors': 5000,
                                       'metric': 'L2',
                                       'type': 'ivfpq',
                                       'nclusters': 50,
                                       'nsplits': 8,
                                       'nbits': 8}}]}
table = session.create_table('documents', schema)
Example HNSW Index

```python
schema = {'columns': [{'name': 'id', 'pytype': 'str'},
                      {'name': 'tag', 'pytype': 'str'},
                      {'name': 'text', 'pytype': 'bytes'},
                      {'name': 'embeddings',
                       'vectorIndex': {'dims': 1536,
                                       'metric': 'IP',
                                       'type': 'hnsw',
                                       'efConstruction': 8,
                                       'M': 8}}]}
table = session.create_table('documents', schema)
```

Example Sparse Index

```python
schema = {'columns': [{'name': 'id', 'pytype': 'str'},
                      {'name': 'tag', 'pytype': 'str'},
                      {'name': 'text', 'pytype': 'bytes'},
                      {'name': 'embeddings',
                       'sparseIndex': {'k': 1.25, 'b': 0.75}}]}
table = session.create_table('documents', schema)
```

Example Flat + Sparse Indexes:

schema = {'columns': [{'name': 'id', 'pytype': 'str'},
                      {'name': 'tag', 'pytype': 'str'},
                      {'name': 'text', 'pytype': 'bytes'},
                      {'name': 'denseCol',
                       'vectorIndex': {'dims': 1536,
                                       'metric': 'L2',
                                       'type': 'flat'}},
                      {'name': 'sparseCol',
                       'sparseIndex': {'k': 1.25,
                                       'b': 0.75}}]}
table = session.create_table('documents', schema)
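
The schema rules described for create_table can be checked locally before the call is made. The helper below is a sketch, not part of the client: check_schema is a hypothetical name, and it encodes only the rules stated in this section (a list of columns, each with a pytype or qtype unless it is an index column). Treating sparseIndex columns as exempt too is an assumption drawn from the examples above rather than from an explicit statement.

```python
def check_schema(schema):
    """Return a list of problems found in a create_table schema dict."""
    columns = schema.get("columns")
    if not isinstance(columns, list):
        return ["schema must contain a list under 'columns'"]
    problems = []
    for col in columns:
        name = col.get("name")
        if not name:
            problems.append("column without a name")
            continue
        # Assumption: index columns (vectorIndex or sparseIndex) may omit a
        # type, as they do in the examples in this section.
        is_index_column = "vectorIndex" in col or "sparseIndex" in col
        has_type = "pytype" in col or "qtype" in col
        if not (has_type or is_index_column):
            problems.append(f"column {name!r} needs a pytype or qtype")
    return problems


schema = {"columns": [{"name": "id", "pytype": "str"},
                      {"name": "tag"},  # deliberately missing a type
                      {"name": "embeddings",
                       "vectorIndex": {"dims": 1536, "metric": "L2", "type": "flat"}}]}
print(check_schema(schema))  # ["column 'tag' needs a pytype or qtype"]
```

An empty result means the schema passed these local checks; the server may still reject it for other reasons, in which case create_table raises KDBAIException.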

Table

KDB.AI table.

A Table object should be created with session.create_table(...) or retrieved with session.table(...). The constructor should not be called directly.

schema

Retrieve the schema of the table.

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during schema retrieval. |

Returns:

| Type | Description |
| --- | --- |
| Dict | A dict containing the table name and the list of column names and appropriate numpy datatypes. |

Example
table.schema()

{'columns': [{'name': 'id', 'pytype': 'str', 'qtype': 'symbol'},
              {'name': 'tag', 'pytype': 'str', 'qtype': 'symbol'},
              {'name': 'text', 'pytype': 'bytes', 'qtype': 'string'},
              {'name': 'embeddings',
               'pytype': 'float32',
               'qtype': 'reals',
               'vectorIndex': {'dims': 1536, 'metric': 'L2', 'type': 'flat'}}]}
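
Because schema() returns a plain dict, index settings can be read with ordinary dictionary access. The sketch below uses the output shown above as a literal; index_columns is a hypothetical helper name, not a client method.

```python
def index_columns(schema):
    """Map each column that has a vectorIndex to its index configuration."""
    return {col["name"]: col["vectorIndex"]
            for col in schema["columns"]
            if "vectorIndex" in col}


# The dict from the example above, as returned by table.schema()
schema = {"columns": [{"name": "id", "pytype": "str", "qtype": "symbol"},
                      {"name": "tag", "pytype": "str", "qtype": "symbol"},
                      {"name": "text", "pytype": "bytes", "qtype": "string"},
                      {"name": "embeddings",
                       "pytype": "float32",
                       "qtype": "reals",
                       "vectorIndex": {"dims": 1536, "metric": "L2", "type": "flat"}}]}

indexes = index_columns(schema)
print(indexes["embeddings"]["dims"])  # 1536
```

Knowing dims up front makes it possible to check the length of query vectors before calling search.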

train

Train the index (IVF and IVFPQ only).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | Pandas dataframe with column names/types matching the target table. | required |
| warn | bool | If True, display a warning when data has a trivial index, which will be dropped before training. | True |

Returns:

| Type | Description |
| --- | --- |
| str | A string containing the status after training. |

Examples:

from datetime import datetime, timedelta

import numpy as np
import pandas as pd

ROWS = 50
DIMS = 10

data = {
    "time": [timedelta(microseconds=np.random.randint(0, int(1e10))) for _ in range(ROWS)],
    "sym": [f"sym_{np.random.randint(0, 999)}" for _ in range(ROWS)],
    "realTime": [datetime.utcnow() for _ in range(ROWS)],
    "price": [np.random.rand(DIMS).astype(np.float32) for _ in range(ROWS)],
    "size": [np.random.randint(1, 100) for _ in range(ROWS)],
}
df = pd.DataFrame(data)
table.train(df)

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during training. |

insert

Insert data into the table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | Pandas dataframe with column names/types matching the target table. | required |
| warn | bool | If True, display a warning when data has a trivial index, which will be dropped before insertion. | True |

Returns:

| Type | Description |
| --- | --- |
| bool | A boolean which is True if the insertion was successful. |

Examples:

from datetime import datetime, timedelta

import numpy as np
import pandas as pd

ROWS = 50
DIMS = 10

data = {
    "time": [timedelta(microseconds=np.random.randint(0, int(1e10))) for _ in range(ROWS)],
    "sym": [f"sym_{np.random.randint(0, 999)}" for _ in range(ROWS)],
    "realTime": [datetime.utcnow() for _ in range(ROWS)],
    "price": [np.random.rand(DIMS).astype(np.float32) for _ in range(ROWS)],
    "size": [np.random.randint(1, 100) for _ in range(ROWS)],
}
df = pd.DataFrame(data)
table.insert(df)

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during insert. |
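
Since the dataframe's column names must match the target table, it can help to compare them against the result of table.schema() before inserting. The helper below is a sketch under that assumption; column_mismatch is a hypothetical name, and it compares names only, not types.

```python
def column_mismatch(schema, data_columns):
    """Compare data column names with the columns declared in a schema.

    Returns (missing, unexpected): names the table expects but the data
    lacks, and names present in the data but absent from the table.
    """
    expected = [col["name"] for col in schema["columns"]]
    present = list(data_columns)
    missing = [name for name in expected if name not in present]
    unexpected = [name for name in present if name not in expected]
    return missing, unexpected


schema = {"columns": [{"name": "id", "pytype": "str"},
                      {"name": "tag", "pytype": "str"},
                      {"name": "embeddings",
                       "vectorIndex": {"dims": 1536, "metric": "L2", "type": "flat"}}]}

# With a dataframe df, pass df.columns as the second argument.
print(column_mismatch(schema, ["id", "embeddings", "score"]))
# (['tag'], ['score'])
```

Two empty lists mean the names line up; a type mismatch would still surface as a KDBAIException from insert.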

query

Query data from the table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| filter | Optional[List[list]] | A list of filter conditions as triplets in the format [['function', 'column name', 'parameter'], ...]. See the documentation for the full list of filter operators. | None |
| group_by | Optional[str] | A list of column names to use for group by. | None |
| aggs | Optional[List[list]] | Either a list of column names to select, or a list of aggregations to perform, given as triplets in the form [['output_column', 'agg_function', 'input_column'], ...]. See the documentation for the full list of aggregation functions. | None |
| sort_by | Optional[List[str]] | List of column names to sort on. | None |
| fill | Optional[str] | How to handle null values. Must be 'forward', 'zero' or None. | None |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Pandas dataframe with the query results. |

Examples:

table.query(group_by=['sensorID', 'qual'])
table.query(filter=[['within', 'qual', [0, 2]]])

# Select subset of columns
table.query(aggs=['size'])
table.query(aggs=['size', 'price'])

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during query. |
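
Filters are plain nested lists, so they are easy to assemble programmatically and just as easy to get subtly wrong. The sketch below uses a hypothetical helper, check_filter, to confirm that every condition has the [function, column name, parameter] shape before it is passed to query. It covers only the flat triplet form described above and does not verify that the operator itself is supported.

```python
def check_filter(conditions):
    """Return the conditions as a list, raising ValueError on a malformed one.

    Each condition must be a 3-item list or tuple whose first two items
    (function and column name) are strings.
    """
    for cond in conditions:
        if not isinstance(cond, (list, tuple)) or len(cond) != 3:
            raise ValueError(f"expected [function, column, parameter], got {cond!r}")
        function, column, _parameter = cond
        if not isinstance(function, str) or not isinstance(column, str):
            raise ValueError(f"function and column must be strings in {cond!r}")
    return list(conditions)


conditions = check_filter([["within", "qual", [0, 2]],
                           ["like", "sym", "AAP*"]])
# With a real table: table.query(filter=conditions)
```

Failing early on the client side gives a clearer message than a KDBAIException raised after a round trip to the server.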

search

Perform a similarity search on the table. Both dense and sparse queries are supported.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| vectors | List[list] \| List[dict] | Query vectors for the search. | required |
| n | int | Number of neighbours to return. | 1 |
| index_options | dict | Index-specific options for similarity search. | None |
| distances | str | Optional name of the column that receives the distances. If not specified, __nn_distance is added as an extra column to the result table. | None |
| filter | Optional[List[list]] | A list of filter conditions as triplets in the format [['function', 'column name', 'parameter'], ...]. See the documentation for the full list of filter operators. | None |
| group_by | Optional[str] | A list of column names to use for group by. | None |
| aggs | Optional[List[list]] | Either a list of column names to select, or a list of aggregations to perform, given as triplets in the form [['output_column', 'agg_function', 'input_column'], ...]. See the documentation for the full list of aggregation functions. | None |
| sort_by | Optional[List[str]] | List of column names to sort on. | None |

Returns:

| Type | Description |
| --- | --- |
| List[DataFrame] | List of Pandas dataframes with one dataframe of matching neighbors for each query vector. |
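
To make the shape of the result concrete, the sketch below runs a brute-force nearest-neighbour search in plain Python: each query vector yields its own list of the n closest rows. This is a conceptual illustration only. It uses squared L2 distance and hypothetical names, it is not how KDB.AI implements search, and the real method returns dataframes rather than lists of tuples.

```python
def brute_force_search(rows, queries, n=1):
    """Return, for each query, the n rows with the smallest squared L2 distance.

    rows is a list of (row_id, vector) pairs; the result has one list per query.
    """
    results = []
    for query in queries:
        scored = []
        for row_id, vector in rows:
            distance = sum((a - b) ** 2 for a, b in zip(vector, query))
            scored.append((distance, row_id))
        scored.sort()  # closest first; ties are broken by row_id
        results.append([(row_id, distance) for distance, row_id in scored[:n]])
    return results


rows = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [3.0, 4.0])]
matches = brute_force_search(rows, queries=[[0.0, 0.0], [1.0, 2.0]], n=2)
print(len(matches))  # 2, one result list per query vector
print(matches[0])    # [('a', 0.0), ('b', 2.0)]
```

The same indexing applies to the real return value: the first element holds the neighbours of the first query vector, the second element those of the second, and so on.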

Examples:

# Find the closest neighbour of a single (dense) query vector
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0]], n=1)

# Find the closest neighbour of a single (sparse) query vector
table.search(vectors=[{101:1,4578:1,102:1}], n=1)

# Find the 3 closest neighbours of 2 query vectors
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0], [1,1,1,1,1,1,1,1,1,1]], n=3)

# With aggregation and sorting
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]],
             n=3,
             aggs=[['sumSize','sum','size']],
             group_by=['sym'],
             sort_by=['sumSize'])

# Return a subset of columns for each match
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]], n=3, aggs=['size'])
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]], n=3, aggs=['size', 'price'])

# Filter
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]],
             n=3,
             filter=[['within','size',(5,999)],['like','sym','AAP*']])

# Customized distance name
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]],
             n=3,
             distances='myDist')

# Index options
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]],
             n=3,
             index_options=dict(efSearch=512))
table.search(vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]],
             n=3,
             index_options=dict(clusters=16))

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during search. |
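
Sparse query vectors are dicts that map an integer index to a weight, as in {101:1,4578:1,102:1}. One way to build such a dict is to count occurrences, assuming the indices are token ids produced by a tokenizer of your choice and the weights are occurrence counts. The token ids below are made up for illustration, and to_sparse_vector is a hypothetical helper name.

```python
from collections import Counter


def to_sparse_vector(token_ids):
    """Count how often each token id occurs and return a plain dict."""
    return dict(Counter(token_ids))


# Hypothetical token ids for a short query; 6079 appears twice
sparse = to_sparse_vector([101, 4578, 6079, 6079, 102])
print(sparse)  # {101: 1, 4578: 1, 6079: 2, 102: 1}
# With a real table: table.search(vectors=[sparse], n=1)
```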

hybrid_search

Perform hybrid search on the table.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dense_vectors | list of lists | Dense query vectors for the search. | required |
| sparse_vectors | list of dicts | Sparse query vectors for the search. | required |
| n | int | Number of neighbours to return. | 1 |
| dense_index_options | dict | Index-specific options for the dense similarity search. | None |
| sparse_index_options | dict | Index-specific options for the sparse similarity search. | None |
| alpha | float | Weight between the two strategies, in [0, 1]: 0 is sparse only, 1 is dense only. | 0.5 |
| distances | str | Optional name of the column that receives the distances. If not specified, __nn_distance is added as an extra column to the result table. | None |
| filter | Optional[List[list]] | A list of filter conditions as triplets in the format [['function', 'column name', 'parameter'], ...]. See the documentation for the full list of filter operators. | None |
| group_by | Optional[str] | A list of column names to use for group by. | None |
| aggs | Optional[List[list]] | Either a list of column names to select, or a list of aggregations to perform, given as triplets in the form [['output_column', 'agg_function', 'input_column'], ...]. See the documentation for the full list of aggregation functions. | None |
| sort_by | Optional[List[str]] | List of column names to sort on. | None |

Returns:

| Type | Description |
| --- | --- |
| List[DataFrame] | List of Pandas dataframes with one dataframe of matching neighbors for each query vector. |

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during search. |

Examples:

# Find the closest neighbour of a single hybrid query vector
table.hybrid_search(dense_vectors=[[0,0,0,0,0,0,0,0,0,0]],
                    sparse_vectors=[{101:1,4578:1,102:1}],
                    n=1)

# Find the 3 closest neighbours for 2 hybrid queries
table.hybrid_search(dense_vectors=[[0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1]],
                    sparse_vectors=[{101:1,4578:1,102:1},{101:1,6079:2,102:1}],
                    n=3)

# Weight the sparse leg of the query higher setting alpha = 0.1
table.hybrid_search(dense_vectors=[[0,0,0,0,0,0,0,0,0,0]],
                    sparse_vectors=[{101:1,4578:1,102:1}],
                    alpha=0.1,
                    n=1)

# Filter
table.hybrid_search(dense_vectors=[[0,0,0,0,0,0,0,0,0,0]],
                    sparse_vectors=[{101:1,4578:1,102:1}],
                    n=1,
                    filter=[['within','size',(5,999)],['like','sym','AAP*']])

# Index options
table.hybrid_search(dense_vectors=[[0,0,0,0,0,0,0,0,0,0]],
                    sparse_vectors=[{101:1,4578:1,102:1}],
                    n=1,
                    dense_index_options=dict(efSearch=521),
                    sparse_index_options={'k':1.4,'b':0.78})

# Customized distance name
table.hybrid_search(dense_vectors=[[0,0,0,0,0,0,0,0,0,0]],
                    sparse_vectors=[{101:1,4578:1,102:1}],
                    n=1,
                    distances='myDist')
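
The alpha parameter balances the two legs of a hybrid query: 0 favours the sparse leg and 1 favours the dense leg. One simple way to picture this is a linear blend of two per-row scores. The sketch below illustrates that idea only; this reference does not state the formula KDB.AI actually uses, and blend_scores and the score values are hypothetical.

```python
def blend_scores(dense_score, sparse_score, alpha=0.5):
    """Linearly interpolate between a sparse score (alpha=0) and a dense score (alpha=1)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * dense_score + (1.0 - alpha) * sparse_score


print(blend_scores(0.75, 0.25, alpha=1.0))  # 0.75, dense only
print(blend_scores(0.75, 0.25, alpha=0.0))  # 0.25, sparse only
print(blend_scores(0.75, 0.25, alpha=0.5))  # 0.5, equal weight
```

Under this reading, alpha=0.1 as in the example above would let the sparse score contribute 90% of the combined value.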

drop

Drop the table.

Returns:

| Type | Description |
| --- | --- |
| bool | A boolean which is True if the table was successfully dropped. |

Examples:

table.drop()

Raises:

| Type | Description |
| --- | --- |
| KDBAIException | Raised when an error occurs during the table deletion. |
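
Combining list, table and drop gives an idempotent clean-up step, which is handy at the top of a script that recreates its tables. The sketch below uses a hypothetical helper, drop_if_exists, together with small stand-in classes so that it runs without a server. The stand-ins mimic only the three documented calls; with the real client you would pass a kdbai.Session instead.

```python
def drop_if_exists(session, name):
    """Drop the named table if the session lists it; return True if it was dropped."""
    if name not in session.list():
        return False
    return session.table(name).drop()


# Stand-ins for illustration only; they are not part of the client.
class FakeTable:
    def __init__(self, session, name):
        self._session, self._name = session, name

    def drop(self):
        self._session.names.remove(self._name)
        return True


class FakeSession:
    def __init__(self, names):
        self.names = list(names)

    def list(self):
        return list(self.names)

    def table(self, name):
        return FakeTable(self, name)


session = FakeSession(["trade", "quote"])
print(drop_if_exists(session, "trade"))  # True
print(drop_if_exists(session, "trade"))  # False, already gone
print(session.list())                    # ['quote']
```

Checking list() first avoids relying on a KDBAIException to signal a missing table.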

KDBAIException

Bases: Exception

KDB.AI exception.