Performing Searches
This section describes how to run similarity searches. For more advanced details on filtered similarity searches, see here. Similarity searches in KDB.AI are based on (approximate) nearest neighbor algorithms.
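To make the idea of a nearest neighbor search concrete, the following is a minimal sketch of an exact, brute-force search in plain Python. It is not how KDB.AI implements search; it only illustrates the result that an approximate algorithm tries to reproduce more cheaply. The function name `nearest_neighbors`, the use of Euclidean distance, and the sample vectors are all assumptions made for illustration.

```python
import math

def nearest_neighbors(query, vectors, n):
    """Return the indices of the n stored vectors closest to query.

    Illustrative brute-force search using Euclidean distance; every
    stored vector is compared against the query.
    """
    distances = [(math.dist(query, v), i) for i, v in enumerate(vectors)]
    distances.sort()  # smallest distance first
    return [i for _, i in distances[:n]]

# Made-up 2-dimensional embeddings, for illustration only.
stored = [
    [0.0, 0.0],
    [1.0, 1.0],
    [5.0, 5.0],
    [0.9, 1.2],
]

closest = nearest_neighbors([1.0, 1.0], stored, n=2)
print(closest)  # indices of the two closest stored vectors
```

A brute-force scan like this compares the query with every stored vector, which is why approximate methods are typically preferred for large tables.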
Selecting the table to search
Each table in KDB.AI has an associated name. To perform a search, specify the table in which the relevant vector embeddings are stored. Using the Python client, you can create a table object from the session.
documents = session.table("documents")
Searching
Given a new vector embedding, you can search for its nearest neighbors. In this example, the embeddings are assumed to be 12-dimensional and the number of nearest neighbors is set to 3.
documents.search(vectors=[[1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0]], n=3)
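A query vector is only meaningful if it has the same number of elements as the stored embeddings, so it can be useful to check this on the client before searching. The sketch below is a hypothetical helper (`check_dimensions` is not part of the KDB.AI client), and the expected dimension of 12 simply matches the number of elements in the query vector used in the call above.

```python
def check_dimensions(vectors, dims):
    """Raise ValueError if any query vector does not have exactly dims elements.

    Hypothetical client-side helper; returns the input unchanged on success.
    """
    for i, v in enumerate(vectors):
        if len(v) != dims:
            raise ValueError(
                f"query vector {i} has {len(v)} elements, expected {dims}"
            )
    return vectors

query = [[1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]]
check_dimensions(query, dims=12)

# With a live session, the validated vectors could then be passed on:
# documents.search(vectors=query, n=3)
```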
Batch searches
For larger workloads it can be helpful to send multiple query vectors at once.
documents.search(vectors=[[1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0],[1.0,7.0,1.0,1.0,7.0,1.0,1.0,7.0,1.0,1.0,7.0,1.0]], n=3)
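When there are many query vectors, one option is to split them into fixed-size batches and issue one search per batch. The following is a minimal sketch of that splitting step; the helper name `chunked` and the batch size of 2 are assumptions for illustration, and a suitable batch size depends on the workload.

```python
def chunked(vectors, size):
    """Yield successive lists of at most size query vectors."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(vectors), size):
        yield vectors[start:start + size]

# Five made-up 12-element query vectors.
queries = [[float(i)] * 12 for i in range(5)]
batches = list(chunked(queries, size=2))

# With a live session, each batch could be sent in a single call:
# for batch in batches:
#     documents.search(vectors=batch, n=3)
```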
Processing results
It is possible to return a subset of the columns in the table, reducing the amount of data sent back to the client.
documents.search(vectors=[[1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0]], n=3, aggs=[["author"],["context"]])
In addition to returning a subset of the columns, you can return aggregated results, group by categorical variables, and sort based on a column name.
documents.search(vectors=[[1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0]], n=3, aggs=[('sumLength','sum','length')], group_by=['author'], sort_by=['sumLength'])
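To show what a grouped aggregation of this kind computes, the sketch below performs the equivalent steps locally in plain Python: group rows by `author`, sum `length` into `sumLength`, and sort by `sumLength`. It is only an illustration of the semantics, not KDB.AI's implementation; the rows are made-up sample data, the helper name is hypothetical, and the ascending sort order is an assumption.

```python
from collections import defaultdict

def sum_length_by_author(rows):
    """Group rows by author, sum their length, and sort by the total."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["author"]] += row["length"]
    return sorted(
        ({"author": author, "sumLength": total} for author, total in totals.items()),
        key=lambda r: r["sumLength"],  # ascending order is an assumption
    )

# Made-up rows standing in for the nearest neighbors returned by a search.
rows = [
    {"author": "Austen", "length": 120},
    {"author": "Borges", "length": 80},
    {"author": "Austen", "length": 30},
]

summary = sum_length_by_author(rows)
print(summary)
```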