Manage Tables
This section explains how to manage tables in the KDB.AI vector database.
In KDB.AI, the table is the fundamental structure for storing and organizing your data. Each table holds the vector data together with the metadata that defines how the data is indexed and organized.
To create a table object, use the create_table function, which lets you specify the attributes and settings for your table. Alternatively, retrieve an existing table with the table function, which gives access to tables previously defined in your session.
Information on available tables
Once you have connected to your KDB.AI session, you can retrieve information about the existing tables.
session.list()
Schema
Before defining a new table, you must first design the schema. The schema is a Python dictionary that contains a list of columns. For each column you need to define the name and either a pytype or a qtype. The vector embeddings column should contain a vectorIndex attribute with the configuration of the index for similarity search; this column is implicitly an array of float32.
You must define the dimensionality (dims), the similarity metric (metric) and the index type (type) in the vectorIndex attribute.
Parameter | Description |
---|---|
metric | The choice of metric depends on the context and nature of your data. See the metrics available in KDB.AI. |
type | As with the metric, the index type you choose depends on your data and your overall performance requirements. See the indexes available in KDB.AI. |
Depending on the index you choose, additional parameters specific to that index require configuration. For details of these parameters and their default values, refer to the dedicated index section.
schema = {'columns': [
    # Metadata columns: each needs a name and a pytype (or qtype)
    {'name': 'id', 'pytype': 'str'},
    {'name': 'tag', 'pytype': 'boolean'},
    {'name': 'author', 'pytype': 'str'},
    {'name': 'length', 'pytype': 'int'},
    {'name': 'context', 'pytype': 'str'},
    {'name': 'description', 'pytype': 'str'},
    {'name': 'createdDate', 'pytype': 'time'},
    # Vector column: implicitly float32, configured through vectorIndex
    {'name': 'embeddings',
     'vectorIndex': {'dims': 12, 'type': 'hnsw', 'metric': 'L2', 'efConstruction': 8, 'M': 8}}]}
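A misspelled key in a column definition is easy to miss when a schema has many columns. The following is a minimal sketch of a local check, assuming only the rule stated above: every column needs a name and either a pytype or a qtype, unless it is the vector column carrying a vectorIndex. The helper name find_column_problems is hypothetical and is not part of the KDB.AI client.

```python
def find_column_problems(schema):
    """Return a list of human-readable problems found in the schema's columns."""
    problems = []
    for position, column in enumerate(schema.get("columns", [])):
        # Fall back to the position when the column has no usable name.
        label = column.get("name") or f"column #{position}"
        if "name" not in column:
            problems.append(f"{label}: missing 'name'")
        if "vectorIndex" in column:
            # The vector column is implicitly float32, so no type key is expected.
            continue
        if "pytype" not in column and "qtype" not in column:
            problems.append(f"{label}: needs a 'pytype' or a 'qtype'")
    return problems


# Illustrative schema with a deliberately misspelled key ('pytest').
example = {'columns': [
    {'name': 'id', 'pytype': 'str'},
    {'name': 'author', 'pytest': 'str'},
    {'name': 'embeddings', 'vectorIndex': {'dims': 12, 'type': 'hnsw', 'metric': 'L2'}}]}

for problem in find_column_problems(example):
    print(problem)
```

Running a check like this before creating the table reports the author column, because its type key is misspelled.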
Create Table
Within a single session, you can create multiple tables to suit your data organization needs. However, each table can have only one vector column.
In certain cases, it is also possible to create tables without an index column.
When you create a new table, the associated index is generated automatically from the configuration provided in the vectorIndex attribute. Once the schema is defined, setting up the table is straightforward: provide the desired table name along with the schema specified above.
documents = session.create_table("documents", schema)
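When a script may run more than once, it can be convenient to reuse a table that already exists instead of creating it again. The sketch below combines the three calls mentioned in this section. It assumes that session.list() returns the names of the existing tables and that session.table(name) returns an existing table; verify both against your client version. The helper name get_or_create_table is hypothetical.

```python
def get_or_create_table(session, name, schema):
    """Return the existing table called `name`, or create it from `schema`.

    Assumes session.list() yields table names and session.table(name)
    retrieves an existing table; both are assumptions about the client.
    """
    if name in session.list():
        # The table already exists, so the schema argument is not used.
        return session.table(name)
    return session.create_table(name, schema)
```

A call such as documents = get_or_create_table(session, "documents", schema) then works on both the first and subsequent runs. Note that an existing table keeps its original schema and index configuration, whatever schema is passed in.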
Table creation calls are synchronous, so there might be a slight delay while you wait for multiple tables to be created.
The parameters provided to the vectorIndex attribute for index initialization are immutable and cannot be modified after the table has been created. Check your parameters before submitting.
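Because the index parameters cannot be changed later, it can help to review them in one place before calling create_table. The following sketch extracts the vectorIndex configuration from a schema and applies the constraints described in this section: at most one vector column, and dims, metric and type all present. The helper name summarize_vector_index and the shape of its return value are hypothetical. The check that dims is a positive integer is an added assumption, not a documented rule.

```python
REQUIRED_KEYS = ("dims", "metric", "type")


def summarize_vector_index(schema):
    """Return the index settings of a schema as a dict, or None if it has no vector column."""
    vector_columns = [c for c in schema["columns"] if "vectorIndex" in c]
    if len(vector_columns) > 1:
        raise ValueError("a table can have only one vector column")
    if not vector_columns:
        return None  # table without an index column
    column = vector_columns[0]
    config = column["vectorIndex"]
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"vectorIndex is missing: {', '.join(missing)}")
    dims = config["dims"]
    # Assumption: dims must be a positive integer (bool is excluded explicitly).
    if isinstance(dims, bool) or not isinstance(dims, int) or dims <= 0:
        raise ValueError(f"'dims' must be a positive integer, got {dims!r}")
    # Everything else is treated as an index-specific option, e.g. efConstruction or M.
    options = {k: v for k, v in config.items() if k not in REQUIRED_KEYS}
    return {"column": column["name"], "dims": dims, "metric": config["metric"],
            "type": config["type"], "index_options": options}
```

Printing the result of summarize_vector_index(schema) immediately before creating the table gives a last chance to spot a wrong dimensionality or metric.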
Table configuration
The schema and index parameters that you explicitly chose during table creation appear in the table configuration.
documents.schema()
Drop
Deleting a table deletes all the data together with the associated index.
documents.drop()
Do not perform this action on a production database or in any environment where data deletion is not intended, because this action cannot be reverted.
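Since a drop cannot be reverted, one option is to wrap the call in a small guard that requires the caller to repeat the table name. This is a sketch of that pattern, not a feature of the KDB.AI client. The helper name drop_table_confirmed is hypothetical, and the only client call it relies on is the drop() method shown above.

```python
def drop_table_confirmed(table, name, confirm):
    """Drop `table` only when `confirm` repeats the table `name` exactly."""
    if confirm != name:
        raise ValueError(
            f"refusing to drop {name!r}: confirmation {confirm!r} does not match")
    # Irreversible: removes all data together with the associated index.
    table.drop()
```

For example, drop_table_confirmed(documents, "documents", confirm="documents") proceeds, while a mismatched confirmation raises an error before anything is deleted.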