Manage Tables

This section provides details on managing tables within the KDB.AI vector database.

In KDB.AI, the table is the fundamental structure for storing and organizing your data. Each table not only holds the actual vector data, but also includes crucial metadata that defines how the data is indexed and organized.

To create a table object, use the create_table function, which lets you specify the attributes and settings for your table. Alternatively, retrieve an existing table with the table function, which gives access to tables previously defined in your session.

Information on available tables

After you have connected to your KDB.AI session, you can retrieve information about the existing tables.

session.list()
curl -s localhost:8082/api/v1/config/table

Schema

Before defining a new table, you must first design the schema. The schema is a Python dictionary that contains a list of columns. For each column, define its name and its pytype.

The column name must be unique within a table. In addition, avoid using the reserved column names date, int or the prefix label_.
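The naming rules above can be checked before a schema is submitted. The following is a minimal sketch, not part of the KDB.AI client: check_column_names is a hypothetical helper, and it assumes the reserved names are matched exactly and case-sensitively.

```python
RESERVED_NAMES = {"date", "int"}   # reserved column names listed above
RESERVED_PREFIX = "label_"         # reserved column-name prefix listed above


def check_column_names(schema):
    """Return a list of naming problems found in the schema's columns."""
    problems = []
    seen = set()
    for column in schema["columns"]:
        name = column["name"]
        if name in seen:
            problems.append(f"duplicate column name: {name}")
        seen.add(name)
        if name in RESERVED_NAMES:
            problems.append(f"reserved column name: {name}")
        if name.startswith(RESERVED_PREFIX):
            problems.append(f"reserved prefix {RESERVED_PREFIX!r} in: {name}")
    return problems


# Hypothetical schema that breaks each rule once.
example = {"columns": [
    {"name": "id", "pytype": "int16"},
    {"name": "date", "pytype": "datetime64[ns]"},
    {"name": "label_topic", "pytype": "str"},
    {"name": "id", "pytype": "int32"},
]}

for problem in check_column_names(example):
    print(problem)
```

An empty list means the column names satisfy the rules stated in this section; it says nothing about other constraints the server may enforce.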

The vector embeddings column should contain a vectorIndex attribute with the configuration of the index for similarity search; this column is implicitly an array of float32.

You must define the similarity metric and index type with the vectorIndex attribute.

Parameter   Description
metric      The similarity metric. The right choice depends on the context and nature of your data. See the metrics available in KDB.AI.
type        The index type. As with metrics, the right choice depends on your data and your overall performance requirements. See the indexes available in KDB.AI.

Depending on the choice of an index, there are additional parameters specific to that index that require configuration. For more information about these parameters and their default values, please refer to the dedicated index section.
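Since metric and type are both mandatory, a quick local check can catch an incomplete vectorIndex before the table is created. This is a sketch only: missing_index_settings is a hypothetical helper, and it checks just the two required keys named above, not the index-specific parameters.

```python
REQUIRED_INDEX_KEYS = ("type", "metric")  # required by the vectorIndex attribute


def missing_index_settings(schema):
    """Map each vector column name to the required index keys it lacks."""
    missing = {}
    for column in schema["columns"]:
        index = column.get("vectorIndex")
        if index is None:
            continue  # not a vector column
        absent = [key for key in REQUIRED_INDEX_KEYS if key not in index]
        if absent:
            missing[column["name"]] = absent
    return missing


# Hypothetical schema whose index configuration omits the metric.
incomplete_schema = {"columns": [
    {"name": "id", "pytype": "int16"},
    {"name": "embeddings", "vectorIndex": {"dims": 12, "type": "hnsw"}},
]}

print(missing_index_settings(incomplete_schema))
```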

schema = {'columns': [
     {'name': 'id', 'pytype': 'int16'},
     {'name': 'tag', 'pytype': 'bool'},
     {'name': 'author', 'pytype': 'str'},
     {'name': 'length', 'pytype': 'int32'},
     {'name': 'content', 'pytype': 'str'},
     {'name': 'createdDate', 'pytype': 'datetime64[ns]'},
     {'name': 'embeddings',
         'vectorIndex': {'dims': 12, 'type': 'hnsw', 'metric': 'L2', 'efConstruction': 8, 'M': 8}}]}
{
  "type": "splayed",
  "columns": [
    {"name": "id", "type": "short"},
    {"name": "tag", "type": "boolean"},
    {"name": "author", "type": "char"},
    {"name": "length", "type": "int"},
    {"name": "content", "type": "char"},
    {"name": "createdDate", "type": "timestamp"},
    {
      "name": "embeddings",
      "type": "reals",
      "vectorIndex": {
        "dims": 12,
        "type": "hnsw",
        "metric": "L2",
        "efConstruction": 8,
        "M": 8
      }
    }
  ]
}
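The Python and REST schemas above describe the same table with different type names. The sketch below derives the REST payload from the Python dictionary. The type mapping is inferred only from these two examples and is an assumption, not an official conversion table. to_rest_schema is a hypothetical helper, and pytypes not shown above raise a KeyError.

```python
import json

# Mapping inferred from the side-by-side examples above (assumption).
PYTYPE_TO_REST = {
    "int16": "short",
    "bool": "boolean",
    "str": "char",
    "int32": "int",
    "datetime64[ns]": "timestamp",
}


def to_rest_schema(schema, table_type="splayed"):
    """Build a REST-style schema payload from a Python-style schema."""
    columns = []
    for column in schema["columns"]:
        rest_column = {"name": column["name"]}
        if "vectorIndex" in column:
            # The REST example declares the vector column as "reals".
            rest_column["type"] = "reals"
            rest_column["vectorIndex"] = dict(column["vectorIndex"])
        else:
            rest_column["type"] = PYTYPE_TO_REST[column["pytype"]]
        columns.append(rest_column)
    return {"type": table_type, "columns": columns}


schema = {"columns": [
    {"name": "id", "pytype": "int16"},
    {"name": "author", "pytype": "str"},
    {"name": "embeddings",
     "vectorIndex": {"dims": 12, "type": "hnsw", "metric": "L2",
                     "efConstruction": 8, "M": 8}},
]}

rest_schema = to_rest_schema(schema)
print(json.dumps(rest_schema, indent=2))
```

The printed JSON could be saved to a file and passed to curl with -d @file, as in the REST examples in this section.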

A schema defined in this way is also the starting point for creating a Sparse Index or for running a Hybrid Search, a Transformed Temporal Similarity Search, or a Non-Transformed Temporal Similarity Search.

Create table

You can create multiple tables to suit your data organization needs. However, each table can have only one vector column for storing vector data. In some cases, you can also create a table without an index column.

When you create a new table, the associated index is generated automatically from the configuration provided in the vectorIndex attribute. To set up a table after defining its schema, provide the desired table name along with the schema specified above.

documents = session.create_table("documents", schema)
curl -H "Content-Type: application/json" -d @schemaAbove.json localhost:8082/api/v1/config/table/documents

Schema creation calls are synchronous, so expect a slight delay when creating multiple tables, as each call returns only once its table has been created successfully.

The parameters provided to the vectorIndex attribute for index initialization are immutable and cannot be modified after the table has been created. It's recommended to check your parameters before submitting.
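Because the index parameters cannot be changed after creation, a script that may run more than once should reuse an existing table instead of trying to create it again. The sketch below combines the list, table and create_table calls mentioned in this section. It assumes that session.list() returns the table names. get_or_create_table and FakeSession are hypothetical, and FakeSession is only an in-memory stand-in so the example runs without a server.

```python
def get_or_create_table(session, name, schema):
    """Return the table called `name`, creating it only if it is missing."""
    if name in session.list():  # assumption: list() yields table names
        return session.table(name)
    return session.create_table(name, schema)


class FakeSession:
    """In-memory stand-in for a KDB.AI session, for illustration only."""

    def __init__(self):
        self.tables = {}

    def list(self):
        return list(self.tables)

    def table(self, name):
        return self.tables[name]

    def create_table(self, name, schema):
        self.tables[name] = {"name": name, "schema": schema}
        return self.tables[name]


session = FakeSession()
schema = {"columns": [{"name": "id", "pytype": "int16"}]}

first = get_or_create_table(session, "documents", schema)
second = get_or_create_table(session, "documents", schema)  # reuses the table
print(first is second)
```

With a real session, the same function would leave an existing table and its index configuration untouched, so it is still worth comparing the stored configuration with the schema you expect.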

Table configuration

The schema and index parameters that you explicitly chose during table creation appear in the table configuration.

documents.schema()
curl -s localhost:8082/api/v1/config/table/documents

Delete table

Deleting a table deletes all the data together with the associated index.

You can delete a table in the KDB.AI Cloud UI from the Tables section, using the trash icon. You can only delete a complete table, not row-level data.

Alternatively, use the Python client or the REST API:

documents.drop()
curl -s -X DELETE localhost:8082/api/v1/config/table/documents

Do not perform this action on a production database or in any environment where data deletion is not intended, because this action cannot be reverted.
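Since a drop cannot be undone, one option is to wrap it in a guard that requires the caller to repeat the table name. This is a sketch of that idea, not a KDB.AI feature. drop_table and FakeTable are hypothetical, and FakeTable only stands in for a real table object that exposes drop().

```python
def drop_table(table, table_name, confirm):
    """Drop `table` only when `confirm` repeats its name exactly."""
    if confirm != table_name:
        raise ValueError(
            f"refusing to drop {table_name!r}: confirmation does not match"
        )
    table.drop()


class FakeTable:
    """Stand-in for a table object, for illustration only."""

    def __init__(self):
        self.dropped = False

    def drop(self):
        self.dropped = True


documents = FakeTable()

try:
    drop_table(documents, "documents", confirm="docs")  # mismatch, no drop
except ValueError as error:
    print(error)

drop_table(documents, "documents", confirm="documents")  # confirmed
print(documents.dropped)
```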

Clean data

This action only applies to KDB.AI Server users.

To clean the data after stopping KDB.AI, run the following:

$ sudo rm -Rfv vecdb/data/* vecdb/logs/* 
$ rm -Rfv vecdb/data/.assembly.* 
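For scripted environments, the same cleanup can be approximated in Python. This is a sketch under assumptions: clean_vecdb is a hypothetical helper, it targets the same paths as the commands above (non-hidden entries in data and logs, plus data/.assembly.*), and it needs the same file permissions that sudo grants in the shell version. The demonstration runs against a temporary directory rather than a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def clean_vecdb(root="vecdb"):
    """Delete the entries that the shell commands above would delete."""
    root = Path(root)
    targets = []
    for sub in ("data", "logs"):
        # Shell `*` skips hidden entries, so filter out dotfiles explicitly.
        targets += [p for p in (root / sub).glob("*")
                    if not p.name.startswith(".")]
    targets += list((root / "data").glob(".assembly.*"))

    removed = []
    for path in targets:
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)
        else:
            path.unlink()
        removed.append(path.name)
    return removed


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "vecdb"
    (root / "data" / "documents").mkdir(parents=True)
    (root / "data" / ".assembly.1").touch()
    (root / "data" / ".keep").touch()
    (root / "logs").mkdir()
    (root / "logs" / "server.log").touch()

    removed = clean_vecdb(root)
    remaining = sorted(p.name for p in (root / "data").iterdir())

print(sorted(removed))
print(remaining)
```

As with the shell commands, run this only after KDB.AI has been stopped.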

Next steps

Once you have some tables created and your schema ready, you can do the following: