Vector IO
This section explains how to integrate KDB.AI with Vector IO.
Vector IO is a library for migrating data from other vector databases into KDB.AI. It stores data on disk in a universal vector dataset format (VDF). Integrating KDB.AI with Vector IO gives you three main functions:
- export
- import
- re-embed data.
Workflow
Try out this integration by following the instructions below.
Prerequisites
Ensure you have an active license for KDB.AI. You can easily sign up for free here.
Connect to KDB.AI
Enter the KDB.AI endpoint and your API Key.
You can follow instructions on obtaining your KDB.AI license and API key here.
from getpass import getpass
import kdbai_client as kdbai
KDBAI_ENDPOINT = getpass("KDB.AI endpoint: ")
KDBAI_API_KEY = getpass("KDB.AI API key: ")
session = kdbai.Session(api_key=KDBAI_API_KEY, endpoint=KDBAI_ENDPOINT)
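If you run this workflow repeatedly, you may prefer not to type the credentials each time. The following is a minimal sketch, not part of the KDB.AI client: the helper name load_kdbai_credentials and the environment variable names KDBAI_ENDPOINT and KDBAI_API_KEY are assumptions chosen for illustration. It reads each value from the environment and falls back to a prompt when the variable is missing.

```python
import os
from getpass import getpass


def load_kdbai_credentials(env=None, prompt=getpass):
    """Return (endpoint, api_key), preferring environment variables.

    The variable names below are illustrative assumptions, not names
    that the KDB.AI client itself looks for.
    """
    env = os.environ if env is None else env
    # Fall back to an interactive prompt when a variable is unset or empty.
    endpoint = env.get("KDBAI_ENDPOINT") or prompt("KDB.AI endpoint: ")
    api_key = env.get("KDBAI_API_KEY") or prompt("KDB.AI API key: ")
    return endpoint, api_key
```

The returned values can then be passed to kdbai.Session(api_key=..., endpoint=...) exactly as in the connection step.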
Verify data
Enter the name of the KDB.AI table you want to export to check that it exists and contains data. For example, if your table is called openai_pdf, run:
table = session.table("openai_pdf")
table.query()
The response displays your table along with its number of rows and columns.
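In a script, you may want this check to fail loudly instead of relying on reading the output. The sketch below assumes only that the query result supports len(), as a pandas DataFrame does; require_rows is a hypothetical helper, not a KDB.AI function.

```python
def require_rows(result, table_name):
    """Return the row count of a query result, or raise if it is empty.

    `result` is assumed to support len(), for example a pandas DataFrame
    or a list of rows.
    """
    n_rows = len(result)
    if n_rows == 0:
        raise ValueError(f"Table {table_name!r} exists but contains no rows")
    return n_rows


# Hypothetical usage with the table from this section:
# n_rows = require_rows(table.query(), "openai_pdf")
```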
To learn more about adding/managing your data in KDB.AI, go to the ingest data and manage tables pages.
Export data
To export data with Vector IO, run:
%run src/export_vdf.py kdbai
The response is:
Export to disk completed. Exported to: vdf_20240205_101022_82f0d/
Time taken to export data: 00:00:20
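Before importing, it can help to look at what the export wrote to disk. The sketch below is an illustration under stated assumptions: it assumes the export directory holds a VDF_META.json metadata file and one or more Parquet files, which is how the Vector IO project describes the VDF layout. The helper name summarize_vdf_export is hypothetical, and the function uses only the standard library.

```python
import json
from pathlib import Path


def summarize_vdf_export(export_dir):
    """List the metadata keys and Parquet files found in a VDF export.

    Assumes a VDF_META.json file at the top level of the directory;
    reports has_meta=False instead of failing if it is absent.
    """
    root = Path(export_dir)
    meta_path = root / "VDF_META.json"
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else None
    # Collect Parquet files at any depth, as paths relative to the export root.
    parquet_files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.parquet")
    )
    return {
        "has_meta": meta is not None,
        "meta_keys": sorted(meta) if isinstance(meta, dict) else [],
        "parquet_files": parquet_files,
    }


# Hypothetical usage with the directory name from the export output:
# print(summarize_vdf_export("vdf_20240205_101022_82f0d"))
```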
Import data
To import table data into KDB.AI using Vector IO, run:
%run src/import_vdf.py -d "vdf_20240205_101022_82f0d" kdbai
The response is:
Table created
Inserted 250 out of 591 rows.
Inserted 500 out of 591 rows.
Inserted 591 out of 591 rows.
Data fully added
Time taken: 37.84 seconds
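The progress lines in this response are consistent with rows being inserted in batches of 250, with a smaller final batch. This is an inference from the output, not a documented setting. The sketch below reproduces only the counting logic; batch_progress is a hypothetical helper and does not call Vector IO or KDB.AI.

```python
def batch_progress(total_rows, batch_size):
    """Yield the cumulative number of rows inserted after each batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    inserted = 0
    while inserted < total_rows:
        # The final batch may be smaller than batch_size.
        inserted = min(inserted + batch_size, total_rows)
        yield inserted


# The batch size of 250 is assumed from the log lines shown in this section.
for done in batch_progress(591, 250):
    print(f"Inserted {done} out of 591 rows.")
```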
To confirm the table exists and data is available, run:
table.query()
Next steps
Now that you have successfully imported your data, you can use other KDB.AI functions, such as query and search.