Hugging Face

This section explains how to integrate KDB.AI with Hugging Face.

Hugging Face is a platform where you can access models, datasets, and applications for Machine Learning (ML) workloads. It lets you create, train, and deploy ML models, or fine-tune pre-trained ones. Notable tools include the Transformers model library, pipelines for performing ML tasks, and collaborative resources.

The KDB.AI Hugging Face integration guide helps you execute a wide variety of tasks with your KDB.AI vector database, including detecting objects, summarizing documents, answering questions, generating text, translating content, and converting text to speech.

Why use Hugging Face for embeddings?

When building production applications that utilize embeddings, using open-source embedding models has the following advantages:

  • Control: reduces dependence on third-party embedding providers.
  • Local embedding: you can create embeddings locally, which is useful for embedding your dataset.
  • Scalability: KDB.AI's vector database can handle large-scale data and search operations efficiently.
  • Flexibility: you can experiment with different embedding models and configurations to optimize performance for your specific use cases.

A common approach is to use a Python framework like sentence-transformers, maintained by Hugging Face, which offers state-of-the-art sentence, text, and image embeddings. Here's a typical workflow:

  • Embed your dataset locally: use a library like FastEmbed (built on top of Hugging Face's transformers library, optimized for speed) to embed your dataset, which might consist of AI tools and associated metadata.
  • Embed queries at inference time: when a user submits a query, use an external service like Hugging Face's Inference API to embed the query. This eliminates the need to deploy your own model, allowing you to leverage a fully optimized external service.

By following this approach, you can build a system that searches through hundreds of AI tools without the need to deploy any infrastructure (and scale to millions!). Additionally, since you embed the dataset locally, you can use Hugging Face's free plan without requiring a credit card or worrying about hitting rate limits, at least until you are ready for production.

Getting started

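Before you integrate KDB.AI with Hugging Face, you need a KDB.AI session (Cloud or Server, see step 1 below) and a Hugging Face access token, which you create in the steps that follow.
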
To complete the KDB.AI integration, you can use the Hugging Face Inference API endpoints to generate high-quality embeddings and store/index them in the vector database.

  • Sign Up/Sign In to Hugging Face and verify your account.
  • Log in and go to Avatar -> Settings.
  • Select Access Tokens, then click New token.
  • Give the token a name and select the Write type.
  • Click Generate token.
  • Copy the generated token and use it in the script below as HF_TOKEN.
  • Install dependencies:
    !pip install kdbai_client fastembed
  • Import packages:
    # vector DB
    import os
    from getpass import getpass
    import kdbai_client as kdbai
    import time
    import numpy as np
    import pandas as pd

1. Connect to KDB.AI

KDB.AI Cloud is for experimenting with smaller generative AI projects with a vector database in our cloud.

To use KDB.AI Cloud, you need two session details: a URL endpoint and an API key, both available in your KDB.AI Cloud portal.

KDBAI_ENDPOINT = (
    os.environ["KDBAI_ENDPOINT"]
    if "KDBAI_ENDPOINT" in os.environ
    else input("KDB.AI endpoint: ")
)
KDBAI_API_KEY = (
    os.environ["KDBAI_API_KEY"]
    if "KDBAI_API_KEY" in os.environ
    else getpass("KDB.AI API key: ")
)
# Insert your Hugging Face token
HF_TOKEN = (
    os.environ["HF_TOKEN"]
    if "HF_TOKEN" in os.environ
    else getpass("Hugging Face token: ")
)
# Define your KDB.AI session
session = kdbai.Session(endpoint=KDBAI_ENDPOINT, api_key=KDBAI_API_KEY)
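
The "environment variable first, prompt second" pattern used for the endpoint, API key, and token can be factored into one small helper. The following is a minimal sketch, not part of the KDB.AI client: the name read_setting and its parameters are hypothetical and exist only for illustration.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def read_setting(
    name: str,
    prompt: str,
    secret: bool = True,
    env: Mapping[str, str] = os.environ,
    ask: Optional[Callable[[str], str]] = None,
) -> str:
    """Return env[name] if it is set and non-empty, otherwise ask the user."""
    value = env.get(name, "").strip()
    if value:
        return value
    if ask is None:
        # getpass hides what is typed; input echoes it
        ask = getpass if secret else input
    return ask(prompt).strip()
```

With this helper, each value becomes a one-liner, for example read_setting("HF_TOKEN", "Hugging Face token: "). Passing env and ask explicitly makes the function easy to exercise without touching the real environment.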

KDB.AI Server is for evaluating large scale generative AI applications on premises or on your own cloud provider.

To use KDB.AI Server, you need to download and run your own container. Follow the instructions in the signup email to get your session up and running, then pass your local endpoint:

session = kdbai.Session(endpoint="http://localhost:8082")

Verify defined tables - you can check your connection using the session.list() function. This returns a list of all the tables you have defined in your vector database so far. If you're just starting out, it should return an empty list:

session.list()

# ensure no table called "ai_tools" exists
try:
    session.table("ai_tools").drop()
except kdbai.KDBAIException:
    pass

2. Create table

To create a table in KDB.AI, use the create_table function, which takes two arguments: name and schema. This schema must meet the following criteria:

  • Must contain a list of columns.
  • All columns must have a pytype specified, except the vectors column.
  • One column of vector embeddings may also have a vectorIndex attribute with the configuration of the index for similarity search. This column is implicitly an array of float32.
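
The criteria above can be checked before calling create_table. The sketch below is one reading of those rules, not a KDB.AI API: check_schema is a hypothetical helper, and treating "one column may have a vectorIndex" as "at most one" is an assumption.

```python
def check_schema(schema: dict) -> list:
    """Return a list of problems; an empty list means the schema passes."""
    columns = schema.get("columns")
    if not isinstance(columns, list) or not columns:
        return ["schema must contain a non-empty list under 'columns'"]

    problems = []
    # Assumption: at most one column carries the vectorIndex attribute
    vector_columns = [c for c in columns if "vectorIndex" in c]
    if len(vector_columns) > 1:
        problems.append("only one column may carry a vectorIndex")

    for col in columns:
        if "name" not in col:
            problems.append(f"column without a name: {col}")
        elif "vectorIndex" not in col and "pytype" not in col:
            # Every non-vector column needs an explicit pytype
            problems.append(f"column '{col['name']}' needs a pytype")
    return problems
```

Calling check_schema(schema) on the schema defined later in this section should return an empty list.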

Define schema - this example creates a table with the following columns:

  • id, name, description, summary, title, and visitors, which hold the metadata for each AI tool.
  • description_embedding, which holds the vector embeddings to use for similarity search later on.

Next, you need to define dimensionality, similarity metric, and index type with the vectorIndex attribute. You can use:

  • dims = 384: The embeddings you generate in the next section are 384-dimensional, so this value matches them. In general, dims must equal the output dimensionality of your embedding model.
  • metric = L2: Stands for L2/Euclidean distance. You can also use IP/Inner Product and CS/Cosine Similarity, depending on the specific context and nature of your data.
  • type = flat: We use a Flat index, but you can go for HNSW and IVFPQ, depending on the data and your performance requirements.
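
To make the three metrics and the Flat index concrete, here is a small NumPy sketch. It is illustrative only and is not how KDB.AI implements its indexes; the function names and the toy two-dimensional vectors are assumptions chosen for readability.

```python
import numpy as np


def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return float(np.linalg.norm(a - b))


def inner_product(a, b):
    """Inner product (IP): larger means more similar."""
    return float(np.dot(a, b))


def cosine_similarity(a, b):
    """Cosine similarity (CS): inner product of the length-normalised vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def flat_search(vectors, query, n):
    """Exhaustive L2 search: compare the query with every stored vector."""
    distances = np.linalg.norm(vectors - query, axis=1)
    nearest = np.argsort(distances)[:n]
    return nearest.tolist(), distances[nearest].tolist()


stored = np.array([[1.0, 0.0], [0.0, 1.0], [4.0, 0.0]], dtype=np.float32)
query = np.array([2.0, 0.0], dtype=np.float32)

print(l2_distance(stored[0], query))        # 1.0
print(inner_product(stored[0], query))      # 2.0
print(cosine_similarity(stored[0], query))  # 1.0
print(flat_search(stored, query, n=2))      # indices and distances of the 2 nearest rows
```

Note that the first stored vector and the query point in the same direction, so their cosine similarity is 1.0 even though their L2 distance is not zero. A flat search is exact because it scans everything, which is why approximate indexes such as HNSW become attractive as the table grows.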

schema = {
    "columns": [
        {"name": "id", "pytype": "str"},
        {"name": "name", "pytype": "str"},
        {"name": "description", "pytype": "str"},
        {"name": "summary", "pytype": "str"},
        {"name": "title", "pytype": "str"},
        {"name": "visitors", "pytype": "int32"},
        {"name": "description_embedding", "vectorIndex": {"dims": 384, "metric": "L2", "type": "flat"}},
    ]
}

Create table:

table = session.create_table("ai_tools", schema) 

3. Add data to table

First, download the AI tools dataset (JSON) from gist_url and load it into a pandas DataFrame, dropping the xata column:

import requests

gist_url = ""
response = requests.get(gist_url)
ai_tools_data = response.json()
df = pd.DataFrame.from_dict(ai_tools_data)
df.drop(columns=["xata"], inplace=True)

Use the FastEmbed library to embed every description in the dataset:

from fastembed import TextEmbedding

embedding_model = TextEmbedding()

descriptions = [tool["description"] for tool in ai_tools_data]
embeddings = list(embedding_model.embed(descriptions))

Insert the data into your KDB.AI table:

# Create a DataFrame with the AI tools data
data = pd.DataFrame(ai_tools_data)[["id", "name", "description", "summary", "title", "visitors"]]
data["description_embedding"] = embeddings

# Bulk insert the data into KDB.AI
table.insert(data)
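
Because the vector column is declared with a fixed dims value and is implicitly float32, it can be worth validating the embeddings before the insert. The sketch below assumes a hypothetical helper, attach_embeddings, that is not part of pandas or the KDB.AI client; it only checks row count and vector shape and casts to float32.

```python
import numpy as np
import pandas as pd


def attach_embeddings(frame, embeddings, dims, column="description_embedding"):
    """Return a copy of frame with a validated float32 embedding column."""
    vectors = [np.asarray(e, dtype=np.float32) for e in embeddings]
    if len(vectors) != len(frame):
        raise ValueError(f"{len(vectors)} embeddings for {len(frame)} rows")
    # Every vector must be one-dimensional with exactly `dims` entries
    wrong = [i for i, v in enumerate(vectors) if v.shape != (dims,)]
    if wrong:
        raise ValueError(f"rows {wrong} do not have {dims} dimensions")
    out = frame.copy()
    out[column] = pd.Series(vectors, index=out.index)
    return out
```

In this guide the call would look like attach_embeddings(data, embeddings, dims=384), with 384 taken from the schema above. A mismatch then fails early with a readable message rather than at insert time.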

4. Search with Hugging Face

Use the Hugging Face Inference API to embed the query so that you can use it to search your index:

# Perform a similarity search using Hugging Face embeddings
import requests

# Make sure your URL looks like this to ensure you get instant results, and not a model loading error
embedding_url = ""

def waitForResourceAvailable(response, timeout_seconds):
    # Wait in 10-second steps while the API answers 204, up to timeout_seconds
    timer = 0
    while response.status_code == 204:
        time.sleep(10)
        timer += 10
        if timer > timeout_seconds:
            break
        if response.status_code == 200:
            break

def generate_query_embedding(text: str) -> list[float]:
    # Send the query text to the Inference API endpoint in embedding_url
    response = requests.post(
        embedding_url,
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        json={"inputs": text},
    )
    waitForResourceAvailable(response, 5)
    if response.status_code != 200:
        raise ValueError(f"Request failed with status code {response.status_code}: {response.text}")
    return response.json()
# Note: the API may occasionally return status code 503 (Service Unavailable),
# meaning the server is not ready to handle the request; retry after a short wait.
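
A response object does not change after it has been received, so handling a temporary 503 means sending the request again. The following is a minimal retry sketch under stated assumptions: call_with_retry and its parameters are hypothetical, the retry count and backoff are arbitrary, and the request itself is passed in as a function so the logic stays independent of any HTTP library.

```python
import time


def call_with_retry(send, retries=3, wait_seconds=1.0, retry_on=(503,), sleep=time.sleep):
    """Call send() until it stops returning a retryable status.

    send() must return a (status_code, payload) tuple.
    """
    status, payload = send()
    attempt = 0
    while status in retry_on and attempt < retries:
        # Exponential backoff: wait_seconds, then 2x, then 4x, ...
        sleep(wait_seconds * (2 ** attempt))
        attempt += 1
        status, payload = send()
    if status != 200:
        raise ValueError(f"Request failed with status code {status}")
    return payload
```

To use it with the Inference API, wrap the POST request in a small function that returns the status code together with the parsed JSON body, and pass that function as send.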

query = "AI tool for creating 3D textures"
query_embedding = generate_query_embedding(query)

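# n=5 (the number of nearest neighbours to return) is an illustrative value
results = table.search(vectors=[query_embedding], n=5)

Once you are finished with your searches, we recommend deleting the KDB.AI table to conserve resources:

table.drop()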

Example #1: Use KDB.AI and Hugging Face for Transfer Learning

In this Image Search on Brain MRI Scans example, we take a model that has been pre-trained for a task (ResNet-50 for ImageNet classification) and use it as a starting point to solve a more specific problem. To create our image embeddings, we used a neural network that has been pre-trained on the brain tumor classification problem.

Example #2: Use an LLM from Hugging Face to execute RAG

This Retrieval Augmented Generation (RAG) with LangChain notebook demonstrates how to use RAG, an advanced prompt engineering technique, with hands-on examples using LangChain, KDB.AI, and various LLMs.

Example #3: Use Hugging Face with KDB.AI to create an AI tool search engine

This notebook walks you through the process of embedding a dataset of AI tools using FastEmbed (a lightweight, fast Python library built for embedding generation), storing the embeddings in a KDB.AI table, and then using Hugging Face's Inference API to embed queries at inference time. This enables efficient and scalable similarity search capabilities.


Whether you're building a semantic search engine, a recommendation system, or any application that relies on finding similar items, the KDB.AI integration with Hugging Face provides a powerful and flexible solution.

Now that you have successfully configured the integration you can achieve the following:

Next steps