Hugging Face
This section explains how to integrate KDB.AI with Hugging Face.
Hugging Face is a platform where you can access models, datasets, and applications for Machine Learning (ML) workloads. Hugging Face allows you to create, train, and deploy ML models, or fine-tune pre-trained ones. Notable tools include the Transformers model library, pipelines for performing ML tasks, and collaborative resources.
The KDB.AI Hugging Face integration guide helps you execute a wide variety of tasks with your KDB.AI vector database, including detecting objects, summarizing documents, answering questions, generating text, translating content, and converting text to speech.
Why use Hugging Face for embeddings?
When building production applications that utilize embeddings, using open-source embedding models has the following advantages:
- Control: reduces dependence on third-party embedding providers.
- Local embedding: you can create embeddings locally, which is useful for embedding your dataset.
- Scalability: KDB.AI's vector database can handle large-scale data and search operations efficiently.
- Flexibility: you can experiment with different embedding models and configurations to optimize performance for your specific use cases.
A common approach is to use a Python framework like sentence-transformers, maintained by Hugging Face, which offers state-of-the-art sentence, text, and image embeddings. Here's a typical workflow:
- Embed your dataset locally: use a library like FastEmbed (a lightweight library optimized for speed that runs embedding models hosted on the Hugging Face Hub) to embed your dataset, which might consist of AI tools and associated metadata.
- Embed queries at inference time: when a user submits a query, use an external service like Hugging Face's Inference API to embed the query. This eliminates the need to deploy your own model, allowing you to leverage a fully optimized external service.
By following this approach, you can build a system that searches through hundreds of AI tools, and can scale to millions, without deploying any infrastructure. Additionally, since you embed the dataset locally, you can use Hugging Face's free plan without requiring a credit card or worrying about hitting rate limits, at least until you are ready for production.
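The two-phase workflow above can be sketched in plain Python. This is a toy illustration only: `toy_embed` is a hypothetical stand-in for a real embedding model (it merely counts letters), the `tools` dictionary is invented sample data, and the in-memory `index` list stands in for the vector database.

```python
import math
from typing import List, Tuple


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: normalised letter counts."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def l2_distance(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Phase 1: embed the dataset once, ahead of time (invented sample data)
tools = {
    "texture-gen": "generate 3D textures from text prompts",
    "doc-digest": "summarize long documents",
    "voice-over": "convert text to speech",
}
index: List[Tuple[str, List[float]]] = [
    (name, toy_embed(description)) for name, description in tools.items()
]


# Phase 2: embed each query when it arrives, then rank the stored vectors
def search(query: str, n: int = 1) -> List[str]:
    query_vector = toy_embed(query)
    ranked = sorted(index, key=lambda item: l2_distance(query_vector, item[1]))
    return [name for name, _ in ranked[:n]]


print(search("summarize long documents"))  # ['doc-digest']
```

In the real integration, phase 1 is handled by a local embedding library, phase 2 by a hosted embedding service, and the ranking by the vector database.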
Getting started
Before you integrate KDB.AI with Hugging Face, you need to have the following:
- Python 3 (versions 3.8 to 3.11), pip, and Git installed
- An active KDB.AI Cloud or Server license
- A valid API key for KDB.AI Cloud
- Familiarity with vector databases and embedding models
- An understanding of how to set up the necessary configuration for interacting with either KDB.AI Cloud or Server
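As a quick sanity check for the first prerequisite, a small script along these lines can confirm that the interpreter falls in the stated range. The function name `is_supported` is a hypothetical helper, not part of any library.

```python
import sys
from typing import Tuple

MIN_VERSION = (3, 8)   # lowest version listed in the prerequisites
MAX_VERSION = (3, 11)  # highest version listed in the prerequisites


def is_supported(version: Tuple[int, int]) -> bool:
    """Return True when (major, minor) lies within the documented range."""
    return MIN_VERSION <= version <= MAX_VERSION


current = (sys.version_info[0], sys.version_info[1])
status = "supported" if is_supported(current) else "outside the documented range"
print(f"Python {current[0]}.{current[1]} is {status}")
```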
Setup
To complete the KDB.AI integration, you can use the Hugging Face Inference API endpoints to generate high-quality embeddings and store/index them in the vector database.
- Sign up or sign in to Hugging Face and verify your account.
- Log in and go to Avatar -> Settings.
- Select Access Tokens -> click New token.
- Give the token a name and select the Write type.
- Click Generate token.
- Copy the generated token and use it in the script below as HF_TOKEN.
- Install dependencies:
!pip install kdbai_client fastembed
- Import packages:
# vector DB
import os
from getpass import getpass
import kdbai_client as kdbai
import time
import numpy as np
import pandas as pd
1. Connect to KDB.AI
KDB.AI Cloud is for experimenting with smaller generative AI projects with a vector database in our cloud.
To use KDB.AI Cloud, you need two session details: a URL endpoint and an API key, both available in your KDB.AI Cloud portal.
KDBAI_ENDPOINT = (
    os.environ["KDBAI_ENDPOINT"]
    if "KDBAI_ENDPOINT" in os.environ
    else input("KDB.AI endpoint: ")
)
KDBAI_API_KEY = (
    os.environ["KDBAI_API_KEY"]
    if "KDBAI_API_KEY" in os.environ
    else getpass("KDB.AI API key: ")
)
# Insert your Hugging Face token
HF_TOKEN = (
    os.environ["HF_TOKEN"]
    if "HF_TOKEN" in os.environ
    else getpass("Hugging Face token: ")
)
# Define your KDB.AI session
session = kdbai.Session(endpoint=KDBAI_ENDPOINT, api_key=KDBAI_API_KEY)
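The three lookups above share one pattern: read an environment variable and fall back to a prompt. One possible way to factor that out is sketched below. `read_setting` and its parameters are hypothetical names introduced here for illustration; the `env` and `ask` arguments exist only so the helper can be exercised without touching the real environment or terminal.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def read_setting(
    name: str,
    prompt: str,
    secret: bool = True,
    env: Optional[Mapping[str, str]] = None,
    ask: Optional[Callable[[str], str]] = None,
) -> str:
    """Return env[name] if it is set, otherwise ask the user for the value."""
    env = os.environ if env is None else env
    if name in env:
        return env[name]
    if ask is None:
        # getpass hides what is typed; input echoes it
        ask = getpass if secret else input
    return ask(prompt)


# Possible usage, mirroring the variables defined above:
# KDBAI_ENDPOINT = read_setting("KDBAI_ENDPOINT", "KDB.AI endpoint: ", secret=False)
# KDBAI_API_KEY = read_setting("KDBAI_API_KEY", "KDB.AI API key: ")
# HF_TOKEN = read_setting("HF_TOKEN", "Hugging Face token: ")
```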
KDB.AI Server is for evaluating large-scale generative AI applications on premises or on your own cloud provider.
To use KDB.AI Server, you need to download and run your own container. Follow the instructions in the signup email to get your session up and running, then pass your local endpoint:
session = kdbai.Session(endpoint="http://localhost:8082")
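Verify defined tables - you can check your connection using the session.list() function. This returns a list of all the tables you have defined in your vector database so far. The snippet below first drops any existing ai_tools table so that the example starts from a clean state: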
# ensure no table called "ai_tools" exists
try:
    session.table("ai_tools").drop()
    time.sleep(5)
except kdbai.KDBAIException:
    pass
session.list()
2. Create table
To create a table in KDB.AI, use the create_table function, which takes two arguments: name and schema.
This schema must meet the following criteria:
- Must contain a list of columns.
- All columns must have a pytype specified, except the vectors column.
- One column of vector embeddings may also have a vectorIndex attribute with the configuration of the index for similarity search. This column is implicitly an array of float32.
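The criteria above can be checked mechanically before calling create_table. The sketch below is not part of the KDB.AI client: `schema_problems` is a hypothetical helper, and it reads "one column" as "at most one column with a vectorIndex", which is an assumption.

```python
from typing import Any, Dict, List


def schema_problems(schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    columns = schema.get("columns")
    if not isinstance(columns, list) or not columns:
        return ["schema must contain a non-empty list under 'columns'"]

    problems: List[str] = []
    vector_columns = [col for col in columns if "vectorIndex" in col]
    if len(vector_columns) > 1:
        # Assumption: only one column may carry the index configuration
        problems.append("more than one column has a vectorIndex")
    for col in columns:
        # Every non-vector column needs an explicit pytype
        if "vectorIndex" not in col and "pytype" not in col:
            problems.append(f"column {col.get('name')!r} needs a pytype")
    return problems


example = {
    "columns": [
        {"name": "id", "pytype": "str"},
        {"name": "vectors", "vectorIndex": {"dims": 384, "metric": "L2", "type": "flat"}},
    ]
}
print(schema_problems(example))  # []
```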
Define schema - the table in this example stores the metadata of each AI tool alongside its embedding:
- id, name, description, summary, title, and visitors hold the tool metadata.
- description_embedding holds the vector embeddings to use for similarity search later on.
Next, you need to define dimensionality, similarity metric, and index type with the vectorIndex attribute. You can use:
- dims = 384: In the next section, you generate embeddings that are 384-dimensional to match this. Set this value to the output size of the embedding model you use.
- metric = L2: Stands for L2/Euclidean distance. You can also use IP/Inner Product and CS/Cosine Similarity, depending on the specific context and nature of your data.
- type = flat: We use a Flat index, but you can go for HNSW and IVFPQ, depending on the data and your performance requirements.
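To make the three metric options concrete, here are their textbook definitions in plain Python. This is only an illustration of the formulas; the values the database reports may be scaled differently (for example, some engines return squared L2 distance), so treat the exact numbers as an assumption.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: larger means more similar, and vector length matters."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the normalised vectors; assumes neither vector is all zeros."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the length
print(l2_distance(a, b))        # 5.0
print(inner_product(a, b))      # 50.0
print(cosine_similarity(a, b))  # 1.0
```

The two vectors point in the same direction, so cosine similarity treats them as identical even though their L2 distance is 5. That difference is one reason the choice of metric depends on the nature of your data.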
schema = {
    "columns": [
        {"name": "id", "pytype": "str"},
        {"name": "name", "pytype": "str"},
        {"name": "description", "pytype": "str"},
        {"name": "summary", "pytype": "str"},
        {"name": "title", "pytype": "str"},
        {"name": "visitors", "pytype": "int32"},
        {"name": "description_embedding", "vectorIndex": {"dims": 384, "metric": "L2", "type": "flat"}},
    ]
}
table = session.create_table("ai_tools", schema)
3. Add data to table
First, download the AI tools dataset and embed each tool description locally with FastEmbed; these are the vector embeddings. Next, add them to a pandas DataFrame with column names/types matching the target table:
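import requests
# Download the AI tools dataset (a JSON list of records)
gist_url = "https://gist.github.com/mrmps/2f62a2287cb2c1ca63a2762fcaac89bc/raw"
response = requests.get(gist_url)
ai_tools_data = response.json()
# Preview the data without the "xata" column
df = pd.DataFrame.from_dict(ai_tools_data)
df.drop(columns=["xata"], inplace=True)
df.head()
from fastembed import TextEmbedding
# Embed every description locally; the model is named explicitly so that it
# matches the BAAI/bge-small-en-v1.5 endpoint used for queries (384 dimensions)
embedding_model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
descriptions = [tool["description"] for tool in ai_tools_data]
embeddings = list(embedding_model.embed(descriptions))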
# Create a DataFrame with the AI tools data
data = pd.DataFrame(ai_tools_data)[["id", "name", "description", "summary", "title", "visitors"]]
data["description_embedding"] = embeddings
# Bulk insert the data into KDB.AI
table.insert(data)
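Because the index is created with a fixed dims value, it can be worth confirming that every embedding has that length before inserting. The helpers below are a hypothetical sketch (neither `declared_dims` nor `mismatched_rows` belongs to the KDB.AI client), and the small schema is a trimmed copy of the one defined earlier.

```python
from typing import Any, Dict, List, Sequence


def declared_dims(schema: Dict[str, Any]) -> int:
    """Return the dims value of the first column that carries a vectorIndex."""
    for col in schema["columns"]:
        if "vectorIndex" in col:
            return col["vectorIndex"]["dims"]
    raise ValueError("schema has no column with a vectorIndex")


def mismatched_rows(embeddings: Sequence[Sequence[float]], dims: int) -> List[int]:
    """Return the positions of embeddings whose length differs from dims."""
    return [i for i, vec in enumerate(embeddings) if len(vec) != dims]


schema = {
    "columns": [
        {"name": "id", "pytype": "str"},
        {"name": "description_embedding",
         "vectorIndex": {"dims": 384, "metric": "L2", "type": "flat"}},
    ]
}
sample = [[0.0] * 384, [0.0] * 8]  # the second vector is deliberately wrong
print(mismatched_rows(sample, declared_dims(schema)))  # [1]
```

An empty result means every row matches the declared dimensionality. The same check works on a list of NumPy arrays, since it relies only on `len`.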
4. Search with Hugging Face
Use the Hugging Face Inference API to embed the query so that you can use it to search your index:
# Perform a similarity search using Hugging Face embeddings
import requests
# Make sure your URL looks like this to ensure you get instant results, and not a model loading error
embedding_url = "https://api-inference.huggingface.co/pipeline/feature-extraction/BAAI/bge-small-en-v1.5"
from typing import List

def generate_query_embedding(text: str, timeout_seconds: int = 60) -> List[float]:
    # The Inference API can answer with status code 503 while the model is still
    # loading, so resend the request every 10 seconds until it succeeds or the
    # timeout expires
    waited = 0
    while True:
        response = requests.post(
            embedding_url,
            headers={"Authorization": f"Bearer {HF_TOKEN}"},
            json={"inputs": text},
        )
        if response.status_code != 503 or waited >= timeout_seconds:
            break
        time.sleep(10)
        waited += 10
    if response.status_code != 200:
        raise ValueError(f"Request failed with status code {response.status_code}: {response.text}")
    return response.json()
query = "AI tool for creating 3D textures"
query_embedding = generate_query_embedding(query)
results = table.search(
    vectors=[query_embedding],
    n=3,
)
print(results)
# Clean up: drop the table once you have finished with the results
table.drop()
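The search call above relies on generate_query_embedding returning a flat list of 384 numbers. As a defensive step, the JSON payload can be validated before it is passed to the search. The sketch below is an assumption-laden illustration: `as_query_vector` is a hypothetical helper, and the idea that a service might wrap a single vector in an outer list, or return a dictionary on error, is an assumption about possible response shapes rather than documented behaviour.

```python
from typing import Any, List


def as_query_vector(payload: Any, dims: int = 384) -> List[float]:
    """Validate an embedding payload and return it as a flat list of floats."""
    # Assumption: a batch of one may arrive as [[...]]; unwrap it to [...]
    if isinstance(payload, list) and len(payload) == 1 and isinstance(payload[0], list):
        payload = payload[0]
    if not isinstance(payload, list) or not all(
        isinstance(x, (int, float)) for x in payload
    ):
        raise ValueError("expected a flat list of numbers")
    if len(payload) != dims:
        raise ValueError(f"expected {dims} values, got {len(payload)}")
    return [float(x) for x in payload]


flat = as_query_vector([0.1] * 384)
nested = as_query_vector([[0.1] * 384])
print(len(flat), len(nested))  # 384 384
```

With a helper like this, a dimensionality mismatch between the query model and the table's index surfaces as a clear error before the search runs.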
Example #1: Use KDB.AI and Hugging Face for Transfer Learning
In this Image Search on Brain MRI Scans example, we take a model that has been pre-trained for a task (ResNet-50 for ImageNet classification) and use it as a starting point to solve a more specific problem. To create the image embeddings, we use a neural network that has been trained on the brain tumor classification problem.
Example #2: Use an LLM from Hugging Face to execute RAG
This Retrieval Augmented Generation (RAG) with LangChain notebook demonstrates how to use RAG, an advanced prompt engineering technique, with hands-on examples using LangChain, KDB.AI, and various LLMs.
Example #3: Use Hugging Face with KDB.AI to create an AI tool search engine
This notebook walks you through the process of embedding a dataset of AI tools using FastEmbed (a lightweight, fast Python library built for embedding generation), storing the embeddings in a KDB.AI table, and then using Hugging Face's Inference API to embed queries at inference time. This enables efficient and scalable similarity search capabilities.
Summary
Whether you're building a semantic search engine, a recommendation system, or any application that relies on finding similar items, the KDB.AI integration with Hugging Face provides a powerful and flexible solution.
Now that you have successfully configured the integration, you can do the following:
- Enjoy a seamless connection between Hugging Face and KDB.AI.
- Develop Machine Learning (ML) applications using Hugging Face models by following the pre-built integration notebooks between KDB.AI and Hugging Face: Image search sample, RAG with LangChain, and AI tool search engine.
Next steps
- Head to our GitHub repository for more examples.
- Use Google Colab to run our notebooks: Image Search on Brain MRI Scans and Retrieval Augmented Generation (RAG) with LangChain.