cuVS CAGRA reference card
This page covers the KDB-X cuVS CAGRA module APIs, including inputs, outputs, and examples for each function.
Overview
The cuVS module exposes a cagra namespace with the following functions:
| Function | Description |
|---|---|
| .cuvs.cagra.init | Create a new CAGRA index |
| cagra.insert | Build or extend the index with vectors |
| cagra.count | Count vectors in the index |
| cagra.search | Nearest-neighbor search |
| cagra.filter | Nearest-neighbor search with a boolean mask |
| cagra.write | Serialize the index to disk |
| cagra.read | Deserialize an index from disk |
| cagra.normalize | L2-normalize a list of vectors |
Load the module
.cuvs:use`kx.cuvs
Note
The cuVS CAGRA module operates on GPU-resident indexes represented as foreign objects returned by .cuvs.cagra.init or cagra.read.
.cuvs.cagra.init
Create a new CAGRA index
.cuvs.cagra.init params
Where params is a dictionary of index parameters (or :: for all defaults), returns a CAGRA index foreign object.
The index is empty after creation – use cagra.insert to build it with vectors.
Note
Indexes created with .cuvs.cagra.init are in-memory and are not persisted automatically. They are lost when the process exits unless explicitly saved using cagra.write.
Index parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| gpuid | long | 0 | GPU device ID to use |
| dims | long | – | Dimensionality of the vectors. Must match the vectors passed to cagra.insert. |
| metric | symbol | `L2 | Distance metric. One of `L2, `CS, `IP. |
| graph_degree | long | 64 | Edges per node in the final graph. Core recall/memory trade-off. |
| intermediate_graph_degree | long | 128 | Graph degree before pruning. Must be ≥ graph_degree. |
| build_algo | symbol | `AUTO_SELECT | Graph build algorithm. One of `AUTO_SELECT, `IVF_PQ, `nn_descent. |
| nn_descent_niter | long | 20 | Number of iterations for nn_descent build. Ignored for other algorithms. |
Warning
compression and graph_build_params are not currently available – passing either will return a 'nyi error.
.cuvs:use`kx.cuvs
/ Create with all defaults
idx:.cuvs.cagra.init[::]
/ Create with explicit params
idx:.cuvs.cagra.init[`gpuid`dims`metric`graph_degree`intermediate_graph_degree`build_algo`nn_descent_niter!(0;128;`L2;64;128;`IVF_PQ;20)]
/ Cosine similarity – vectors are L2-normalized internally at insert and search time
idx:.cuvs.cagra.init[`gpuid`dims`metric!(0;128;`CS)]
cagra.insert
Build or extend the index with vectors
cagra.insert[index;vectors]
Where:
- index is a CAGRA index foreign object returned by .cuvs.cagra.init or cagra.read
- vectors is a list of real (8h) vectors, each of length dims
Returns the count of vectors inserted as a long.
On the first call, builds the full CAGRA graph. On subsequent calls, extends the existing graph incrementally.
Minimum dataset size
CAGRA requires at least intermediate_graph_degree + 1 vectors before the index can be built (default minimum: 129). Inserting fewer vectors on the first call will cause a GPU memory fault and corrupt the CUDA context – all subsequent GPU operations will fail until the process is restarted. Accumulate a sufficient batch before the first insert, or use a lower intermediate_graph_degree.
Note
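The q wrapper itself only checks for a minimum of 2 vectors on the first insert, signalling a 'Cagra index requires at least 2 vectors error. That check is much weaker than the intermediate_graph_degree + 1 requirement described above, so validate the size of the first batch yourself.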
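One way to avoid the GPU fault described above is to check the batch size in q before the first insert reaches the GPU. The following is a minimal sketch, not part of the cuVS module: enoughForBuild and safeInsert are hypothetical helper names, and igd is assumed to be the intermediate_graph_degree the index was created with (128 by default). It relies only on the documented behaviour that .cuvs.cagra.count returns 0 for an unbuilt index.

```q
/ Hypothetical helpers, not part of the cuVS module.
/ True when n vectors are enough to build a graph with intermediate degree igd
enoughForBuild:{[igd;n] n>igd}

/ Guarded insert: igd is the intermediate_graph_degree used at init (default 128).
/ Signals an error in q instead of letting an undersized first batch reach the GPU.
safeInsert:{[idx;igd;vecs]
  isFirst:0=.cuvs.cagra.count idx;
  if[isFirst and not enoughForBuild[igd;count vecs]; '"first batch needs more than igd vectors"];
  .cuvs.cagra.insert[idx;vecs]}
```

With the default parameters, safeInsert[idx;128;data] would reject a first batch of 128 or fewer vectors and pass anything larger straight through to .cuvs.cagra.insert.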
dims:128
idx:.cuvs.cagra.init[`gpuid`dims`metric`build_algo!(0;dims;`L2;`IVF_PQ)]
/ Generate random float vectors
N:10000
vecs:{(x;y)#(x*y)?1e};
data:vecs[N;dims];
/ Build the index
.cuvs.cagra.insert[idx;data]
10000
/ Extend with additional vectors
data2:vecs[1000;dims];
.cuvs.cagra.insert[idx;data2]
1000
cagra.count
Count vectors in the index
cagra.count index
Where index is a CAGRA index foreign object, returns the number of vectors currently in the index as a long. Returns 0 for an empty (unbuilt) index.
idx:.cuvs.cagra.init[`gpuid`dims!(0;128)]
.cuvs.cagra.count idx
0
vecs:{(x;y)#(x*y)?1e};
data:vecs[10000;128];
.cuvs.cagra.insert[idx;data]
10000
.cuvs.cagra.count idx
10000
cagra.search
Nearest-neighbor search
cagra.search[index;query;k;params]
Where:
- index is a CAGRA index foreign object
- query is a single float vector or a list of float vectors of length dims
- k is a long – the number of nearest neighbors to return
- params is a dictionary of search parameters (or :: for all defaults)
Returns a table with columns distances and neighbors (0-based row indices into the index). For a single query vector, returns one table. For a batch of query vectors, returns a list of tables, one per query.
Note
k must not exceed itopk_size (default 64). Passing k > itopk_size returns a 'value error. To retrieve more than 64 neighbors, set itopk_size accordingly in params.
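To avoid the 'value error when k varies between calls, the search parameters can be adjusted before each search. The sketch below is illustrative only: withTopk is a hypothetical helper name, not part of the module, and it assumes the documented default of 64 applies whenever itopk_size is absent from params. It does not check the documented upper limit of 512 for SINGLE_CTA.

```q
/ Hypothetical helper, not part of the cuVS module.
/ Returns a params dictionary whose itopk_size is at least k.
/ p is a params dictionary, or (::) for all defaults.
withTopk:{[p;k]
  p:$[p~(::);()!();p];
  cur:$[`itopk_size in key p;p`itopk_size;64];
  p,enlist[`itopk_size]!enlist k|cur}
```

A call such as .cuvs.cagra.search[idx;q;100;withTopk[::;100]] would then request 100 neighbors with itopk_size raised to 100, while any other keys already present in params are left unchanged.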
Warning
Searching an empty (unbuilt) index returns an 'empty error.
Search parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_queries | long | 0 | Max concurrent queries (batch size). 0 = auto-select. |
| itopk_size | long | 64 | Internal candidate list size. Primary recall/speed trade-off. Max 512 for SINGLE_CTA. |
| max_iterations | long | 0 | Upper limit on search iterations. 0 = auto. |
| algo | symbol | `SINGLE_CTA | Search parallelism strategy. One of `SINGLE_CTA, `MULTI_CTA, `MULTI_KERNEL, `AUTO. |
| team_size | long | 0 | CUDA threads per distance calculation. 0 = auto. Valid values: 4, 8, 16, 32. |
| search_width | long | 1 | Graph nodes explored in parallel per iteration. |
| min_iterations | long | 0 | Minimum search iterations to perform. |
| thread_block_size | long | 0 | CUDA thread block size. 0 = auto. |
| hashmap_mode | symbol | `HASH | Hashmap allocation strategy. One of `HASH, `SMALL, `AUTO_HASH. |
| hashmap_min_bitlen | long | 0 | Minimum bit length for hashmap. |
| hashmap_max_fill_rate | float | 0.5 | Maximum fill rate before hashmap rehash. |
| num_random_samplings | long | 1 | Number of random seed samples for graph traversal start points. |
| rand_xor_mask | long | 0x128394 | XOR mask for random seed generation. |
| persistent | boolean | 0b | Enable persistent kernel mode. |
| persistent_lifetime | float | 2.0 | Lifetime (seconds) for persistent kernel. |
| persistent_device_usage | float | 1.0 | Fraction of GPU to dedicate to persistent kernel. |
Note
SINGLE_CTA is the default and suits small workloads. AUTO is recommended for general use and larger batch sizes.
dims:128;
vecs:{(x;y)#(x*y)?1e};
data:vecs[10000;dims];
idx:.cuvs.cagra.init[`gpuid`dims`metric`build_algo!(0;dims;`L2;`IVF_PQ)]
.cuvs.cagra.insert[idx;data]
10000
/ Single query – returns a flat table
q:dims?1e
.cuvs.cagra.search[idx;q;10;::]
distances neighbors
--------------------
0.1234 42
0.2341 817
...
/ Batch query – returns a list of tables
qs:vecs[5;dims];
.cuvs.cagra.search[idx;qs;10;::]
/ Custom search params – higher recall
params:`itopk_size`algo!(128;`MULTI_CTA);
.cuvs.cagra.search[idx;q;10;params]
/ Full search params
params:`max_queries`itopk_size`max_iterations`algo`team_size`search_width`min_iterations`thread_block_size`hashmap_mode`hashmap_min_bitlen`hashmap_max_fill_rate`num_random_samplings!(0;64;0;`SINGLE_CTA;0;1;0;0;`HASH;0;0.5;1);
.cuvs.cagra.search[idx;q;10;params]
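When tuning itopk_size or algo, it can help to compare the approximate results against exact neighbors on a small sample. The sketch below is plain q and does not call the module: bruteKnn and recall are hypothetical helper names, and the distance used is squared Euclidean distance, which is an assumption made for illustration. The text above does not state whether the module reports squared or square-rooted L2 values, so compare the neighbors columns rather than the distances.

```q
/ Hypothetical helpers for validating approximate results on small data.
/ Exact k nearest neighbours of query vector qv by brute force,
/ using squared Euclidean distance (an assumption, see lead-in).
bruteKnn:{[data;qv;k]
  dv:data-\:qv;
  d:sum each dv*dv;
  i:k#iasc d;
  ([] distances:d i; neighbors:i)}

/ Fraction of the exact neighbour ids that also appear in the approximate ids
recall:{[exact;approx] (count exact inter approx)%count exact}
```

For an `L2 index built from data, recall[bruteKnn[data;q;10][`neighbors];.cuvs.cagra.search[idx;q;10;::][`neighbors]] gives the share of true neighbors recovered for the single query q. Brute force scans every vector, so keep the sample small.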
cagra.filter
Nearest-neighbor search with a boolean mask
cagra.filter[index;query;k;params;ids]
Where:
- index is a CAGRA index foreign object
- query is a single float vector or a list of float vectors of length dims
- k is a long – the number of nearest neighbors to return
- params is a dictionary of search parameters (or :: for all defaults) – same parameters as cagra.search
- ids is a list of IDs to include, of type long
Returns the same structure as cagra.search, restricted to vectors that match the ids. Results with negative distances are filtered out automatically.
Note
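ids must be a subset of the row indices of the vectors already in the index, that is, the same 0-based indices that appear in the neighbors column.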
dims:10;
N:10000;
vecs:{(x;y)#(x*y)?1e};
data:vecs[N;dims];
idx:.cuvs.cagra.init[`gpuid`dims`metric`build_algo!(0;dims;`L2;`IVF_PQ)];
.cuvs.cagra.insert[idx;data]
10000
/ Allow only the first 5000 vectors
ids:til 5000;
q:dims?1e
.cuvs.cagra.filter[idx;q;10;::;ids]
distances neighbors
--------------------
0.1891 312
0.2104 1204
...
/ All results will have neighbors < 5000
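In practice the ids list is often derived from metadata rather than written by hand. The sketch below is plain q and is illustrative only: md is a hypothetical metadata table, assumed to be kept in the same row order as the vectors inserted into the index, so that row i of md describes vector i.

```q
/ Hypothetical metadata table: row i describes vector i in the index
md:([] sym:`a`b`a`c`b; price:10 20 30 40 50f)

/ Row indices whose metadata passes the predicate; where returns a long list,
/ which is the type expected for ids
ids:where (md[`sym] in `a`b) and md[`price]>15
```

Here ids is 1 2 4, which could then be passed as the last argument of .cuvs.cagra.filter. Keeping md aligned with the index across repeated inserts is the caller's responsibility in this sketch.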
cagra.write
Serialize the index to disk
cagra.write[index;path]
Where:
- index is a CAGRA index foreign object
- path is a symbol or string file path (without extension)
Writes two files to disk: <path>.cagra (the binary index) and <path>.kdb (index metadata including build parameters and GPU ID).
Note
Both files are required to reload the index with cagra.read. Do not delete or rename one without the other.
.cuvs.cagra.write[idx;`:myindex]
/ Writes: myindex.cagra and myindex.kdb
.cuvs.cagra.write[idx;"myindex"]
/ Equivalent using a string path
cagra.read
Deserialize an index from disk
cagra.read[path;gpuid]
Where:
- path is a symbol or string file path used when saving (without extension)
- gpuid is a long specifying the GPU device to load onto, or :: to use the GPU ID stored in the index metadata
Returns a CAGRA index foreign object ready for search.
Note
Both <path>.cagra and <path>.kdb must exist. If either is missing, an 'os error is returned.
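The presence of both files can be checked in plain q before attempting a load. This is a minimal sketch under stated assumptions: indexFiles and haveIndex are hypothetical helper names, not part of the module, and path is assumed to be a file-handle symbol such as `:myindex rather than a string.

```q
/ Hypothetical helpers, not part of the cuVS module.
/ The two file handles that cagra.write produces for a given path
indexFiles:{[path] `$string[path],/:(".cagra";".kdb")}

/ True only when both files are present; key returns () for a missing file
haveIndex:{[path] all {not ()~key x} each indexFiles path}
```

A caller could then write $[haveIndex `:myindex; .cuvs.cagra.read[`:myindex;::]; '"index files missing"] to fail with a clearer message than the 'os error.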
Warning
Specifying a gpuid that does not exist on the current system returns an error. If the stored GPU ID is no longer valid (e.g. loading on a different machine), pass an explicit gpuid override.
/ Load onto the GPU stored in metadata
idx:.cuvs.cagra.read[`:myindex;::]
/ Load onto a specific GPU
idx:.cuvs.cagra.read[`:myindex;0]
/ Use immediately after loading
.cuvs.cagra.count idx
10000
.cuvs.cagra.search[idx;dims?1e;10;::]
cagra.normalize
L2-normalize a list of vectors
cagra.normalize x
Where x is a list of numeric vectors (types 5h–9h), returns a list of real (8h) vectors each L2-normalized to unit length.
Note
When metric:`CS (Cosine Similarity) is passed to .cuvs.cagra.init, vectors are normalized automatically at insert and search time. Use cagra.normalize explicitly only when you need to pre-normalize vectors for other purposes or when using metric:`IP.
vecs:(1 0 0e; 0 1 0e; 1 1 0e)
.cuvs.cagra.normalize vecs
1 0 0e
0 1 0e
0.7071068 0.7071068 0e
/ Each result vector has unit length
sqrt sum each {x*x} each .cuvs.cagra.normalize vecs
1 1 1f
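For reference, L2 normalization divides each vector by its Euclidean length. The sketch below expresses that in plain q. It is illustrative only and is not the module's implementation: l2norm and l2normAll are hypothetical names, and the cast to real (8h) simply mirrors the documented return type.

```q
/ Illustrative pure-q equivalent; not the module's implementation.
/ Divide one vector by its Euclidean length
l2norm:{x%sqrt sum x*x}

/ Apply to a list of vectors and cast the result to real (8h)
l2normAll:{"e"$l2norm each x}
```

Note that an all-zero vector has length 0, so this sketch yields nulls for it. How the module treats zero vectors is not stated above.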
Next steps
- Explore the cuVS Examples page for end-to-end usage including index tuning, VRAM planning, and search performance.
- Visit Troubleshooting Errors if you encounter issues.