
cuVS CAGRA reference card

This page covers the KDB-X cuVS CAGRA module APIs, including inputs, outputs, and examples for each function.

Overview

The cuVS module exposes a cagra namespace with the following functions:

Function          Description
.cuvs.cagra.init  Create a new CAGRA index
cagra.insert      Build or extend the index with vectors
cagra.count       Count vectors in the index
cagra.search      Nearest-neighbor search
cagra.filter      Nearest-neighbor search with a boolean mask
cagra.write       Serialize the index to disk
cagra.read        Deserialize an index from disk
cagra.normalize   L2-normalize a list of vectors

Load the module

.cuvs:use`kx.cuvs

Note

The cuVS CAGRA module operates on GPU-resident indexes represented as foreign objects returned by .cuvs.cagra.init or cagra.read.


.cuvs.cagra.init

Create a new CAGRA index

.cuvs.cagra.init params

Where params is a dictionary of index parameters (or :: for all defaults), returns a CAGRA index foreign object.

The index is empty after creation – use cagra.insert to build it with vectors.

Note

Indexes created with .cuvs.cagra.init are in-memory and are not persisted automatically. They are lost when the process exits unless explicitly saved using cagra.write.

Index parameters

Parameter                  Type    Default       Description
gpuid                      long    0             GPU device ID to use
dims                       long                  Dimensionality of the vectors. Must match the vectors passed to cagra.insert.
metric                     symbol  `L2           Distance metric. One of `L2, `CS, `IP.
graph_degree               long    64            Edges per node in the final graph. Core recall/memory trade-off.
intermediate_graph_degree  long    128           Graph degree before pruning. Must be ≥ graph_degree.
build_algo                 symbol  `AUTO_SELECT  Graph build algorithm. One of `AUTO_SELECT, `IVF_PQ, `nn_descent.
nn_descent_niter           long    20            Number of iterations for nn_descent build. Ignored for other algorithms.

Warning

compression and graph_build_params are not currently available – passing either will return a 'nyi error.
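
The constraints in the parameter table can be checked in plain q before the dictionary reaches .cuvs.cagra.init. The following is a minimal sketch, not part of the module: checkParams and defaults are hypothetical names, and the fallback values are copied from the table above on the assumption that omitted keys take those defaults.

```q
/ Hypothetical helper, not part of the cuVS module
/ Fallback values copied from the parameter table above
defaults:`graph_degree`intermediate_graph_degree`metric!(64;128;`L2)

/ Returns the merged dictionary, or signals an error naming the problem
checkParams:{[p]
  p:defaults,p;                                   / caller's values override the fallbacks
  if[not p[`metric] in `L2`CS`IP; '"unsupported metric"];
  if[p[`intermediate_graph_degree]<p`graph_degree; '"intermediate_graph_degree < graph_degree"];
  p}
```

Calling checkParams on a dictionary such as `dims`graph_degree!(128;32) returns it with the fallback values filled in; the result could then be passed on to .cuvs.cagra.init.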

.cuvs:use`kx.cuvs

/ Create with all defaults
idx:.cuvs.cagra.init[::]

/ Create with explicit params
idx:.cuvs.cagra.init[`gpuid`dims`metric`graph_degree`intermediate_graph_degree`build_algo`nn_descent_niter!(0;128;`L2;64;128;`IVF_PQ;20)]

/ Cosine similarity – vectors are L2-normalized internally at insert and search time
idx:.cuvs.cagra.init[`gpuid`dims`metric!(0;128;`CS)]

cagra.insert

Build or extend the index with vectors

cagra.insert[index;vectors]

Where:

  • index is a CAGRA index foreign object returned by .cuvs.cagra.init or cagra.read
  • vectors is a list of real (8h) vectors, each of length dims

Returns the count of vectors inserted as a long.

On the first call, builds the full CAGRA graph. On subsequent calls, extends the existing graph incrementally.

Minimum dataset size

CAGRA requires at least intermediate_graph_degree + 1 vectors before the index can be built (default minimum: 129). Inserting fewer vectors on the first call will cause a GPU memory fault and corrupt the CUDA context – all subsequent GPU operations will fail until the process is restarted. Accumulate a sufficient batch before the first insert, or use a lower intermediate_graph_degree.

Note

The q wrapper itself only enforces a minimum of 2 vectors on the first insert, signalling a 'Cagra index requires at least 2 vectors error. It does not check the larger intermediate_graph_degree + 1 requirement described above.
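
Because an undersized first batch can corrupt the CUDA context, it may be worth checking the batch size in q before the first insert. This is a sketch under stated assumptions: minFirstBatch and guardFirstInsert are hypothetical helpers, and 128 is the documented default for intermediate_graph_degree.

```q
/ Hypothetical helpers, not part of the cuVS module
/ Smallest safe first batch: intermediate_graph_degree + 1 (128 is the documented default)
minFirstBatch:{[p] 1+$[`intermediate_graph_degree in key p; p`intermediate_graph_degree; 128]}

/ Signal a q error instead of letting an undersized batch reach the GPU
guardFirstInsert:{[p;vectors]
  need:minFirstBatch p;
  if[need>count vectors; '"batch too small for first insert"];
  vectors}
```

With the index parameters held in a dictionary params, the first build could then be written as .cuvs.cagra.insert[idx;guardFirstInsert[params;data]].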

dims:128
idx:.cuvs.cagra.init[`gpuid`dims`metric`build_algo!(0;dims;`L2;`IVF_PQ)]

/ Generate random float vectors
N:10000
vecs:{(x;y)#(x*y)?1e};
data:vecs[N;dims];

/ Build the index
.cuvs.cagra.insert[idx;data]
10000

/ Extend with additional vectors
data2:vecs[1000;dims];
.cuvs.cagra.insert[idx;data2]
1000

cagra.count

Count vectors in the index

cagra.count index

Where index is a CAGRA index foreign object, returns the number of vectors currently in the index as a long. Returns 0 for an empty (unbuilt) index.

idx:.cuvs.cagra.init[`gpuid`dims!(0;128)]
.cuvs.cagra.count idx
0

vecs:{(x;y)#(x*y)?1e};
data:vecs[10000;128];
.cuvs.cagra.insert[idx;data]
10000
.cuvs.cagra.count idx
10000

cagra.search

Nearest-neighbor search

cagra.search[index;query;k;params]

Where:

  • index is a CAGRA index foreign object
  • query is a single real vector of length dims, or a list of such vectors
  • k is a long – the number of nearest neighbors to return
  • params is a dictionary of search parameters (or :: for all defaults)

Returns a table with columns distances and neighbors (0-based row indices into the index). For a single query vector, returns one flat table. For a batch, returns a list of tables, one per query.

Note

k must not exceed itopk_size (default 64). Passing k > itopk_size returns a 'value error. To retrieve more than 64 neighbors, set itopk_size accordingly in params.
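
The rule above can be applied automatically when k varies at run time. The sketch below uses a hypothetical helper, withTopk, which raises itopk_size to k only when needed; 64 is the documented default, and any other keys in params pass through unchanged.

```q
/ Hypothetical helper, not part of the cuVS module
/ Returns search params whose itopk_size is at least k; params may be a dictionary or ::
withTopk:{[params;k]
  isDict:99h=type params;
  cur:$[isDict; $[`itopk_size in key params; params`itopk_size; 64]; 64];
  if[k<=cur; :params];                            / already large enough: return unchanged
  new:(enlist`itopk_size)!enlist k;
  $[isDict; params,new; new]}
```

A search for 100 neighbors with otherwise default settings could then be written as .cuvs.cagra.search[idx;q;100;withTopk[::;100]].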

Warning

Searching an empty (unbuilt) index returns a 'empty error.

Search parameters

Parameter                Type     Default      Description
max_queries              long     0            Max concurrent queries (batch size). 0 = auto-select.
itopk_size               long     64           Internal candidate list size. Primary recall/speed trade-off. Max 512 for SINGLE_CTA.
max_iterations           long     0            Upper limit on search iterations. 0 = auto.
algo                     symbol   `SINGLE_CTA  Search parallelism strategy. One of `SINGLE_CTA, `MULTI_CTA, `MULTI_KERNEL, `AUTO.
team_size                long     0            CUDA threads per distance calculation. 0 = auto. Valid values: 4, 8, 16, 32.
search_width             long     1            Graph nodes explored in parallel per iteration.
min_iterations           long     0            Minimum search iterations to perform.
thread_block_size        long     0            CUDA thread block size. 0 = auto.
hashmap_mode             symbol   `HASH        Hashmap allocation strategy. One of `HASH, `SMALL, `AUTO_HASH.
hashmap_min_bitlen       long     0            Minimum bit length for hashmap.
hashmap_max_fill_rate    float    0.5          Maximum fill rate before hashmap rehash.
num_random_samplings     long     1            Number of random seed samples for graph traversal start points.
rand_xor_mask            long     0x128394     XOR mask for random seed generation.
persistent               boolean  0b           Enable persistent kernel mode.
persistent_lifetime      float    2.0          Lifetime (seconds) for persistent kernel.
persistent_device_usage  float    1.0          Fraction of GPU to dedicate to persistent kernel.

Note

SINGLE_CTA is the default and suits small workloads; AUTO is recommended for general use and larger batch sizes.

dims:128;
vecs:{(x;y)#(x*y)?1e};
data:vecs[10000;dims];
idx:.cuvs.cagra.init[`gpuid`dims`metric`build_algo!(0;dims;`L2;`IVF_PQ)]
.cuvs.cagra.insert[idx;data]
10000

/ Single query – returns a flat table
q:dims?1e
.cuvs.cagra.search[idx;q;10;::]
distances  neighbors
--------------------
0.1234     42
0.2341     817
...

/ Batch query – returns a list of tables
qs:vecs[5;dims];
.cuvs.cagra.search[idx;qs;10;::]

/ Custom search params – higher recall
params:`itopk_size`algo!(128;`MULTI_CTA);
.cuvs.cagra.search[idx;q;10;params]

/ Full search params
params:`max_queries`itopk_size`max_iterations`algo`team_size`search_width`min_iterations`thread_block_size`hashmap_mode`hashmap_min_bitlen`hashmap_max_fill_rate`num_random_samplings!(0;64;0;`SINGLE_CTA;0;1;0;0;`HASH;0;0.5;1);
.cuvs.cagra.search[idx;q;10;params]
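
When tuning itopk_size or algo, it can help to measure recall against an exact search on a small sample. The following plain-q sketch is an illustration rather than part of the module: exactKnn and recall are hypothetical helpers, and exactKnn ranks by squared L2 distance, so it is only a fair reference for an index built with metric `L2.

```q
/ Hypothetical helpers, not part of the cuVS module
/ Exact k nearest neighbours of query qv by brute force (squared L2 ranking)
/ data is a list of vectors; returns 0-based row indices, nearest first
exactKnn:{[data;qv;k] k#iasc sum each d*d:data-\:qv}

/ Fraction of the exact neighbours that also appear in the approximate result
recall:{[exact;approx] (count exact inter approx)%count exact}
```

Assuming the single-query result is a table as described above, the approximate neighbors could be taken with exec neighbors from .cuvs.cagra.search[idx;q;10;::] and compared using recall[exactKnn[data;q;10];approx].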

cagra.filter

Nearest-neighbor search with a boolean mask

cagra.filter[index;query;k;params;ids]

Where:

  • index is a CAGRA index foreign object
  • query is a single real vector of length dims, or a list of such vectors
  • k is a long – the number of nearest neighbors to return
  • params is a dictionary of search parameters (or :: for all defaults) – same parameters as cagra.search
  • ids is a list of longs giving the IDs of the vectors to include

Returns the same structure as cagra.search, restricted to vectors that match the ids. Results with negative distances are filtered out automatically.

Note

ids must be a subset of the IDs (0-based row indices) of the vectors in the index.
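
If the filter condition starts life as a boolean mask over the indexed rows, where converts it to the list of long IDs that cagra.filter expects. A minimal sketch, assuming a hypothetical category list with one entry per indexed vector in insert order:

```q
/ Hypothetical metadata: one category per indexed vector, in insert order
category:`a`b`a`c`b`a

/ Boolean mask over the indexed rows: keep only category `a
mask:category=`a

/ where turns the mask into long row indices, here 0 2 5
ids:where mask
```

The resulting ids can be passed as the last argument, for example .cuvs.cagra.filter[idx;q;10;::;ids].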

dims:10;
N:10000;
vecs:{(x;y)#(x*y)?1e};
data:vecs[N;dims];
idx:.cuvs.cagra.init[`gpuid`dims`metric`build_algo!(0;dims;`L2;`IVF_PQ)];
.cuvs.cagra.insert[idx;data]
10000

/ Allow only the first 5000 vectors
ids:til 5000;
q:dims?1e
.cuvs.cagra.filter[idx;q;10;::;ids]
distances  neighbors
--------------------
0.1891     312
0.2104     1204
...

/ All results will have neighbors < 5000

cagra.write

Serialize the index to disk

cagra.write[index;path]

Where:

  • index is a CAGRA index foreign object
  • path is a symbol or string file path (without extension)

Writes two files to disk: <path>.cagra (the binary index) and <path>.kdb (index metadata including build parameters and GPU ID).

Note

Both files are required to reload the index with cagra.read. Do not delete or rename one without the other.

.cuvs.cagra.write[idx;`:myindex]
/ Writes: myindex.cagra and myindex.kdb

.cuvs.cagra.write[idx;"myindex"]
/ Equivalent using a string path

cagra.read

Deserialize an index from disk

cagra.read[path;gpuid]

Where:

  • path is a symbol or string file path used when saving (without extension)
  • gpuid is a long specifying the GPU device to load onto, or :: to use the GPU ID stored in the index metadata

Returns a CAGRA index foreign object ready for search.

Note

Both <path>.cagra and <path>.kdb must exist. If either is missing, a 'os error is returned.
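
To avoid the 'os error, the presence of both files can be checked in plain q before calling cagra.read. This is a sketch with hypothetical helpers, indexFiles and indexExists; it relies only on the built-in key, which returns an empty list for a path that does not exist.

```q
/ Hypothetical helpers, not part of the cuVS module
/ File handles for <path>.cagra and <path>.kdb; path is a symbol or string without extension
indexFiles:{[path]
  base:$[10h=type path; path; string path];       / normalise to a string
  hsym each `$base,/:(".cagra";".kdb")}

/ 1b only if both files exist
indexExists:{[path] all {not ()~key x} each indexFiles path}
```

A guarded load might then read $[indexExists `:myindex; .cuvs.cagra.read[`:myindex;::]; '"index files missing"].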

Warning

Specifying a gpuid that does not exist on the current system returns an error. If the stored GPU ID is no longer valid (e.g. loading on a different machine), pass an explicit gpuid override.

/ Load onto the GPU stored in metadata
idx:.cuvs.cagra.read[`:myindex;::]

/ Load onto a specific GPU
idx:.cuvs.cagra.read[`:myindex;0]

/ Use immediately after loading
.cuvs.cagra.count idx
10000
.cuvs.cagra.search[idx;dims?1e;10;::]

cagra.normalize

L2-normalize a list of vectors

cagra.normalize x

Where x is a list of numeric vectors (types 5h to 9h), returns a list of real (8h) vectors, each L2-normalized to unit length.

Note

When metric:`CS (cosine similarity) is passed to .cuvs.cagra.init, vectors are normalized automatically at insert and search time. Use cagra.normalize explicitly only when you need to pre-normalize vectors for other purposes or when using metric:`IP.
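
For reference, L2 normalization divides each vector by its Euclidean norm. The plain-q sketch below illustrates the arithmetic only; l2norm is a hypothetical name and is not the module's implementation. It casts the result to real (8h) to mirror the documented return type, and a zero vector yields nulls because its norm is 0.

```q
/ Hypothetical plain-q illustration of L2 normalization, not the module's implementation
/ Each vector is divided by its Euclidean norm, then cast to real (8h)
l2norm:{[vs] {"e"$x%sqrt sum x*x} each vs}
```

For example, l2norm enlist 3 4f scales the vector by its norm of 5, giving components 0.6 and 0.8.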

vecs:(1 0 0e; 0 1 0e; 1 1 0e)
.cuvs.cagra.normalize vecs
1          0          0e
0          1          0e
0.7071068  0.7071068  0e

/ Each result vector has unit length
sqrt sum each {x*x} each .cuvs.cagra.normalize vecs
1 1 1e

Next steps