
Inverted File Product Quantization (IVFPQ)

This page describes the Inverted File Product Quantization (IVFPQ) functions in the AI libs.

Inverted File Product Quantization (IVFPQ) combines the benefits of IVF and PQ. PQ is a compression technique that shrinks vector storage without losing essential information. IVF first narrows the search scope by using an inverted file index; PQ then compresses the vectors, allowing efficient search over the reduced representations.
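
The PQ half of this can be illustrated with a short Python sketch (a conceptual toy, not the library's implementation): each vector is split into subvectors, and each subvector is replaced by the index of its nearest centroid in that split's codebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 8-dimensional vectors split into 2 subvectors of 4 dims,
# each subvector quantized to one of 4 centroids (2 bits per split).
nsplits, ncentroids, dim = 2, 4, 8
subdim = dim // nsplits

vecs = rng.random((100, dim)).astype(np.float32)

# Stand-in for trained codebooks: sampled from the data here.
# (A real PQ trainer runs a clustering pass per split.)
centroids = [vecs[rng.choice(100, ncentroids, replace=False),
                  s * subdim:(s + 1) * subdim]
             for s in range(nsplits)]

def pq_encode(v):
    """Encode one vector as one centroid index per split (nearest by L2)."""
    return [int(np.argmin(((centroids[s] - v[s * subdim:(s + 1) * subdim]) ** 2).sum(axis=1)))
            for s in range(nsplits)]

def pq_decode(code):
    """Reconstruct an approximate vector from its PQ code."""
    return np.concatenate([centroids[s][code[s]] for s in range(nsplits)])

code = pq_encode(vecs[0])
approx = pq_decode(code)
print(code)    # one small integer per split -- the whole stored representation
print(np.linalg.norm(vecs[0] - approx))  # reconstruction error of the compressed vector
```

The compressed vector costs one small integer per split instead of `dim` floats, which is the storage saving IVFPQ builds on.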

.ai.pq.flat.search

The .ai.pq.flat.search function performs a flat search over vectors using PQ (Product Quantization) encodings.

By searching compressed representations, it reduces both storage and computation costs while maintaining approximate similarity results. It is well-suited for large-scale datasets where exact search would be too costly.

Parameters

Name Type(s) Description
repPts (real[][])[] The centroid centers from .ai.pq.train
encodings long[][] The PQ encodings from .ai.pq.predict
q real[] | real[][] The query vector(s)
k short | int | long The number of nearest neighbors to return
metric symbol The metric for distance calculation, one of (L2, CS, IP)

Returns

Type Description
(real; long)[] The distances to the nearest points under the metric and their corresponding IDs

Refer also to .ai.pq.predict, .ai.pq.train

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)repPts:.ai.pq.train[2;8;vecs;`L2];
q)encodings:.ai.pq.predict[repPts;vecs;`L2];
q).ai.pq.flat.search[repPts;encodings;10?1e;5;`L2]
0.2806892 0.3283699 0.3645755 0.3939781 0.3949415
404       449       138       760       50

This example trains a PQ model with 2 splits and 8-bit codes (256 centroids per split), encodes the vectors, and then runs a flat PQ search for the 5 nearest neighbors to a random query vector. The returned distances and IDs are approximate results computed from the compressed representations, illustrating how .ai.pq.flat.search provides efficient similarity search on PQ-encoded data.

.ai.pq.ivf.del

The .ai.pq.ivf.del function deletes points from an existing IVFPQ (Inverted File with Product Quantization) index.

Removing outdated or irrelevant vectors helps maintain both accuracy and efficiency of the index over time. It supports incremental index maintenance without the need for a full rebuild.

Parameters

Name Type(s) Description
ivfpq dict The existing IVFPQ index to delete from
ids long | long[] The IDs of vectors to delete

Returns

Type Description
dict The IVFPQ index with points deleted

Refer also to .ai.ivf.train

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)repPts:.ai.ivf.train[4;vecs;`L2];
q)ivf:.ai.ivf.put[();repPts;vecs;`L2];
q)ivfpq:.ai.ivf.topq[ivf;2;8;`L2;500]
clusters   | `s#0 1 2 3
ids        | (0 2 5 8 9 13 14 17 21 29 31 51 52..
centroids  | (0.5303572 0.5351732 0.5627522 0.4..
metric     | `L2
pqCentroids| ((0.2359393 -0.3364881 -0.01835147..
encodings  | ((20 244 120 21 21 140 249 144 228..
q)ivfpq:.ai.pq.ivf.del[ivfpq;0 2]
clusters   | `s#0 1 2 3
ids        | (5 8 9 13 14 17 21 29 31 51 52 53 ..
centroids  | (0.5303572 0.5351732 0.5627522 0.4..
metric     | `L2
pqCentroids| ((0.2359393 -0.3364881 -0.01835147..
encodings  | ((120 21 21 140 249 144 228 32 242..

The example builds an IVFPQ index from training centroids, inserts vectors, and then deletes entries with IDs 0 and 2. After deletion, those IDs are removed from the index while the cluster structure, centroids, and encodings remain intact. This shows how .ai.pq.ivf.del maintains index accuracy by cleaning out outdated vectors without a full rebuild.

.ai.pq.ivf.put

The .ai.pq.ivf.put function inserts new vectors into an IVFPQ index.

Each vector is assigned to a cluster and compressed using PQ encoding during insertion, balancing storage efficiency with search accuracy. It allows the index to grow dynamically while retaining high performance.

Parameters

Name Type(s) Description
ivfpq dict The existing IVFPQ index to insert into
vecs real[][] The vectors to insert

Returns

Type Description
dict The IVFPQ index

Refer also to .ai.ivf.train

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)repPts:.ai.ivf.train[4;vecs;`L2];
q)ivf:.ai.ivf.put[();repPts;vecs;`L2];
q)ivfpq:.ai.ivf.topq[ivf;2;8;`L2;500];
q).ai.pq.ivf.put[ivfpq;vecs]
clusters   | `s#0 1 2 3
ids        | (0 2 5 8 9 13 14 17 21 29 31 51 52..
centroids  | (0.5303572 0.5351732 0.5627522 0.4..
metric     | `L2
pqCentroids| ((0.2359393 -0.3364881 -0.01835147..
encodings  | ((20 244 120 21 21 140 249 144 228..

After training and building an IVF index, this example converts it into an IVFPQ index and then inserts vectors with .ai.pq.ivf.put (here the same training vectors, for illustration). Each vector is assigned to a cluster and stored as a PQ encoding, balancing compression with search accuracy. This demonstrates how data can be dynamically added to an IVFPQ index.

.ai.pq.ivf.upd

The .ai.pq.ivf.upd function updates existing vectors within an IVFPQ index.

Updates ensure that the index reflects the latest representations of data without requiring explicit deletion and reinsertion. It is useful for datasets where vector embeddings evolve over time.

Parameters

Name Type(s) Description
ivfpq dict The existing IVFPQ index to update
ids long | long[] The ids of vectors to update
vecs real[][] The replacement vectors

Returns

Type Description
dict The IVFPQ index with updated points

Refer also to .ai.ivf.train

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)repPts:.ai.ivf.train[4;vecs;`L2];
q)ivf:.ai.ivf.put[();repPts;vecs;`L2];
q)ivfpq:.ai.ivf.topq[ivf;2;8;`L2;500]
clusters   | `s#0 1 2 3
ids        | (0 2 5 8 9 13 14 17 21 29 31 51 52..
centroids  | (0.5303572 0.5351732 0.5627522 0.4..
metric     | `L2
pqCentroids| ((0.2359393 -0.3364881 -0.01835147..
encodings  | ((20 244 120 21 21 140 249 144 228..
q)ivfpq:.ai.pq.ivf.upd[ivfpq;2 5;2#enlist (first vecs)]
clusters   | `s#0 1 2 3
ids        | (0 8 9 13 14 17 21 29 31 51 52 53 ..
centroids  | (0.5303572 0.5351732 0.5627522 0.4..
metric     | `L2
pqCentroids| ((0.2359393 -0.3364881 -0.01835147..
encodings  | ((20 21 21 140 249 144 228 32 242 ..
q).ai.pq.ivf.search[ivfpq;first vecs;3;1]
0.05670813 0.05670813 0.05670813
0          5          2

Here, an IVFPQ index is built and then updated at IDs 2 and 5 with new vector values. The updated encodings are stored in place, and a follow-up search shows the modified vectors ranking highly against the query. This demonstrates how .ai.pq.ivf.upd keeps the index synchronized with evolving data.

.ai.pq.ivf.search

The .ai.pq.ivf.search function conducts a search against an IVFPQ index, combining inverted file indexing with PQ encoding.

It first narrows down the search to relevant clusters and then compares PQ-encoded vectors, achieving fast and memory-efficient retrieval. It is a core operation for large-scale approximate nearest-neighbor search.
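
The two stages can be sketched in Python (an illustrative toy with made-up data layouts, not the library's internals): coarse centroids narrow the candidate set, and per-split lookup tables score PQ codes without decompressing them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy IVFPQ layout: ncl coarse clusters; each stored vector belongs to one
# cluster and is kept only as PQ codes (one centroid index per split).
ncl, nsplits, ncentroids, dim = 4, 2, 16, 8
subdim = dim // nsplits

coarse = rng.random((ncl, dim)).astype(np.float32)                       # IVF centroids
pq_cent = rng.random((nsplits, ncentroids, subdim)).astype(np.float32)   # PQ codebooks

# 50 encoded points scattered over the clusters
ids = np.arange(50)
assign = rng.integers(0, ncl, 50)
codes = rng.integers(0, ncentroids, (50, nsplits))

def search(q, k, nprobe):
    # 1) Narrow to the nprobe nearest coarse clusters.
    probe = np.argsort(((coarse - q) ** 2).sum(axis=1))[:nprobe]
    # 2) Build per-split lookup tables: distance from each of q's subvectors
    #    to every PQ centroid (asymmetric distance computation).
    lut = np.stack([((pq_cent[s] - q[s * subdim:(s + 1) * subdim]) ** 2).sum(axis=1)
                    for s in range(nsplits)])
    # 3) Score each candidate by summing table entries picked by its codes.
    mask = np.isin(assign, probe)
    cand_ids, cand_codes = ids[mask], codes[mask]
    dists = lut[np.arange(nsplits), cand_codes].sum(axis=1)
    order = np.argsort(dists)[:k]
    return dists[order], cand_ids[order]

d, i = search(rng.random(dim).astype(np.float32), 5, 2)
print(d)  # approximate squared distances, ascending
print(i)  # IDs of the matched points
```

Because scoring is table lookups and additions rather than full-vector arithmetic, the per-candidate cost is independent of the original dimensionality, which is what makes this the workhorse of large-scale approximate nearest-neighbor search.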

Parameters

Name Type(s) Description
ivfpq dict The existing IVFPQ index to search
q real[] | real[][] The query vector(s)
k short | int | long The number of nearest neighbors
nprobe short | int | long The number of clusters to search

Returns

Type Description
(real; long)[] The distances to the nearest points under the metric and their corresponding IDs

Refer also to .ai.ivf.topq

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)repPts:.ai.ivf.train[4;vecs;`L2];
q)ivf:.ai.ivf.put[();repPts;vecs;`L2];
q)ivfpq:.ai.ivf.topq[ivf;2;8;`L2;500];
q).ai.pq.ivf.search[ivfpq;10?1e;5;2]
0.2209095 0.3278724 0.3932657 0.4022476 0.4246504
122       914       146       336       874

The example builds an IVFPQ index and performs a search for 5 nearest neighbors, searching across 2 clusters. The output shows the approximate distances and IDs of the top matches. This demonstrates how .ai.pq.ivf.search combines cluster-based narrowing with PQ encodings to achieve fast and memory-efficient retrieval.

.ai.pq.predict

The .ai.pq.predict function determines the closest cluster assignment for each vector across the defined number of splits.

By predicting the cluster placement, it enables efficient indexing and search operations. It is an essential step for both training and insertion in PQ-based workflows.

Parameters

Name Type(s) Description
repPts (real[][])[] The centroid centers from .ai.pq.train
vecs real[][] The vectors for which to calculate the nearest centroid in each of the nsplits splits
metric symbol The metric for distance calculation, one of (L2, CS, IP)

Returns

Type Description
long[][] The nearest-centroid index per split, used to build PQ LUTs (Look Up Tables)

Refer also to .ai.pq.train

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)repPts:.ai.pq.train[2;8;vecs;`L2];
q).ai.pq.predict[repPts;vecs;`L2]
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ..
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ..

Here, a PQ model is trained and then used to predict cluster assignments for every vector in the dataset. The output maps each vector to a cluster in each split, producing compact encodings. This demonstrates how .ai.pq.predict is a critical preprocessing step for PQ-based indexing and search.

.ai.pq.train

The .ai.pq.train function calculates cluster centroids for each split in a PQ model.

Training defines the quantization partitions used to compress vectors into compact codes. The resulting centroids directly impact search quality, making this a critical step in building accurate and efficient PQ-based indexes.
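
Conceptually, training runs a clustering pass (typically k-means) independently on each column split. A minimal Python sketch of that idea, assuming an L2 metric and plain Lloyd's iterations (an illustration, not the library's implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
vecs = rng.random((500, 8)).astype(np.float32)

nsplits, nbits = 2, 4            # 2^4 = 16 centroids per split
ncentroids = 2 ** nbits
subdim = vecs.shape[1] // nsplits

def train_split(sub, iters=10):
    """Plain Lloyd's k-means on one column split (L2 metric)."""
    cent = sub[rng.choice(len(sub), ncentroids, replace=False)].copy()
    for _ in range(iters):
        # Assign each subvector to its nearest centroid.
        d = ((sub[:, None, :] - cent[None, :, :]) ** 2).sum(axis=2)
        lab = d.argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for c in range(ncentroids):
            if (lab == c).any():
                cent[c] = sub[lab == c].mean(axis=0)
    return cent

# One codebook of shape (ncentroids, subdim) per split.
repPts = [train_split(vecs[:, s * subdim:(s + 1) * subdim]) for s in range(nsplits)]
print(len(repPts), repPts[0].shape)
```

Note how nbits fixes the codebook size per split (2^nbits centroids), trading code compactness against quantization error.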

Parameters

Name Type(s) Description
nsplits long | int The number of columnar splits on the matrices
nbits long | int The number of bits used to encode each PQ subvector
vecs real[][] The training vectors
metric symbol The metric for centroid calculation, one of (L2, CS, IP)

Returns

Type Description
(real[][])[] The vectors representing the centroid center per split

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q).ai.pq.train[2;8;vecs;`L2]
0.2759243    0.6436368   0.5493995   0.3526518 ..
0.2410773   0.7541564   0.53973    0.7833696  0..

This example trains a PQ model with 2 splits and nbits=8 (256 centroids per split) on a dataset of vectors. The output lists the learned centroids for each partition. It demonstrates how .ai.pq.train defines the quantization scheme that enables vectors to be compressed into PQ codes for efficient storage and retrieval.