Hierarchical Navigable Small Worlds (HNSW)

This page describes the parameters for the Hierarchical Navigable Small Worlds (HNSW) functions in the AI libs.

Hierarchical Navigable Small Worlds (HNSW) is a graph-based indexing method that speeds up vector searches by organizing data into multiple layers. Higher layers contain long links for broad, fast navigation, while lower layers hold shorter links for more precise results. This structure allows efficient approximate nearest neighbor searches with logarithmic complexity, making HNSW ideal for large-scale databases where quick retrieval is more important than perfect accuracy.
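The coarse-to-fine traversal described above can be sketched in a few lines of Python. This is a schematic illustration, not the library's implementation; the layers, entry point, and distance function are hand-built for the example:

```python
def greedy_step(graph, vectors, query, start):
    """Greedy descent on one layer: hop to the closest neighbor until no
    neighbor is closer to the query than the current node."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))
    cur = start
    while True:
        nxt = min(graph[cur], key=dist, default=cur)
        if dist(nxt) >= dist(cur):
            return cur
        cur = nxt

def hnsw_search(layers, vectors, query, entry):
    """Descend from the sparse top layer to the dense bottom layer; the node
    reached on each layer seeds the search on the next layer down."""
    cur = entry
    for graph in layers:
        cur = greedy_step(graph, vectors, query, cur)
    return cur

# Toy example: four points on a line; the top layer has one long-range link,
# the bottom layer only short links between adjacent points.
vectors = [(0, 0), (1, 0), (2, 0), (3, 0)]
top = {0: [3], 3: [0]}
bottom = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(hnsw_search([top, bottom], vectors, (2.9, 0), 0))  # -> 3
```

The long link on the top layer jumps most of the way to the target in one step; the bottom layer then refines the result locally, which is why search cost grows only logarithmically with the number of points.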

.ai.hnsw.del

The .ai.hnsw.del function deletes one or more points at once from the index.

Deletion allows for maintaining an up-to-date index by removing outdated or irrelevant data without requiring a full rebuild. It ensures the index remains accurate and efficient as data evolves over time.

Parameters

Name Type(s) Description
hnsw (dict[]; dict[]) The HNSW object
embs real[][] The embeddings corresponding to the HNSW object
metric symbol The metric, one of (L2; CS; IP)
dps long | long[] The node ID or IDs of point(s) to be deleted

Returns

Type Description
(dict[]; dict[]) The updated HNSW object

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)hnsw:.ai.hnsw.put[();();vecs;`L2;32;1%log 32;64];
q)hnsw:.ai.hnsw.del[hnsw;vecs;`L2;13];

The example builds an HNSW index from 1,000 randomly generated vectors and then deletes the vector at index 13. This shows how .ai.hnsw.del can remove individual entries from the index without requiring a full rebuild, keeping the index current as data changes.

.ai.hnsw.filterSearch

The .ai.hnsw.filterSearch function searches an existing HNSW index, restricting results to a specified subset of node IDs.

It returns the most relevant matches for a query but only from vectors that meet specified criteria. This targeted approach improves precision by narrowing results to a defined subset of the index.

Parameters

Name Type(s) Description
embs real[][] The vectors corresponding to the HNSW object
hnsw (dict[]; dict[]) The HNSW object
q real[] | float[] The query vector
k short | int | long The number of nearest neighbors to retrieve
metric symbol The metric used for index construction, one of (L2; CS; IP)
efs short | int | long The efSearch hyperparameter
ids long[] | int[] The list of IDs to include

Returns

Type Description
(real[]; long[]) The distances to the nearest points under the given metric and their corresponding node IDs

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)hnsw:.ai.hnsw.put[();();vecs;`L2;32;1%log 32;64];
q).ai.hnsw.filterSearch[vecs;hnsw;first vecs;4;`L2;32;701 977 407 26]

0.2077437 0.210438 0.216104 0.2323341
701       977      407      26

This example queries the HNSW index with the first vector in the dataset but applies a filter restricting results to indices [701, 977, 407, 26]. The output shows the distances and matching indices within that subset. It demonstrates how .ai.hnsw.filterSearch enables targeted searches, returning only relevant neighbors from specified candidates.
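The semantics of a filtered search (restrict candidates first, then rank by distance) can be illustrated with an exact brute-force version in Python. This is a reference sketch of what the graph-based search approximates, not the library's algorithm:

```python
import math

def filter_search(vectors, query, k, ids):
    """Exact filtered k-NN: consider only the allowed ids, rank them by L2
    distance, and return (distance, id) pairs for the k closest."""
    def l2(i):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(vectors[i], query)))
    nearest = sorted(ids, key=l2)[:k]
    return [(l2(i), i) for i in nearest]

vectors = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
res = filter_search(vectors, [0.0, 0.0], 2, [1, 3])
print([i for _, i in res])  # -> [1, 3]
```

Vectors outside the allowed id set are never candidates, so the result is always drawn from the caller-defined subset regardless of how close other points are.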

.ai.hnsw.normalize

The .ai.hnsw.normalize function normalizes vectors to unit length.

Cosine similarity (CS) is mathematically equivalent to the inner product (IP) metric applied to normalized vectors. By normalizing your vectors before inserting them into hnsw or flat, and also normalizing incoming search vectors, you can use the inner product metric instead of cosine similarity and obtain identical results. This significantly reduces search and insert times, as it removes repeated normalization.

Note that this function is optimized for speed, not memory utilization. If you're converting large vector stores, it's best to process them in smaller chunks using the formula {8h$x%sqrt sum x*x}.
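The identity behind this optimization can be checked directly. A minimal Python illustration (not the q library code):

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return inner(a, b) / (math.sqrt(inner(a, a)) * math.sqrt(inner(b, b)))

vecs = [[3.0, 4.0], [1.0, 2.0], [5.0, 1.0]]
query = [2.0, 3.0]
nvecs = [normalize(v) for v in vecs]
nquery = normalize(query)
# Cosine similarity on the raw vectors equals the inner product on the
# normalized ones, so both metrics rank candidates identically.
for v, nv in zip(vecs, nvecs):
    assert abs(cosine(v, query) - inner(nv, nquery)) < 1e-12
```

Because cos(a, b) = (a / |a|) . (b / |b|), normalizing once at insert time moves the division out of every subsequent distance computation.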

Parameters

Name Type(s) Description
embs real[][] The original un-normalized vector embeddings

Returns

Type Description
real[][] Returns normalized vector embeddings

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[100000;10];
q)\ts hnsw1:.ai.hnsw.put[();();vecs;`CS;32;1%log 32;64];
3422 108232848
q)\ts:1000 res1:.ai.hnsw.search[vecs;hnsw1;first vecs;5;`CS;512]
326 2016
q)nvecs:.ai.hnsw.normalize vecs;
q)\ts hnsw2:.ai.hnsw.put[();();nvecs;`IP;32;1%log 32;64]
3172 108083968
q)\ts:1000 res2:.ai.hnsw.search[nvecs;hnsw2;first nvecs;5;`IP;512]
278 2016

q)res1[1]~res2[1]
1b

The example first creates and queries an HNSW index using cosine similarity (CS). It then normalizes the vectors with .ai.hnsw.normalize and rebuilds the index to use inner product (IP) instead. The results from both approaches are identical, but the normalized/IP workflow is faster, illustrating how normalization reduces repeated computation in large-scale searches.

.ai.hnsw.put

The .ai.hnsw.put function inserts one or more new vectors into the HNSW index, leveraging secondary structures for efficient indexing.

The insertion process places each vector in the hierarchical graph, ensuring it can be quickly retrieved during similarity searches. It supports incremental index building as new data becomes available.

Parameters

Name Type(s) Description
embs real[][] The vectors corresponding to the HNSW object
hnsw (dict[]; dict[]) The HNSW object
vecs real[][] Incoming vectors being inserted
metric symbol The metric used for index construction, one of (L2; CS; IP)
M short | int | long The M HNSW hyperparameter
ml real | float The normalization factor for level generation (default: 1%log M)
ef short | int | long The efConstruction HNSW hyperparameter

Returns

Type Description
(dict[]; dict[]) The HNSW object

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)hnsw:.ai.hnsw.put[();();vecs;`L2;32;1%log 32;64];

This example constructs a new HNSW index from a set of 1,000 random vectors using L2 distance. The index is initialized with parameters controlling graph connectivity and performance. It shows how .ai.hnsw.put is the entry point for building a searchable HNSW structure from raw data.
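The ml argument (1%log 32 in the example above) controls how the layers are populated. In the standard HNSW construction, each inserted point's top layer is drawn as floor(-ln(u) * ml) for uniform u; assuming the library follows that scheme, the effect of ml = 1/ln(M) can be sketched in Python:

```python
import math
import random

def sample_level(ml, rng):
    """Draw a point's top layer as floor(-ln(u) * ml), u uniform in (0, 1]."""
    return int(-math.log(1.0 - rng.random()) * ml)

M = 32
ml = 1.0 / math.log(M)      # mirrors the 1%log 32 argument in the q example
rng = random.Random(0)
levels = [sample_level(ml, rng) for _ in range(100_000)]
# With ml = 1/ln(M), a fraction 1 - 1/M (about 97% for M = 32) of points
# stay on the base layer, and each higher layer thins out by a factor of M.
print(levels.count(0) / len(levels))  # close to 1 - 1/32 = 0.96875
```

Larger ml values push more points into upper layers (faster navigation, more memory); smaller values flatten the hierarchy.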

.ai.hnsw.search

The .ai.hnsw.search function executes a standard search against an existing HNSW index.

It retrieves the most relevant nearest neighbors for a given query vector, based on the underlying graph-based search algorithm. Designed for high performance, it enables scalable and low-latency vector similarity search.
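The efSearch hyperparameter bounds the candidate list kept during the base-layer search: larger values explore more of the graph, trading latency for recall. A schematic Python sketch of that ef-bounded best-first search (not the library's implementation):

```python
import heapq

def ef_search(graph, vectors, query, entry, ef):
    """Best-first search on the base layer keeping at most ef candidates.
    Returns (squared distance, id) pairs, nearest first."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))
    visited = {entry}
    frontier = [(dist(entry), entry)]        # min-heap of nodes to expand
    best = [(-dist(entry), entry)]           # max-heap of the ef best so far
    while frontier:
        d, node = heapq.heappop(frontier)
        if d > -best[0][0]:                  # frontier worse than worst kept
            break
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                if len(best) < ef or dist(nb) < -best[0][0]:
                    heapq.heappush(frontier, (dist(nb), nb))
                    heapq.heappush(best, (-dist(nb), nb))
                    if len(best) > ef:
                        heapq.heappop(best)  # drop the worst candidate
    return sorted((-d, i) for d, i in best)

# Chain graph 0-1-2-3-4 over points on a line; the query sits at the far end.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
vectors = [(i, 0) for i in range(5)]
print(ef_search(graph, vectors, (4, 0), 0, 2))  # -> [(0, 4), (1, 3)]
```

With ef too small the search can stop at a local optimum; raising ef widens the beam and recovers neighbors that a pure greedy walk would miss.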

Parameters

Name Type(s) Description
embs real[][] The vectors corresponding to the HNSW object
hnsw (dict[]; dict[]) The HNSW object
q real[] | float[] The query vector
k short | int | long The number of nearest neighbors to retrieve
metric symbol The metric used for index construction, one of (L2; CS; IP)
efs short | int | long The efSearch hyperparameter

Returns

Type Description
(real[]; long[]) The distances to the nearest points under the given metric and their corresponding node IDs

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)hnsw:.ai.hnsw.put[();();vecs;`L2;32;1%log 32;64];
q).ai.hnsw.search[vecs;hnsw;first vecs;5;`L2;32]

0 0.2077437 0.210438 0.216104 0.2323341
0 701       977      407      26

After creating an HNSW index, this example queries it with the first vector, requesting the 5 nearest neighbors. The results show exact self-match at index 0 followed by four close neighbors with their distances. This demonstrates the standard similarity search process in HNSW, optimized for high performance and scalability.

.ai.hnsw.upd

The .ai.hnsw.upd function updates one or more points at once with new vectors.

It allows efficient modification of stored data points without needing to delete and reinsert them. Updating nodes ensures that the index reflects the latest vector representations, which is particularly valuable for dynamic datasets.

Parameters

Name Type(s) Description
hnsw (dict[]; dict[]) The HNSW object
embs real[][] The embeddings corresponding to the HNSW object
metric symbol The distance metric, one of (L2; CS; IP)
dps long | long[] The node ID or IDs of point(s) to be updated
rvs real[] | real[][] The replacement vector(s) for the point(s) being updated
M short | int | long The M HNSW hyperparameter
ef short | int | long The efConstruction HNSW hyperparameter

Returns

Type Description
(dict[]; dict[]) The updated HNSW object

Example

q).ai:use`kx.ai
q)vecs:{(x;y)#(x*y)?1e}[1000;10];
q)hnsw:.ai.hnsw.put[();();vecs;`L2;32;1%log 32;64];
q)hnsw:.ai.hnsw.upd[hnsw;vecs;`L2;13;10?1e;32;64];

Here an HNSW index is created and then updated at position 13 with a newly generated vector. The update avoids the overhead of deletion and reinsertion, allowing the index to evolve dynamically. This demonstrates how .ai.hnsw.upd is useful for maintaining accuracy in datasets where vectors need to be refreshed.