Package org.apache.lucene.util.hnsw
Class HnswGraph
- java.lang.Object
-
- org.apache.lucene.index.KnnGraphValues
-
- org.apache.lucene.util.hnsw.HnswGraph
-
public final class HnswGraph extends KnnGraphValues
Navigable Small-world graph. Provides efficient approximate nearest neighbor search for high dimensional vectors. See Approximate nearest neighbor algorithm based on navigable small world graphs [2014] and this paper [2018] for details.The nomenclature is a bit different here from what's used in those papers:
Hyperparameters
numSeed
is the equivalent ofm
in the 2012 paper; it controls the number of random entry points to sample.beamWidth
inHnswGraphBuilder
has the same meaning asefConst
in the 2016 paper. It is the number of nearest neighbor candidates to track while searching the graph for each newly inserted node.maxConn
has the same meaning asM
in the later paper; it controls how many of theefConst
neighbors are connected to the new node
Note: The graph may be searched by multiple threads concurrently, but updates are not thread-safe. Also note: there is no notion of deletions. Document searching built on top of this must do its own deletion-filtering.
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.index.KnnGraphValues
EMPTY
-
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description NeighborArray
getNeighbors(int node)
Returns theNeighborQueue
connected to the given node.int
nextNeighbor()
Iterates over the neighbor list.static NeighborQueue
search(float[] query, int topK, int numSeed, RandomAccessVectorValues vectors, VectorSimilarityFunction similarityFunction, KnnGraphValues graphValues, Bits acceptOrds, SplittableRandom random)
Searches for the nearest neighbors of a query vector.void
seek(int targetNode)
Move the pointer to exactlytarget
, the id of a node in the graph.int
size()
Returns the number of nodes in the graph
-
-
-
Method Detail
-
search
public static NeighborQueue search(float[] query, int topK, int numSeed, RandomAccessVectorValues vectors, VectorSimilarityFunction similarityFunction, KnnGraphValues graphValues, Bits acceptOrds, SplittableRandom random) throws IOException
Searches for the nearest neighbors of a query vector.- Parameters:
query
- search query vectortopK
- the number of nodes to be returnednumSeed
- the size of the queue maintained while searching, and controls the number of random entry points to samplevectors
- vector valuesgraphValues
- the graph values. May represent the entire graph, or a level in a hierarchical graph.acceptOrds
-Bits
that represents the allowed document ordinals to match, ornull
if they are all allowed to match.random
- a source of randomness, used for generating entry points to the graph- Returns:
- a priority queue holding the closest neighbors found
- Throws:
IOException
-
getNeighbors
public NeighborArray getNeighbors(int node)
Returns theNeighborQueue
connected to the given node.- Parameters:
node
- the node whose neighbors are returned
-
size
public int size()
Description copied from class:KnnGraphValues
Returns the number of nodes in the graph- Specified by:
size
in classKnnGraphValues
-
seek
public void seek(int targetNode)
Description copied from class:KnnGraphValues
Move the pointer to exactlytarget
, the id of a node in the graph. After this method returns, callKnnGraphValues.nextNeighbor()
to return successive (ordered) connected node ordinals.- Specified by:
seek
in classKnnGraphValues
- Parameters:
targetNode
- must be a valid node in the graph, ie. ≥ 0 and <VectorValues.size()
.
-
nextNeighbor
public int nextNeighbor()
Description copied from class:KnnGraphValues
Iterates over the neighbor list. It is illegal to call this method after it returns NO_MORE_DOCS without callingKnnGraphValues.seek(int)
, which resets the iterator.- Specified by:
nextNeighbor
in classKnnGraphValues
- Returns:
- a node ordinal in the graph, or NO_MORE_DOCS if the iteration is complete.
-
-