Package org.apache.lucene.util.hnsw
Class MergingHnswGraphBuilder
java.lang.Object
org.apache.lucene.util.hnsw.HnswGraphBuilder
org.apache.lucene.util.hnsw.MergingHnswGraphBuilder
- All Implemented Interfaces:
HnswBuilder
A graph builder that is used during segments' merging.
This builder uses a smart algorithm to merge multiple graphs into a single graph. The algorithm is based on the idea that if we know where we want to insert a node, we have a good idea of where we want to insert its neighbors.
The algorithm is based on the following steps:
- Get all graphs that don't have deletions and sort them by size desc.
- Copy the largest graph to the new graph (gL).
- For each remaining small graph (gS):
- Find the nodes that best cover gS: join set `j`. These nodes will be inserted into gL as usual: by searching gL to find the best candidates `w` to which connect the nodes.
- For each remaining node in gS:
- We provide eps to search in gL. We form `eps` by the union of the node's neighbors in gS and the node's neighbors' neighbors in gL. We also limit beamWidth (efConstruction to M*3)
We expect the size of join set `j` to be small, around 1/5 to 1/2 of the size of gS. For the rest of the nodes in gS, we expect savings by performing lighter searches in gL.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.hnsw.HnswGraphBuilder
HnswGraphBuilder.GraphBuilderKnnCollector
-
Field Summary
Fields inherited from class org.apache.lucene.util.hnsw.HnswGraphBuilder
DEFAULT_BEAM_WIDTH, DEFAULT_MAX_CONN, frozen, hnsw, HNSW_COMPONENT, hnswLock, infoStream, randSeed, scorerSupplier
-
Method Summary
Modifier and TypeMethodDescriptionbuild
(int maxOrd) Adds all nodes to the graph up to the providedmaxOrd
.static MergingHnswGraphBuilder
fromGraphs
(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, HnswGraph[] graphs, int[][] ordMaps, int totalNumberOfVectors, BitSet initializedNodes) Create a new HnswGraphBuilder that is initialized with the provided HnswGraph.Methods inherited from class org.apache.lucene.util.hnsw.HnswGraphBuilder
addGraphNode, addGraphNode, addGraphNodeWithEps, addVectors, create, create, getCompletedGraph, getGraph, setInfoStream
-
Method Details
-
fromGraphs
public static MergingHnswGraphBuilder fromGraphs(RandomVectorScorerSupplier scorerSupplier, int M, int beamWidth, long seed, HnswGraph[] graphs, int[][] ordMaps, int totalNumberOfVectors, BitSet initializedNodes) throws IOException Create a new HnswGraphBuilder that is initialized with the provided HnswGraph.- Parameters:
scorerSupplier
- the scorer to use for vectorsbeamWidth
- the number of nodes to explore in the searchseed
- the seed for the random number generatorgraphs
- the graphs to mergeordMaps
- the ordinal maps for the graphstotalNumberOfVectors
- the total number of vectors in the new graph, this should include all vectors expected to be added to the graph in the futureinitializedNodes
- the nodes will be initialized through the merging- Returns:
- a new HnswGraphBuilder that is initialized with the provided HnswGraph
- Throws:
IOException
- when reading the graph fails
-
build
Description copied from interface:HnswBuilder
Adds all nodes to the graph up to the providedmaxOrd
.- Specified by:
build
in interfaceHnswBuilder
- Overrides:
build
in classHnswGraphBuilder
- Parameters:
maxOrd
- The maximum ordinal (excluded) of the nodes to be added.- Throws:
IOException
-