Package org.apache.lucene.index
Class LogMergePolicy
java.lang.Object
org.apache.lucene.index.MergePolicy
org.apache.lucene.index.LogMergePolicy
- Direct Known Subclasses:
LogByteSizeMergePolicy, LogDocMergePolicy
This class implements a MergePolicy that tries to merge segments into levels of exponentially increasing size, where each level has fewer segments than the value of the merge factor. Whenever extra segments (beyond the merge factor upper bound) are encountered, all segments within the level are merged. You can get or set the merge factor using getMergeFactor() and setMergeFactor(int), respectively.
This class is abstract and requires a subclass to define the MergePolicy.size(org.apache.lucene.index.SegmentCommitInfo, org.apache.lucene.index.MergePolicy.MergeContext) method, which specifies how a segment's size is determined. LogDocMergePolicy is one subclass that measures size by document count in the segment. LogByteSizeMergePolicy is another subclass that measures size as the total byte size of the file(s) for the segment.
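The level structure described above can be illustrated with a toy simulation. This is a minimal sketch and not Lucene's implementation: it assumes every flush produces a one-document segment, measures size by document count, and treats "same level" as "same size". The class and method names (LogMergeSketch, simulate, tailIsFullLevel) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LogMergeSketch {
    /**
     * Simulates flushing one-document segments. Whenever mergeFactor segments of
     * equal size sit at the tail of the list (a full "level"), they are merged
     * into a single segment, which may in turn complete the next level.
     */
    static List<Long> simulate(int flushes, int mergeFactor) {
        List<Long> segments = new ArrayList<>();
        for (int i = 0; i < flushes; i++) {
            segments.add(1L);
            while (tailIsFullLevel(segments, mergeFactor)) {
                int from = segments.size() - mergeFactor;
                long merged = 0;
                for (int j = 0; j < mergeFactor; j++) {
                    merged += segments.remove(from); // remove by index, sum the sizes
                }
                segments.add(merged);
            }
        }
        return segments;
    }

    /** True if the last mergeFactor segments all have the same size. */
    static boolean tailIsFullLevel(List<Long> segments, int mergeFactor) {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        long size = segments.get(n - 1);
        for (int j = n - mergeFactor; j < n; j++) {
            if (segments.get(j) != size) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(simulate(13, 3)); // prints [9, 3, 1]
    }
}
```

With a merge factor of 3, thirteen flushes leave segments of sizes 9, 3 and 1: each level is three times larger than the one below it, and no level holds as many as three segments.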
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.MergePolicy
MergePolicy.MergeAbortedException, MergePolicy.MergeContext, MergePolicy.MergeException, MergePolicy.MergeSpecification, MergePolicy.OneMerge, MergePolicy.OneMergeProgress
-
Field Summary
Modifier and Type | Field | Description
protected boolean | calibrateSizeByDeletes | If true, we pro-rate a segment's size by the percentage of non-deleted documents.
static final int | DEFAULT_MAX_MERGE_DOCS | Default maximum segment size.
static final int | DEFAULT_MERGE_FACTOR | Default merge factor, which is how many segments are merged at a time.
static final double | DEFAULT_NO_CFS_RATIO | Default noCFSRatio.
static final double | LEVEL_LOG_SPAN | Defines the allowed range of log(size) for each level.
protected int | maxMergeDocs | If a segment has more than this many documents then it will never be merged.
protected long | maxMergeSize | If the size of a segment exceeds this value then it will never be merged.
protected long | maxMergeSizeForForcedMerge | If the size of a segment exceeds this value then it will never be merged during IndexWriter.forceMerge(int).
protected int | mergeFactor | How many segments to merge at a time.
protected long | minMergeSize | Any segments whose size is smaller than this value will be rounded up to this value.
Fields inherited from class org.apache.lucene.index.MergePolicy
DEFAULT_MAX_CFS_SEGMENT_SIZE, maxCFSSegmentSize, noCFSRatio
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
MergePolicy.MergeSpecification | findForcedDeletesMerges(SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext) | Finds merges necessary to force-merge all deletes from the index.
MergePolicy.MergeSpecification | findForcedMerges(SegmentInfos infos, int maxNumSegments, Map<SegmentCommitInfo, Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) | Returns the merges necessary to merge the index down to a specified number of segments.
MergePolicy.MergeSpecification | findMerges(MergeTrigger mergeTrigger, SegmentInfos infos, MergePolicy.MergeContext mergeContext) | Checks if any merges are now necessary and returns a MergePolicy.MergeSpecification if so.
boolean | getCalibrateSizeByDeletes() | Returns true if the segment size should be calibrated by the number of deletes when choosing segments for merge.
int | getMaxMergeDocs() | Returns the largest segment (measured by document count) that may be merged with other segments.
int | getMergeFactor() | Returns the number of segments that are merged at once and also controls the total number of segments allowed to accumulate in the index.
protected boolean | isMerged(SegmentInfos infos, int maxNumSegments, Map<SegmentCommitInfo, Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) | Returns true if the number of segments eligible for merging is less than or equal to the specified maxNumSegments.
void | setCalibrateSizeByDeletes(boolean calibrateSizeByDeletes) | Sets whether the segment size should be calibrated by the number of deletes when choosing segments for merge.
void | setMaxMergeDocs(int maxMergeDocs) | Determines the largest segment (measured by document count) that may be merged with other segments.
void | setMergeFactor(int mergeFactor) | Determines how often segment indices are merged by addDocument().
protected long | sizeBytes(SegmentCommitInfo info, MergePolicy.MergeContext mergeContext) | Return the byte size of the provided SegmentCommitInfo, pro-rated by percentage of non-deleted documents if setCalibrateSizeByDeletes(boolean) is set.
protected long | sizeDocs(SegmentCommitInfo info, MergePolicy.MergeContext mergeContext) | Return the number of documents in the provided SegmentCommitInfo, pro-rated by percentage of non-deleted documents if setCalibrateSizeByDeletes(boolean) is set.
String | toString() |
Methods inherited from class org.apache.lucene.index.MergePolicy
assertDelCount, findFullFlushMerges, getMaxCFSSegmentSizeMB, getNoCFSRatio, isMerged, keepFullyDeletedSegment, message, numDeletesToMerge, segString, setMaxCFSSegmentSizeMB, setNoCFSRatio, size, useCompoundFile, verbose
-
Field Details
-
LEVEL_LOG_SPAN
public static final double LEVEL_LOG_SPAN
Defines the allowed range of log(size) for each level. A level is computed by taking the max segment log size, minus LEVEL_LOG_SPAN, and finding all segments falling within that range.
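The level computation described for LEVEL_LOG_SPAN can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the span value 0.75 and the choice of the merge factor as the logarithm base are assumptions not stated on this page, and the names LevelSpanSketch and topLevel are hypothetical. Sizes are assumed to be at least 1.

```java
import java.util.ArrayList;
import java.util.List;

final class LevelSpanSketch {
    // Assumed value; this page does not state the constant.
    static final double LEVEL_LOG_SPAN = 0.75;

    /**
     * Returns the segment sizes that fall into the top level: those whose
     * log(size) lies within LEVEL_LOG_SPAN of the largest log(size).
     * The logarithm base is the merge factor (an assumption of this sketch).
     */
    static List<Long> topLevel(long[] sizes, int mergeFactor) {
        double norm = Math.log(mergeFactor);
        double maxLevel = Double.NEGATIVE_INFINITY;
        for (long size : sizes) {
            maxLevel = Math.max(maxLevel, Math.log(size) / norm);
        }
        double levelBottom = maxLevel - LEVEL_LOG_SPAN;
        List<Long> level = new ArrayList<>();
        for (long size : sizes) {
            if (Math.log(size) / norm >= levelBottom) {
                level.add(size);
            }
        }
        return level;
    }

    public static void main(String[] args) {
        long[] sizes = {1000, 900, 400, 100, 10};
        System.out.println(topLevel(sizes, 10)); // prints [1000, 900, 400]
    }
}
```

Here the largest segment has a log size of about 3, so the level covers log sizes down to about 2.25 (a size of roughly 178). The segment of size 100 falls below that bound and belongs to a lower level.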
-
DEFAULT_MERGE_FACTOR
public static final int DEFAULT_MERGE_FACTOR
Default merge factor, which is how many segments are merged at a time.
-
DEFAULT_MAX_MERGE_DOCS
public static final int DEFAULT_MAX_MERGE_DOCS
Default maximum segment size. A segment of this size or larger will never be merged.
- See Also:
setMaxMergeDocs(int)
-
DEFAULT_NO_CFS_RATIO
public static final double DEFAULT_NO_CFS_RATIO
Default noCFSRatio. If a merge's size is >= 10% of the index, then we disable compound file for it.
-
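The DEFAULT_NO_CFS_RATIO rule can be expressed as a one-line check. This is a sketch of the sentence in the field description only: the value 0.1 is taken from the "10%" stated there, the handling of the exact boundary follows that sentence and is an assumption about the real behavior, and the names NoCfsRatioSketch and useCompoundFile (as written here) are hypothetical stand-ins.

```java
final class NoCfsRatioSketch {
    // 10% of the index, as stated in the field description.
    static final double DEFAULT_NO_CFS_RATIO = 0.1;

    /**
     * Returns true if a merged segment of mergedSize should use the compound
     * file format, i.e. if it is smaller than noCFSRatio of the whole index.
     */
    static boolean useCompoundFile(long mergedSize, long indexSize, double noCFSRatio) {
        return mergedSize < noCFSRatio * indexSize;
    }

    public static void main(String[] args) {
        System.out.println(useCompoundFile(5, 100, DEFAULT_NO_CFS_RATIO));  // prints true
        System.out.println(useCompoundFile(50, 100, DEFAULT_NO_CFS_RATIO)); // prints false
    }
}
```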
mergeFactor
protected int mergeFactor
How many segments to merge at a time.
-
minMergeSize
protected long minMergeSize
Any segments whose size is smaller than this value will be rounded up to this value. This ensures that tiny segments are aggressively merged.
-
maxMergeSize
protected long maxMergeSize
If the size of a segment exceeds this value then it will never be merged.
-
maxMergeSizeForForcedMerge
protected long maxMergeSizeForForcedMerge
If the size of a segment exceeds this value then it will never be merged during IndexWriter.forceMerge(int).
-
maxMergeDocs
protected int maxMergeDocs
If a segment has more than this many documents then it will never be merged.
-
calibrateSizeByDeletes
protected boolean calibrateSizeByDeletes
If true, we pro-rate a segment's size by the percentage of non-deleted documents.
-
-
Constructor Details
-
LogMergePolicy
public LogMergePolicy()
Sole constructor. (For invocation by subclass constructors, typically implicit.)
-
-
Method Details
-
getMergeFactor
public int getMergeFactor()
Returns the number of segments that are merged at once and also controls the total number of segments allowed to accumulate in the index.
-
setMergeFactor
public void setMergeFactor(int mergeFactor)
Determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.
-
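The trade-off behind setMergeFactor(int) can be made concrete with a back-of-the-envelope model. This sketch is not taken from Lucene: it assumes every flush writes a one-document segment and that a merge happens as soon as mergeFactor equally sized segments exist, so the index behaves like a counter in base mergeFactor. The names MergeFactorTradeoff, segmentsLeft and docsRewritten are hypothetical, and mergeFactor must be at least 2.

```java
final class MergeFactorTradeoff {
    /**
     * Number of segments left after the given number of flushes: the digit sum
     * of the flush count written in base mergeFactor.
     */
    static int segmentsLeft(int flushes, int mergeFactor) {
        int count = 0;
        for (int n = flushes; n > 0; n /= mergeFactor) {
            count += n % mergeFactor;
        }
        return count;
    }

    /**
     * Total number of documents rewritten by merges: every completed segment of
     * size mergeFactor^k cost mergeFactor^k document copies to produce.
     */
    static long docsRewritten(int flushes, int mergeFactor) {
        long total = 0;
        for (long levelSize = mergeFactor; levelSize <= flushes; levelSize *= mergeFactor) {
            total += (flushes / levelSize) * levelSize;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 10}) {
            System.out.println("mergeFactor=" + mergeFactor
                + " segments=" + segmentsLeft(99, mergeFactor)
                + " rewritten=" + docsRewritten(99, mergeFactor));
        }
    }
}
```

Under this model, 99 flushes with a merge factor of 2 leave 4 segments but rewrite 546 documents, while a merge factor of 10 leaves 18 segments and rewrites only 90. Fewer segments favor search speed; less rewriting favors indexing throughput.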
setCalibrateSizeByDeletes
public void setCalibrateSizeByDeletes(boolean calibrateSizeByDeletes)
Sets whether the segment size should be calibrated by the number of deletes when choosing segments for merge.
-
getCalibrateSizeByDeletes
public boolean getCalibrateSizeByDeletes()
Returns true if the segment size should be calibrated by the number of deletes when choosing segments for merge.
-
sizeDocs
protected long sizeDocs(SegmentCommitInfo info, MergePolicy.MergeContext mergeContext) throws IOException
Return the number of documents in the provided SegmentCommitInfo, pro-rated by percentage of non-deleted documents if setCalibrateSizeByDeletes(boolean) is set.
- Throws:
IOException
-
sizeBytes
protected long sizeBytes(SegmentCommitInfo info, MergePolicy.MergeContext mergeContext) throws IOException
Return the byte size of the provided SegmentCommitInfo, pro-rated by percentage of non-deleted documents if setCalibrateSizeByDeletes(boolean) is set.
- Throws:
IOException
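The pro-rating applied by sizeDocs and sizeBytes can be sketched with plain arithmetic. This is an illustration of the documented behavior, not the library's code: it takes the document count, deleted-document count and byte size as plain parameters instead of reading them from a SegmentCommitInfo, and the class name CalibrateByDeletesSketch is hypothetical.

```java
final class CalibrateByDeletesSketch {
    /** Document count, reduced by the deleted documents when calibration is on. */
    static long sizeDocs(int maxDoc, int delCount, boolean calibrateSizeByDeletes) {
        return calibrateSizeByDeletes ? (long) maxDoc - delCount : maxDoc;
    }

    /** Byte size, scaled by the fraction of live documents when calibration is on. */
    static long sizeBytes(long byteSize, int maxDoc, int delCount, boolean calibrateSizeByDeletes) {
        if (!calibrateSizeByDeletes || maxDoc <= 0) {
            return byteSize;
        }
        double liveRatio = 1.0 - (double) delCount / maxDoc;
        return (long) (byteSize * liveRatio);
    }

    public static void main(String[] args) {
        // A 1000-byte segment with 100 documents, 25 of them deleted.
        System.out.println(sizeDocs(100, 25, true));         // prints 75
        System.out.println(sizeBytes(1000, 100, 25, true));  // prints 750
        System.out.println(sizeBytes(1000, 100, 25, false)); // prints 1000
    }
}
```

With calibration enabled, a segment carrying many deletes looks smaller to the policy and therefore lands in a lower level, which makes it a merge candidate sooner.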
-
isMerged
protected boolean isMerged(SegmentInfos infos, int maxNumSegments, Map<SegmentCommitInfo, Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) throws IOException
Returns true if the number of segments eligible for merging is less than or equal to the specified maxNumSegments.
- Throws:
IOException
-
findForcedMerges
public MergePolicy.MergeSpecification findForcedMerges(SegmentInfos infos, int maxNumSegments, Map<SegmentCommitInfo, Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) throws IOException
Returns the merges necessary to merge the index down to a specified number of segments. This respects the maxMergeSizeForForcedMerge setting. By default, and assuming maxNumSegments=1, only one segment will be left in the index, where that segment has no deletions pending nor separate norms, and it is in compound file format if the current useCompoundFile setting is true. This method returns multiple merges (mergeFactor at a time) so the MergeScheduler in use may make use of concurrency.
- Specified by:
findForcedMerges in class MergePolicy
- Parameters:
infos - the total set of segments in the index
maxNumSegments - requested maximum number of segments in the index
segmentsToMerge - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentInfo, that means this segment was an original segment present in the to-be-merged index; else, it was a segment produced by a cascaded merge.
mergeContext - the MergeContext to find the merges on
- Throws:
IOException
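The "mergeFactor at a time" batching mentioned for findForcedMerges can be pictured with index ranges. This is a simplified sketch under assumptions, not the method's actual logic: it ignores maxMergeSizeForForcedMerge and segmentsToMerge, works only on a segment count, and shows just the first round of full-size batches taken from the end of the segment list. Any remaining segments would need a further, smaller merge that is not shown. The names ForcedMergeBatches and firstRound are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ForcedMergeBatches {
    /**
     * Returns [start, end) index ranges, each covering mergeFactor adjacent
     * segments, taken from the end of the list for as long as a full batch
     * still helps to reach maxNumSegments.
     */
    static List<int[]> firstRound(int segmentCount, int maxNumSegments, int mergeFactor) {
        List<int[]> merges = new ArrayList<>();
        int last = segmentCount;
        while (last - maxNumSegments + 1 >= mergeFactor) {
            merges.add(new int[] {last - mergeFactor, last});
            last -= mergeFactor;
        }
        return merges;
    }

    public static void main(String[] args) {
        for (int[] range : firstRound(25, 1, 10)) {
            System.out.println("[" + range[0] + ", " + range[1] + ")");
        }
        // prints [15, 25) and then [5, 15)
    }
}
```

Because the two ranges do not overlap, a concurrent MergeScheduler could run both merges at the same time.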
-
findForcedDeletesMerges
public MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext) throws IOException
Finds merges necessary to force-merge all deletes from the index. We simply merge adjacent segments that have deletes, up to mergeFactor at a time.
- Specified by:
findForcedDeletesMerges in class MergePolicy
- Parameters:
segmentInfos - the total set of segments in the index
mergeContext - the MergeContext to find the merges on
- Throws:
IOException
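The grouping rule for findForcedDeletesMerges (adjacent segments that have deletes, up to mergeFactor at a time) can be sketched over a plain boolean array. This is an illustrative model, not the library's code: each array entry stands for one segment and says whether it has deletions, and the names ForcedDeletesSketch and plan are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ForcedDeletesSketch {
    /**
     * Returns [start, end) index ranges of adjacent segments with deletions.
     * A run is cut whenever it reaches mergeFactor segments or is interrupted
     * by a segment without deletions.
     */
    static List<int[]> plan(boolean[] hasDeletes, int mergeFactor) {
        List<int[]> merges = new ArrayList<>();
        int runStart = -1; // -1 means no run is currently open
        for (int i = 0; i < hasDeletes.length; i++) {
            if (hasDeletes[i]) {
                if (runStart == -1) {
                    runStart = i;
                } else if (i - runStart == mergeFactor) {
                    merges.add(new int[] {runStart, i}); // run is full, start a new one
                    runStart = i;
                }
            } else if (runStart != -1) {
                merges.add(new int[] {runStart, i}); // run ended by a clean segment
                runStart = -1;
            }
        }
        if (runStart != -1) {
            merges.add(new int[] {runStart, hasDeletes.length});
        }
        return merges;
    }

    public static void main(String[] args) {
        boolean[] hasDeletes = {true, true, false, true, true, true, true, false};
        for (int[] range : plan(hasDeletes, 3)) {
            System.out.println("[" + range[0] + ", " + range[1] + ")");
        }
        // prints [0, 2), [3, 6) and [6, 7)
    }
}
```

A range may contain a single segment, as in [6, 7): rewriting that one segment is still what removes its deleted documents.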
-
findMerges
public MergePolicy.MergeSpecification findMerges(MergeTrigger mergeTrigger, SegmentInfos infos, MergePolicy.MergeContext mergeContext) throws IOException
Checks if any merges are now necessary and returns a MergePolicy.MergeSpecification if so. A merge is necessary when there are more than setMergeFactor(int) segments at a given level. When multiple levels have too many segments, this method will return multiple merges, allowing the MergeScheduler to use concurrency.
- Specified by:
findMerges in class MergePolicy
- Parameters:
mergeTrigger - the event that triggered the merge
infos - the total set of segments in the index
mergeContext - the MergeContext to find the merges on
- Throws:
IOException
-
setMaxMergeDocs
public void setMaxMergeDocs(int maxMergeDocs)
Determines the largest segment (measured by document count) that may be merged with other segments. Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.
The default value is Integer.MAX_VALUE.
The default merge policy (LogByteSizeMergePolicy) also allows you to set this limit by net size (in MB) of the segment, using LogByteSizeMergePolicy.setMaxMergeMB(double).
-
getMaxMergeDocs
public int getMaxMergeDocs()
Returns the largest segment (measured by document count) that may be merged with other segments.
-
toString
-