public class TieredMergePolicy extends MergePolicy
Similar to LogByteSizeMergePolicy, except this merge policy is able to merge non-adjacent segments, and it separates how many segments are merged at once (setMaxMergeAtOnce(int)) from how many segments are allowed per tier (setSegmentsPerTier(double)). This merge policy also does not over-merge (i.e. cascade merges).
For normal merging, this policy first computes a "budget" of how many segments are allowed to be in the index. If the index is over budget, the policy sorts segments by decreasing size (pro-rating by percent deletes) and then finds the least-cost merge. Merge cost is measured by a combination of the "skew" of the merge (size of the largest segment divided by the size of the smallest segment), total merge size, and percent deletes reclaimed, so that merges with lower skew, smaller size, and more reclaimed deletes are favored.
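The cost described above can be sketched in plain Java. This is only an illustration of the idea, not the policy's actual code: the class name MergeCostSketch, the array-based inputs, and the 0.05 exponent on merge size are assumptions made for the example.

```java
/**
 * Illustrative sketch only; not the real TieredMergePolicy scoring code.
 * Lower cost means a more attractive merge.
 */
final class MergeCostSketch {

    /**
     * @param sizeBytes            size of each candidate segment (must be > 0)
     * @param deletedFraction      fraction of deleted docs per segment, in [0, 1)
     * @param reclaimDeletesWeight how strongly reclaimed deletes lower the cost
     */
    static double cost(long[] sizeBytes, double[] deletedFraction, double reclaimDeletesWeight) {
        long largest = Long.MIN_VALUE;
        long smallest = Long.MAX_VALUE;
        double totalBefore = 0;
        double totalAfter = 0;
        for (int i = 0; i < sizeBytes.length; i++) {
            largest = Math.max(largest, sizeBytes[i]);
            smallest = Math.min(smallest, sizeBytes[i]);
            totalBefore += sizeBytes[i];
            // Pro-rate by percent deletes: only live data survives the merge.
            totalAfter += sizeBytes[i] * (1.0 - deletedFraction[i]);
        }
        // Skew as described in the text: largest segment divided by smallest.
        double skew = (double) largest / smallest;
        // Fraction of bytes that remain after deletes are reclaimed.
        double liveRatio = totalAfter / totalBefore;
        // Assumed combination: mild preference for smaller merges (exponent 0.05),
        // tunable preference for merges that reclaim more deletes.
        return skew * Math.pow(totalAfter, 0.05) * Math.pow(liveRatio, reclaimDeletesWeight);
    }
}
```

With this sketch, a merge of three equally sized segments costs less than a merge of one large and two tiny segments, and a merge that reclaims half of its bytes costs less than an otherwise identical merge with no deletes.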
If a merge will produce a segment that's larger than setMaxMergedSegmentMB(double), then the policy will merge fewer segments (down to 1 at once, if that one has deletions) to keep the segment size under budget.
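One way to picture the size cap is a greedy pass over the size-sorted segments that skips any segment that would push the merged result over the limit. The sketch below assumes that approach; SizeCappedCandidateSketch and its parameters are hypothetical names, and the real policy may select candidates differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the real TieredMergePolicy selection code. */
final class SizeCappedCandidateSketch {

    /**
     * Picks up to maxMergeAtOnce segments, starting at index start, whose
     * combined size stays at or below maxMergedBytes.
     *
     * @param sortedDescBytes segment sizes, sorted by decreasing size
     */
    static List<Long> pick(long[] sortedDescBytes, int start, int maxMergeAtOnce, long maxMergedBytes) {
        List<Long> candidate = new ArrayList<>();
        long total = 0;
        for (int i = start; i < sortedDescBytes.length && candidate.size() < maxMergeAtOnce; i++) {
            long size = sortedDescBytes[i];
            if (total + size > maxMergedBytes) {
                // This segment would exceed the cap: skip it and try smaller ones.
                continue;
            }
            candidate.add(size);
            total += size;
        }
        return candidate;
    }
}
```

For sizes 600, 500, 300, 100, 50 with a cap of 1000 and at most 3 segments per merge, the sketch selects 600, 300 and 100 and skips the 500 segment.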
NOTE: this policy freely merges non-adjacent segments; if this is a problem, use LogMergePolicy.
NOTE: This policy always merges by byte size of the segments, always pro-rates by percent deletes, and does not apply any maximum segment size during forceMerge (unlike LogByteSizeMergePolicy).
Modifier and Type | Class and Description |
---|---|
protected static class | TieredMergePolicy.MergeScore - Holds score and explanation for a single candidate merge. |
Nested classes inherited from class MergePolicy: MergePolicy.MergeAbortedException, MergePolicy.MergeException, MergePolicy.MergeSpecification, MergePolicy.OneMerge
Fields inherited from class MergePolicy: writer
Constructor and Description |
---|
TieredMergePolicy() |
Modifier and Type | Method and Description |
---|---|
void | close() - Release all resources for the policy. |
MergePolicy.MergeSpecification | findForcedDeletesMerges(SegmentInfos infos) - Determine what set of merge operations is necessary in order to expunge all deletes from the index. |
MergePolicy.MergeSpecification | findForcedMerges(SegmentInfos infos, int maxSegmentCount, Map<SegmentInfo,Boolean> segmentsToMerge) - Determine what set of merge operations is necessary in order to merge to <= the specified segment count. |
MergePolicy.MergeSpecification | findMerges(SegmentInfos infos) - Determine what set of merge operations are now necessary on the index. |
double | getFloorSegmentMB() |
double | getForceMergeDeletesPctAllowed() |
int | getMaxMergeAtOnce() |
int | getMaxMergeAtOnceExplicit() |
double | getMaxMergedSegmentMB() |
double | getNoCFSRatio() |
double | getReclaimDeletesWeight() |
double | getSegmentsPerTier() |
boolean | getUseCompoundFile() |
protected TieredMergePolicy.MergeScore | score(List<SegmentInfo> candidate, boolean hitTooLarge, long mergingBytes) - Expert: scores one merge; subclasses can override. |
TieredMergePolicy | setFloorSegmentMB(double v) - Segments smaller than this are "rounded up" to this size, i.e. treated as equal (floor) size for merge selection. |
TieredMergePolicy | setForceMergeDeletesPctAllowed(double v) - When forceMergeDeletes is called, we only merge away a segment if its delete percentage is over this threshold. |
TieredMergePolicy | setMaxMergeAtOnce(int v) - Maximum number of segments to be merged at a time during "normal" merging. |
TieredMergePolicy | setMaxMergeAtOnceExplicit(int v) - Maximum number of segments to be merged at a time, during forceMerge or forceMergeDeletes. |
TieredMergePolicy | setMaxMergedSegmentMB(double v) - Maximum sized segment to produce during normal merging. |
TieredMergePolicy | setNoCFSRatio(double noCFSRatio) - If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled. |
TieredMergePolicy | setReclaimDeletesWeight(double v) - Controls how aggressively merges that reclaim more deletions are favored. |
TieredMergePolicy | setSegmentsPerTier(double v) - Sets the allowed number of segments per tier. |
TieredMergePolicy | setUseCompoundFile(boolean useCompoundFile) - Sets whether compound file format should be used for newly flushed and newly merged segments. |
String | toString() |
boolean | useCompoundFile(SegmentInfos infos, SegmentInfo mergedInfo) - Returns true if a new segment (regardless of its origin) should use the compound file format. |
Methods inherited from class MergePolicy: setIndexWriter
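public TieredMergePolicy setMaxMergeAtOnce(int v)
Maximum number of segments to be merged at a time during "normal" merging. For explicit merging (forceMerge or forceMergeDeletes), see setMaxMergeAtOnceExplicit(int). Default is 10.
public int getMaxMergeAtOnce()
See also: setMaxMergeAtOnce(int)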
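public TieredMergePolicy setMaxMergeAtOnceExplicit(int v)
Maximum number of segments to be merged at a time, during forceMerge or forceMergeDeletes.
public int getMaxMergeAtOnceExplicit()
See also: setMaxMergeAtOnceExplicit(int)
public TieredMergePolicy setMaxMergedSegmentMB(double v)
Maximum sized segment to produce during normal merging.
public double getMaxMergedSegmentMB()
See also: setMaxMergedSegmentMB(double)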
public TieredMergePolicy setReclaimDeletesWeight(double v)
Controls how aggressively merges that reclaim more deletions are favored.
public double getReclaimDeletesWeight()
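public TieredMergePolicy setFloorSegmentMB(double v)
Segments smaller than this are "rounded up" to this size, i.e. treated as equal (floor) size for merge selection.
public double getFloorSegmentMB()
See also: setFloorSegmentMB(double)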
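The effect of the floor, together with the pro-rating by percent deletes mentioned in the class description, can be sketched as two small helpers. The names EffectiveSizeSketch, proRatedBytes and flooredBytes are hypothetical and chosen for this example only; sizes are given in bytes rather than MB to keep the arithmetic exact.

```java
/** Illustrative sketch only; not the real TieredMergePolicy size computation. */
final class EffectiveSizeSketch {

    /** Scales a segment's size by the fraction of its documents that are still live. */
    static long proRatedBytes(long sizeBytes, int docCount, int deletedDocs) {
        if (docCount == 0) {
            return 0L; // empty segment: nothing live to keep
        }
        double liveFraction = 1.0 - (double) deletedDocs / docCount;
        return (long) (sizeBytes * liveFraction);
    }

    /** Rounds small segments up to the floor so that tiny segments compare as equal. */
    static long flooredBytes(long bytes, long floorBytes) {
        return Math.max(bytes, floorBytes);
    }
}
```

Under this sketch, segments of 100 and 900 bytes both count as 2048 bytes when the floor is 2048, so merge selection treats them as the same size.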
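public TieredMergePolicy setForceMergeDeletesPctAllowed(double v)
When forceMergeDeletes is called, a segment is merged away only if its delete percentage is over this threshold.
public double getForceMergeDeletesPctAllowed()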
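A minimal sketch of the threshold check, assuming the delete percentage is computed as deleted documents over total documents in the segment. ForceMergeDeletesSketch and its parameters are hypothetical names for illustration.

```java
/** Illustrative sketch only; not the real TieredMergePolicy eligibility check. */
final class ForceMergeDeletesSketch {

    /** Returns true if the segment's delete percentage is strictly over pctAllowed. */
    static boolean eligible(int docCount, int deletedDocs, double pctAllowed) {
        if (docCount == 0) {
            return false; // nothing to reclaim from an empty segment
        }
        double pctDeletes = 100.0 * deletedDocs / docCount;
        return pctDeletes > pctAllowed;
    }
}
```

With a threshold of 10.0, a segment with 15% deletes would be merged away, while segments with 5% or exactly 10% deletes would be left alone.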
public TieredMergePolicy setSegmentsPerTier(double v)
Sets the allowed number of segments per tier.
NOTE: this value should be >= the setMaxMergeAtOnce(int) value; otherwise you'll force too much merging to occur.
Default is 10.0.
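To see how segments per tier feeds into the segment "budget" from the class description, consider the following sketch. It assumes each tier holds segments roughly maxMergeAtOnce times larger than the tier below it, starting from the floored size of the smallest segment; SegmentBudgetSketch and its parameters are hypothetical names, and the real computation may differ.

```java
/** Illustrative sketch only; not the real TieredMergePolicy budget computation. */
final class SegmentBudgetSketch {

    /**
     * Estimates how many segments an index of the given size may hold.
     * All sizes must be positive and share one unit (bytes or MB).
     */
    static int allowedSegments(long totalIndexSize, long smallestSegmentSize, long floorSegmentSize,
                               double segmentsPerTier, int maxMergeAtOnce) {
        // The lowest tier starts at the smallest segment, rounded up to the floor.
        double levelSize = Math.max(smallestSegmentSize, floorSegmentSize);
        double sizeLeft = totalIndexSize;
        double allowed = 0;
        while (true) {
            double segmentsAtLevel = sizeLeft / levelSize;
            if (segmentsAtLevel < segmentsPerTier) {
                // Top tier: whatever is left fits in a partially filled tier.
                allowed += Math.ceil(segmentsAtLevel);
                break;
            }
            // A full tier: count it, then move up to the next, larger tier.
            allowed += segmentsPerTier;
            sizeLeft -= segmentsPerTier * levelSize;
            levelSize *= maxMergeAtOnce;
        }
        return (int) allowed;
    }
}
```

For a 1000 MB index with a 2 MB floor and maxMergeAtOnce of 10, this sketch allows 24 segments at 10.0 segments per tier but only 15 at 5.0, so a lower setting puts the index over budget sooner and triggers more merging.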
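public double getSegmentsPerTier()
See also: setSegmentsPerTier(double)
public TieredMergePolicy setUseCompoundFile(boolean useCompoundFile)
Sets whether compound file format should be used for newly flushed and newly merged segments.
public boolean getUseCompoundFile()
See also: setUseCompoundFile(boolean)
public TieredMergePolicy setNoCFSRatio(double noCFSRatio)
If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled.
public double getNoCFSRatio()
See also: setNoCFSRatio(double)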
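The interaction between the compound-file flag and the no-CFS ratio can be sketched as below. The sketch assumes the ratio is a fraction between 0 and 1 of the total index size; CompoundFileSketch and its parameters are hypothetical names for illustration.

```java
/** Illustrative sketch only; not the real TieredMergePolicy.useCompoundFile logic. */
final class CompoundFileSketch {

    /**
     * @param enabled         value passed to setUseCompoundFile(boolean)
     * @param noCFSRatio      value passed to setNoCFSRatio(double), assumed to be a fraction in [0, 1]
     * @param mergedBytes     size of the new segment
     * @param totalIndexBytes total size of the index
     */
    static boolean useCompoundFile(boolean enabled, double noCFSRatio, long mergedBytes, long totalIndexBytes) {
        if (!enabled) {
            return false; // compound files are switched off entirely
        }
        // Segments that are too large a share of the index stay non-compound.
        return mergedBytes <= noCFSRatio * totalIndexBytes;
    }
}
```

With a ratio of 0.1 and a 1000-byte index, a 50-byte segment would use the compound format while a 200-byte segment would not.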
public MergePolicy.MergeSpecification findMerges(SegmentInfos infos) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations are now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
Specified by: findMerges in class MergePolicy
Parameters: infos - the total set of segments in the index
Throws: IOException
protected TieredMergePolicy.MergeScore score(List<SegmentInfo> candidate, boolean hitTooLarge, long mergingBytes) throws IOException
Expert: scores one merge; subclasses can override.
Throws: IOException
public MergePolicy.MergeSpecification findForcedMerges(SegmentInfos infos, int maxSegmentCount, Map<SegmentInfo,Boolean> segmentsToMerge) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge(int) method is called. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
Specified by: findForcedMerges in class MergePolicy
Parameters:
infos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentInfo, that means this segment was an original segment present in the to-be-merged index; else, it was a segment produced by a cascaded merge.
Throws: IOException
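public MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos infos) throws CorruptIndexException, IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to expunge all deletes from the index.
Specified by: findForcedDeletesMerges in class MergePolicy
Parameters: infos - the total set of segments in the index
Throws: CorruptIndexException, IOException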
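public boolean useCompoundFile(SegmentInfos infos, SegmentInfo mergedInfo) throws IOException
Description copied from class: MergePolicy
Returns true if a new segment (regardless of its origin) should use the compound file format.
Specified by: useCompoundFile in class MergePolicy
Throws: IOException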
public void close()
Description copied from class: MergePolicy
Release all resources for the policy.
Specified by: close in interface Closeable
Specified by: close in class MergePolicy