org.apache.lucene.index
Class TieredMergePolicy

java.lang.Object
  extended by org.apache.lucene.index.MergePolicy
      extended by org.apache.lucene.index.TieredMergePolicy
All Implemented Interfaces:
Closeable

public class TieredMergePolicy
extends MergePolicy

Merges segments of approximately equal size, subject to an allowed number of segments per tier. This is similar to LogByteSizeMergePolicy, except this merge policy is able to merge non-adjacent segments, and separates how many segments are merged at once (setMaxMergeAtOnce(int)) from how many segments are allowed per tier (setSegmentsPerTier(double)). This merge policy also does not over-merge (i.e., cascade merges).

For normal merging, this policy first computes a "budget" of how many segments are allowed to be in the index. If the index is over-budget, then the policy sorts segments by decreasing size (pro-rating by percent deletes), and then finds the least-cost merge. Merge cost is measured by a combination of the "skew" of the merge (size of the largest segment divided by the size of the smallest segment), total merge size, and percent deletes reclaimed, so that merges with lower skew, smaller size, and more reclaimed deletes are favored.
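
The skew and pro-rating described above can be illustrated with a small standalone sketch. It does not use the Lucene API: the class MergeSkewSketch and its methods are hypothetical names chosen for illustration, and the real score(List, boolean, long) method may compute these quantities differently.

```java
import java.util.List;

/** Hypothetical illustration only; not part of Lucene. */
final class MergeSkewSketch {

    /** Size of a segment pro-rated by its percentage of deleted documents. */
    static double proRatedBytes(long sizeBytes, double pctDeleted) {
        return sizeBytes * (1.0 - pctDeleted / 100.0);
    }

    /** Skew as described in the text: largest segment size divided by smallest. */
    static double skew(List<Double> proRatedSizes) {
        double largest = Double.NEGATIVE_INFINITY;
        double smallest = Double.POSITIVE_INFINITY;
        for (double size : proRatedSizes) {
            largest = Math.max(largest, size);
            smallest = Math.min(smallest, size);
        }
        return largest / smallest;
    }
}
```

A candidate merge of equally sized segments has a skew of 1.0, the lowest possible value; merging one large segment with several tiny ones gives a high skew and is therefore less attractive.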

If a merge will produce a segment that's larger than setMaxMergedSegmentMB(double), then the policy will merge fewer segments (down to 1 at once, if that one has deletions) to keep the segment size under budget. NOTE: this policy freely merges non-adjacent segments; if this is a problem, use LogMergePolicy.

NOTE: This policy always merges by byte size of the segments, always pro-rates by percent deletes, and does not apply any maximum segment size during optimize (unlike LogByteSizeMergePolicy).

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
protected static class TieredMergePolicy.MergeScore
           
 
Nested classes/interfaces inherited from class org.apache.lucene.index.MergePolicy
MergePolicy.MergeAbortedException, MergePolicy.MergeException, MergePolicy.MergeSpecification, MergePolicy.OneMerge
 
Field Summary
 
Fields inherited from class org.apache.lucene.index.MergePolicy
writer
 
Constructor Summary
TieredMergePolicy()
           
 
Method Summary
 void close()
          Release all resources for the policy.
 MergePolicy.MergeSpecification findMerges(SegmentInfos infos)
          Determine what set of merge operations are now necessary on the index.
 MergePolicy.MergeSpecification findMergesForOptimize(SegmentInfos infos, int maxSegmentCount, Map<SegmentInfo,Boolean> segmentsToOptimize)
          Determine what set of merge operations is necessary in order to optimize the index.
 MergePolicy.MergeSpecification findMergesToExpungeDeletes(SegmentInfos infos)
          Determine what set of merge operations is necessary in order to expunge all deletes from the index.
 double getExpungeDeletesPctAllowed()
           
 double getFloorSegmentMB()
           
 int getMaxMergeAtOnce()
           
 int getMaxMergeAtOnceExplicit()
           
 double getMaxMergedSegmentMB()
           
 double getNoCFSRatio()
           
 double getReclaimDeletesWeight()
          See setReclaimDeletesWeight(double).
 double getSegmentsPerTier()
           
 boolean getUseCompoundFile()
           
protected  TieredMergePolicy.MergeScore score(List<SegmentInfo> candidate, boolean hitTooLarge, long mergingBytes)
          Expert: scores one merge; subclasses can override.
 TieredMergePolicy setExpungeDeletesPctAllowed(double v)
          When expungeDeletes is called, we only merge away a segment if its delete percentage is over this threshold.
 TieredMergePolicy setFloorSegmentMB(double v)
          Segments smaller than this are "rounded up" to this size, ie treated as equal (floor) size for merge selection.
 TieredMergePolicy setMaxMergeAtOnce(int v)
          Maximum number of segments to be merged at a time during "normal" merging.
 TieredMergePolicy setMaxMergeAtOnceExplicit(int v)
          Maximum number of segments to be merged at a time, during optimize or expungeDeletes.
 TieredMergePolicy setMaxMergedSegmentMB(double v)
          Maximum sized segment to produce during normal merging.
 TieredMergePolicy setNoCFSRatio(double noCFSRatio)
          If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled.
 TieredMergePolicy setReclaimDeletesWeight(double v)
          Controls how aggressively merges that reclaim more deletions are favored.
 TieredMergePolicy setSegmentsPerTier(double v)
          Sets the allowed number of segments per tier.
 TieredMergePolicy setUseCompoundFile(boolean useCompoundFile)
          Sets whether compound file format should be used for newly flushed and newly merged segments.
 String toString()
           
 boolean useCompoundFile(SegmentInfos infos, SegmentInfo mergedInfo)
          Returns true if a new segment (regardless of its origin) should use the compound file format.
 
Methods inherited from class org.apache.lucene.index.MergePolicy
setIndexWriter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TieredMergePolicy

public TieredMergePolicy()
Method Detail

setMaxMergeAtOnce

public TieredMergePolicy setMaxMergeAtOnce(int v)
Maximum number of segments to be merged at a time during "normal" merging. For explicit merging (e.g., when optimize or expungeDeletes is called), see setMaxMergeAtOnceExplicit(int). Default is 10.


getMaxMergeAtOnce

public int getMaxMergeAtOnce()
See Also:
setMaxMergeAtOnce(int)

setMaxMergeAtOnceExplicit

public TieredMergePolicy setMaxMergeAtOnceExplicit(int v)
Maximum number of segments to be merged at a time, during optimize or expungeDeletes. Default is 30.


getMaxMergeAtOnceExplicit

public int getMaxMergeAtOnceExplicit()
See Also:
setMaxMergeAtOnceExplicit(int)

setMaxMergedSegmentMB

public TieredMergePolicy setMaxMergedSegmentMB(double v)
Maximum sized segment to produce during normal merging. This setting is approximate: the estimate of the merged segment size is made by summing sizes of to-be-merged segments (compensating for percent deleted docs). Default is 5 GB.
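
As a rough illustration of the size estimate described above, the following standalone sketch sums the sizes of the segments to be merged, discounting each by its fraction of deleted documents. The class and method names are hypothetical and the code does not call into Lucene; the 5120.0 used in the note below is simply the 5 GB default expressed in MB.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class MergedSizeEstimateSketch {

    static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /** Sums segment sizes, each reduced by its fraction of deleted documents. */
    static double estimateMergedMB(long[] sizeBytes, double[] deletedFraction) {
        double totalBytes = 0.0;
        for (int i = 0; i < sizeBytes.length; i++) {
            totalBytes += sizeBytes[i] * (1.0 - deletedFraction[i]);
        }
        return totalBytes / BYTES_PER_MB;
    }

    /** True if the estimated merged segment would exceed the configured maximum. */
    static boolean exceedsMax(double estimatedMB, double maxMergedSegmentMB) {
        return estimatedMB > maxMergedSegmentMB;
    }
}
```

Because the estimate ignores how well the merged data compresses, a merged segment may end up somewhat larger or smaller than estimateMergedMB suggests, which is why the setting is only approximate.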


getMaxMergedSegmentMB

public double getMaxMergedSegmentMB()
See Also:
setMaxMergedSegmentMB(double)

setReclaimDeletesWeight

public TieredMergePolicy setReclaimDeletesWeight(double v)
Controls how aggressively merges that reclaim more deletions are favored. Higher values favor selecting merges that reclaim deletions. A value of 0.0 means deletions don't impact merge selection.
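
One way to picture the effect of this weight is shown in the sketch below. The formula is an assumption made for illustration (a base cost scaled by the surviving-document ratio raised to the weight, where lower cost is better); the text above does not specify how the weight is applied, and the names used are hypothetical.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class ReclaimWeightSketch {

    /**
     * Returns a merge cost where lower is better.
     * nonDeletedRatio is the fraction of documents that survive the merge, in (0, 1].
     * Assumed form: the more deletes a merge reclaims, the more its cost shrinks.
     */
    static double weightedCost(double baseCost, double nonDeletedRatio, double reclaimDeletesWeight) {
        return baseCost * Math.pow(nonDeletedRatio, reclaimDeletesWeight);
    }
}
```

With a weight of 0.0 the multiplier is always 1.0, so deletions have no influence on the cost; larger weights shrink the cost of merges that reclaim many deletions, which makes them more likely to be selected.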


getReclaimDeletesWeight

public double getReclaimDeletesWeight()
See setReclaimDeletesWeight(double).


setFloorSegmentMB

public TieredMergePolicy setFloorSegmentMB(double v)
Segments smaller than this are "rounded up" to this size, i.e., treated as equal (floor) size for merge selection. This prevents frequent flushing of tiny segments from creating a long tail of small segments in the index. Default is 2 MB.
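
The "rounding up" can be sketched as follows. This is a standalone illustration with hypothetical names, not Lucene code.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class FloorSizeSketch {

    /** Returns the size used for merge selection: never less than the floor. */
    static long flooredBytes(long sizeBytes, double floorSegmentMB) {
        long floorBytes = (long) (floorSegmentMB * 1024 * 1024);
        return Math.max(sizeBytes, floorBytes);
    }
}
```

With the default floor of 2 MB, a 10 KB segment and a 500 KB segment are both treated as 2 MB for selection purposes, so they look equally sized and can be merged together without penalty.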


getFloorSegmentMB

public double getFloorSegmentMB()
See Also:
setFloorSegmentMB(double)

setExpungeDeletesPctAllowed

public TieredMergePolicy setExpungeDeletesPctAllowed(double v)
When expungeDeletes is called, we only merge away a segment if its delete percentage is over this threshold. Default is 10%.
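
A minimal sketch of this threshold check, assuming the delete percentage is computed as deleted documents over total documents in the segment. The class and method names are hypothetical and the code is independent of Lucene.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class ExpungeThresholdSketch {

    /** True if the segment's delete percentage is strictly over the allowed threshold. */
    static boolean shouldMergeAway(int delCount, int docCount, double expungeDeletesPctAllowed) {
        double pctDeletes = 100.0 * delCount / docCount;
        return pctDeletes > expungeDeletesPctAllowed;
    }
}
```

Under the default of 10%, a segment with 25 deleted documents out of 100 would be merged away, while one with 5 out of 100 would be left alone.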


getExpungeDeletesPctAllowed

public double getExpungeDeletesPctAllowed()
See Also:
setExpungeDeletesPctAllowed(double)

setSegmentsPerTier

public TieredMergePolicy setSegmentsPerTier(double v)
Sets the allowed number of segments per tier. Smaller values mean more merging but fewer segments.

NOTE: this value should be >= the value passed to setMaxMergeAtOnce(int); otherwise you'll force too much merging to occur.

Default is 10.0.
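
To show why smaller values lead to fewer segments, the sketch below computes a segment "budget" from the total index size. The tiering rule used here (each tier holds segments maxMergeAtOnce times larger than the previous tier) is an assumption made for illustration, not something stated on this page, and the names are hypothetical.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class SegmentBudgetSketch {

    /** Counts how many segments are allowed when each tier holds up to segmentsPerTier segments. */
    static int allowedSegmentCount(long totalBytes, long smallestSegmentBytes,
                                   double segmentsPerTier, int maxMergeAtOnce) {
        if (smallestSegmentBytes <= 0 || maxMergeAtOnce < 2 || segmentsPerTier <= 0) {
            throw new IllegalArgumentException("sizes and limits must be positive, maxMergeAtOnce >= 2");
        }
        double levelSize = smallestSegmentBytes; // segment size on the current tier
        double bytesLeft = totalBytes;           // index bytes not yet assigned to a tier
        double allowed = 0;
        while (true) {
            double segCountLevel = bytesLeft / levelSize;
            if (segCountLevel < segmentsPerTier) {
                // Last, partially filled tier.
                allowed += Math.ceil(segCountLevel);
                break;
            }
            allowed += segmentsPerTier;
            bytesLeft -= segmentsPerTier * levelSize;
            levelSize *= maxMergeAtOnce;         // assumed growth factor between tiers
        }
        return (int) allowed;
    }
}
```

For a 1000-byte index whose smallest segment is 10 bytes, this sketch allows 19 segments with segmentsPerTier = 10.0 but only 11 with segmentsPerTier = 5.0, so the lower setting puts the index over budget sooner and triggers more merging.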


getSegmentsPerTier

public double getSegmentsPerTier()
See Also:
setSegmentsPerTier(double)

setUseCompoundFile

public TieredMergePolicy setUseCompoundFile(boolean useCompoundFile)
Sets whether compound file format should be used for newly flushed and newly merged segments. Default true.


getUseCompoundFile

public boolean getUseCompoundFile()
See Also:
setUseCompoundFile(boolean)

setNoCFSRatio

public TieredMergePolicy setNoCFSRatio(double noCFSRatio)
If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled. Set to 1.0 to always use CFS regardless of merge size. Default is 0.1.
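
The decision described above might look like the following sketch. It is a standalone illustration with hypothetical names; the actual useCompoundFile(SegmentInfos, SegmentInfo) method works on Lucene's segment metadata rather than on raw byte counts.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class NoCfsRatioSketch {

    /** Decides whether a merged segment should be written in compound file format. */
    static boolean useCompoundFile(boolean compoundEnabled, double noCFSRatio,
                                   long mergedBytes, long totalIndexBytes) {
        if (!compoundEnabled) {
            return false;               // compound files are switched off entirely
        }
        if (noCFSRatio >= 1.0) {
            return true;                // 1.0 means always use CFS, regardless of size
        }
        // Segments larger than the given fraction of the index stay non-compound.
        return mergedBytes <= noCFSRatio * totalIndexBytes;
    }
}
```

With the default ratio of 0.1, a merged segment of 50 bytes in a 1000-byte index would use the compound format, while a 500-byte segment would not.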


getNoCFSRatio

public double getNoCFSRatio()
See Also:
setNoCFSRatio(double)

findMerges

public MergePolicy.MergeSpecification findMerges(SegmentInfos infos)
                                          throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.

Specified by:
findMerges in class MergePolicy
Parameters:
infos - the total set of segments in the index
Throws:
IOException

score

protected TieredMergePolicy.MergeScore score(List<SegmentInfo> candidate,
                                             boolean hitTooLarge,
                                             long mergingBytes)
                                      throws IOException
Expert: scores one merge; subclasses can override.

Throws:
IOException

findMergesForOptimize

public MergePolicy.MergeSpecification findMergesForOptimize(SegmentInfos infos,
                                                            int maxSegmentCount,
                                                            Map<SegmentInfo,Boolean> segmentsToOptimize)
                                                     throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to optimize the index. IndexWriter calls this when its IndexWriter.optimize() method is called. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.

Specified by:
findMergesForOptimize in class MergePolicy
Parameters:
infos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToOptimize - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentInfo, that means this segment was an original segment present in the to-be-optimized index; else, it was a segment produced by a cascaded merge.
Throws:
IOException

findMergesToExpungeDeletes

public MergePolicy.MergeSpecification findMergesToExpungeDeletes(SegmentInfos infos)
                                                          throws CorruptIndexException,
                                                                 IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to expunge all deletes from the index.

Specified by:
findMergesToExpungeDeletes in class MergePolicy
Parameters:
infos - the total set of segments in the index
Throws:
CorruptIndexException
IOException

useCompoundFile

public boolean useCompoundFile(SegmentInfos infos,
                               SegmentInfo mergedInfo)
                        throws IOException
Description copied from class: MergePolicy
Returns true if a new segment (regardless of its origin) should use the compound file format.

Specified by:
useCompoundFile in class MergePolicy
Throws:
IOException

close

public void close()
Description copied from class: MergePolicy
Release all resources for the policy.

Specified by:
close in interface Closeable
Specified by:
close in class MergePolicy

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.