org.apache.lucene.index
Class BalancedSegmentMergePolicy

java.lang.Object
  extended by org.apache.lucene.index.MergePolicy
      extended by org.apache.lucene.index.LogMergePolicy
          extended by org.apache.lucene.index.LogByteSizeMergePolicy
              extended by org.apache.lucene.index.BalancedSegmentMergePolicy
All Implemented Interfaces:
Closeable

public class BalancedSegmentMergePolicy
extends LogByteSizeMergePolicy

Merge policy that tries to balance avoiding large segment merges against accumulating too many segments in the index, to provide better performance in a near real-time setting.

This is based on code from Zoie, described in more detail at http://code.google.com/p/zoie/wiki/ZoieMergePolicy.


Nested Class Summary
static class BalancedSegmentMergePolicy.MergePolicyParams
           
 
Nested classes/interfaces inherited from class org.apache.lucene.index.MergePolicy
MergePolicy.MergeAbortedException, MergePolicy.MergeException, MergePolicy.MergeSpecification, MergePolicy.OneMerge
 
Field Summary
static int DEFAULT_NUM_LARGE_SEGMENTS
           
 
Fields inherited from class org.apache.lucene.index.LogByteSizeMergePolicy
DEFAULT_MAX_MERGE_MB, DEFAULT_MAX_MERGE_MB_FOR_OPTIMIZE, DEFAULT_MIN_MERGE_MB
 
Fields inherited from class org.apache.lucene.index.LogMergePolicy
calibrateSizeByDeletes, DEFAULT_MAX_MERGE_DOCS, DEFAULT_MERGE_FACTOR, DEFAULT_NO_CFS_RATIO, LEVEL_LOG_SPAN, maxMergeDocs, maxMergeSize, maxMergeSizeForOptimize, mergeFactor, minMergeSize, noCFSRatio, useCompoundFile
 
Fields inherited from class org.apache.lucene.index.MergePolicy
writer
 
Constructor Summary
BalancedSegmentMergePolicy()
           
 
Method Summary
 MergePolicy.MergeSpecification findMerges(SegmentInfos infos)
          Checks if any merges are now necessary and returns a MergePolicy.MergeSpecification if so.
 MergePolicy.MergeSpecification findMergesForOptimize(SegmentInfos infos, int maxNumSegments, Map<SegmentInfo,Boolean> segmentsToOptimize)
          Returns the merges necessary to optimize the index.
 MergePolicy.MergeSpecification findMergesToExpungeDeletes(SegmentInfos infos)
          Finds merges necessary to expunge all deletes from the index.
 int getMaxSmallSegments()
           
 int getNumLargeSegments()
           
 boolean getPartialExpunge()
           
 void setMaxSmallSegments(int maxSmallSegments)
           
 void setMergeFactor(int mergeFactor)
          Determines how often segment indices are merged by addDocument().
 void setMergePolicyParams(BalancedSegmentMergePolicy.MergePolicyParams params)
           
 void setNumLargeSegments(int numLargeSegments)
           
 void setPartialExpunge(boolean doPartialExpunge)
           
protected  long size(SegmentInfo info)
           
 
Methods inherited from class org.apache.lucene.index.LogByteSizeMergePolicy
getMaxMergeMB, getMaxMergeMBForOptimize, getMinMergeMB, setMaxMergeMB, setMaxMergeMBForOptimize, setMinMergeMB
 
Methods inherited from class org.apache.lucene.index.LogMergePolicy
close, getCalibrateSizeByDeletes, getMaxMergeDocs, getMergeFactor, getNoCFSRatio, getUseCompoundFile, isOptimized, isOptimized, message, setCalibrateSizeByDeletes, setMaxMergeDocs, setNoCFSRatio, setUseCompoundFile, sizeBytes, sizeDocs, toString, useCompoundFile, verbose
 
Methods inherited from class org.apache.lucene.index.MergePolicy
setIndexWriter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_NUM_LARGE_SEGMENTS

public static final int DEFAULT_NUM_LARGE_SEGMENTS
See Also:
Constant Field Values
Constructor Detail

BalancedSegmentMergePolicy

public BalancedSegmentMergePolicy()
Method Detail

setMergePolicyParams

public void setMergePolicyParams(BalancedSegmentMergePolicy.MergePolicyParams params)

size

protected long size(SegmentInfo info)
             throws IOException
Overrides:
size in class LogByteSizeMergePolicy
Throws:
IOException

setPartialExpunge

public void setPartialExpunge(boolean doPartialExpunge)

getPartialExpunge

public boolean getPartialExpunge()

setNumLargeSegments

public void setNumLargeSegments(int numLargeSegments)

getNumLargeSegments

public int getNumLargeSegments()

setMaxSmallSegments

public void setMaxSmallSegments(int maxSmallSegments)

getMaxSmallSegments

public int getMaxSmallSegments()

setMergeFactor

public void setMergeFactor(int mergeFactor)
Description copied from class: LogMergePolicy
Determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Overrides:
setMergeFactor in class LogMergePolicy
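
The trade-off described above can be illustrated with a toy model. The sketch below is not Lucene code: the class MergeFactorSketch and its simulate method are hypothetical, and the model assumes equal-sized flushes where every group of mergeFactor same-level segments is merged into one segment at the next level. It counts how many merges run (a rough proxy for indexing cost) and the peak number of live segments (a rough proxy for search cost).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of level-based merging; not part of Lucene.
class MergeFactorSketch {

    // Simulates `flushes` equal-sized flushes and returns
    // {number of merges performed, peak number of live segments}.
    static int[] simulate(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> perLevel = new ArrayList<>(); // perLevel.get(i) = segments at level i
        int merges = 0;
        int peak = 0;
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (perLevel.size() <= level) {
                    perLevel.add(0);
                }
                int count = perLevel.get(level) + 1;
                if (count < mergeFactor) {
                    perLevel.set(level, count);
                    break;
                }
                // mergeFactor segments at this level: merge them into one
                // segment and carry it up to the next level.
                perLevel.set(level, 0);
                merges++;
                level++;
            }
            int total = 0;
            for (int c : perLevel) {
                total += c;
            }
            peak = Math.max(peak, total);
        }
        return new int[] {merges, peak};
    }

    public static void main(String[] args) {
        for (int mf : new int[] {2, 10, 30}) {
            int[] r = simulate(1000, mf);
            System.out.println("mergeFactor=" + mf + " merges=" + r[0] + " peakSegments=" + r[1]);
        }
    }
}
```

In this model, 1000 flushes trigger 994 merges at mergeFactor 2 but only 111 at mergeFactor 10, while the peak segment count rises from 9 to 27. That matches the direction of the guidance above: larger values favor indexing speed, smaller values favor search on an unoptimized index.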

findMergesForOptimize

public MergePolicy.MergeSpecification findMergesForOptimize(SegmentInfos infos,
                                                            int maxNumSegments,
                                                            Map<SegmentInfo,Boolean> segmentsToOptimize)
                                                     throws IOException
Description copied from class: LogMergePolicy
Returns the merges necessary to optimize the index. This merge policy defines "optimized" to mean that only the requested number of segments remains in the index, and it respects the LogMergePolicy.maxMergeSizeForOptimize setting. By default, and assuming maxNumSegments=1, only one segment will be left in the index; that segment has neither pending deletions nor separate norms, and it is in compound file format if the current useCompoundFile setting is true. This method returns multiple merges (mergeFactor at a time) so that the MergeScheduler in use can run them concurrently.

Overrides:
findMergesForOptimize in class LogMergePolicy
Parameters:
infos - the total set of segments in the index
maxNumSegments - requested maximum number of segments in the index (currently this is always 1)
segmentsToOptimize - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is true for a given SegmentInfo, that segment was an original segment present in the to-be-optimized index; otherwise, it was a segment produced by a cascaded merge.
Throws:
IOException
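
As a rough illustration of "multiple merges (mergeFactor at a time)", the sketch below batches segment positions into full groups of mergeFactor, working back from the end of the segment list. It is a simplified assumption about the batching step in the inherited description, not this class's actual code; OptimizeSketch and fullMerges are hypothetical names, and segments are represented only by their index positions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of batching an optimize into full-size merges; not Lucene code.
class OptimizeSketch {

    // Returns [start, end) index ranges, each covering exactly mergeFactor
    // segments, that could be merged concurrently in one round.
    static List<int[]> fullMerges(int numSegments, int maxNumSegments, int mergeFactor) {
        if (mergeFactor < 2 || maxNumSegments < 1) {
            throw new IllegalArgumentException("need mergeFactor >= 2 and maxNumSegments >= 1");
        }
        List<int[]> merges = new ArrayList<>();
        int last = numSegments;
        // Keep enrolling full batches while enough segments remain above the target.
        while (last - maxNumSegments + 1 >= mergeFactor) {
            merges.add(new int[] {last - mergeFactor, last});
            last -= mergeFactor;
        }
        return merges;
    }

    public static void main(String[] args) {
        for (int[] m : fullMerges(25, 1, 10)) {
            System.out.println("merge segments [" + m[0] + ", " + m[1] + ")");
        }
    }
}
```

With 25 segments, maxNumSegments=1 and mergeFactor=10, this yields two independent batches, [15, 25) and [5, 15). In this sketch the leftover segments and the merged results are assumed to be picked up by a later, cascaded round.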

findMergesToExpungeDeletes

public MergePolicy.MergeSpecification findMergesToExpungeDeletes(SegmentInfos infos)
                                                          throws CorruptIndexException,
                                                                 IOException
Description copied from class: LogMergePolicy
Finds merges necessary to expunge all deletes from the index. We simply merge adjacent segments that have deletes, up to mergeFactor at a time.

Overrides:
findMergesToExpungeDeletes in class LogMergePolicy
Parameters:
infos - the total set of segments in the index
Throws:
CorruptIndexException
IOException
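
The inherited description ("merge adjacent segments that have deletes, up to mergeFactor at a time") can be sketched as a single pass over the segment list. This is an illustration of that description only; this class overrides the method and may behave differently, for example when partial expunge is enabled. ExpungeSketch and plan are hypothetical names, and each segment is reduced to a boolean saying whether it has deletions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of grouping adjacent segments with deletions; not Lucene code.
class ExpungeSketch {

    // Returns [start, end) index ranges of adjacent segments that have
    // deletions, with each range holding at most mergeFactor segments.
    static List<int[]> plan(boolean[] hasDeletes, int mergeFactor) {
        List<int[]> merges = new ArrayList<>();
        int runStart = -1; // start of the current run of segments with deletions
        for (int i = 0; i < hasDeletes.length; i++) {
            if (hasDeletes[i]) {
                if (runStart == -1) {
                    runStart = i;
                } else if (i - runStart == mergeFactor) {
                    // The run already holds mergeFactor segments: close it, start a new one.
                    merges.add(new int[] {runStart, i});
                    runStart = i;
                }
            } else if (runStart != -1) {
                // A segment without deletions ends the current run.
                merges.add(new int[] {runStart, i});
                runStart = -1;
            }
        }
        if (runStart != -1) {
            merges.add(new int[] {runStart, hasDeletes.length});
        }
        return merges;
    }

    public static void main(String[] args) {
        boolean[] segs = {true, true, true, false, true, false, false, true, true};
        for (int[] m : plan(segs, 2)) {
            System.out.println("merge segments [" + m[0] + ", " + m[1] + ")");
        }
    }
}
```

A range may contain a single segment; in this sketch that stands for rewriting the segment on its own so that its deleted documents are dropped.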

findMerges

public MergePolicy.MergeSpecification findMerges(SegmentInfos infos)
                                          throws IOException
Description copied from class: LogMergePolicy
Checks if any merges are now necessary and returns a MergePolicy.MergeSpecification if so. A merge is necessary when there are more than LogMergePolicy.setMergeFactor(int) segments at a given level. When multiple levels have too many segments, this method will return multiple merges, allowing the MergeScheduler to use concurrency.

Overrides:
findMerges in class LogMergePolicy
Parameters:
infos - the total set of segments in the index
Throws:
IOException
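
To make the notion of "level" in the inherited description concrete, the sketch below buckets segments by size and reports which levels are due for a merge. It rests on several assumptions that are not taken from this page: a level is the number of times a segment's size can be divided by mergeFactor, a level is due once it holds at least mergeFactor segments, and segment adjacency is ignored. LevelSketch, levelOf and levelsNeedingMerge are hypothetical names. The description is copied from LogMergePolicy, so the override in this class may select merges differently.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of size-based levels; not Lucene code.
class LevelSketch {

    // Level = how many times the size can be divided by mergeFactor (integer log).
    static int levelOf(long size, int mergeFactor) {
        int level = 0;
        while (size >= mergeFactor) {
            size /= mergeFactor;
            level++;
        }
        return level;
    }

    // Returns level -> segment count, keeping only levels that hold
    // at least mergeFactor segments.
    static Map<Integer, Integer> levelsNeedingMerge(long[] sizes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        for (long s : sizes) {
            counts.merge(levelOf(s, mergeFactor), 1, Integer::sum);
        }
        counts.values().removeIf(c -> c < mergeFactor);
        return counts;
    }

    public static void main(String[] args) {
        long[] sizes = {1, 2, 2, 4, 5, 10};
        System.out.println(levelsNeedingMerge(sizes, 3));
    }
}
```

When more than one level qualifies, each entry in the returned map corresponds to a merge that could be handed to the MergeScheduler independently.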


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.