org.apache.lucene.index
Class MergePolicy

java.lang.Object
  extended by org.apache.lucene.index.MergePolicy
All Implemented Interfaces:
Closeable, Cloneable
Direct Known Subclasses:
LogMergePolicy, NoMergePolicy, TieredMergePolicy, UpgradeIndexMergePolicy

public abstract class MergePolicy
extends Object
implements Closeable, Cloneable

Expert: a MergePolicy determines the sequence of primitive merge operations.

Whenever IndexWriter alters the segments in an index, whether through the addition of a newly flushed segment, the addition of many segments from addIndexes* calls, or a previous merge that may now need to cascade, it invokes findMerges(org.apache.lucene.index.MergePolicy.MergeTrigger, org.apache.lucene.index.SegmentInfos) to give the MergePolicy a chance to pick merges that are now required. This method returns a MergePolicy.MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When IndexWriter.forceMerge is called, it calls findForcedMerges(SegmentInfos,int,Map) and the MergePolicy should then return the necessary merges.

Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially but if it is using ConcurrentMergeScheduler they will be run concurrently.
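
The contract described above (return a specification listing one or more merges, or null when nothing needs merging) can be sketched without Lucene on the classpath. This is only an illustration: ToyMergeSketch, its findMerges method, the mergeFactor parameter, and the rule "merge every complete run of mergeFactor adjacent segments" are hypothetical stand-ins, not Lucene classes and not the selection algorithm of any real policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, Lucene-free model of the findMerges contract. */
final class ToyMergeSketch {

    /**
     * Returns a list of merges (each merge is a list of segment names),
     * or null when no merge is necessary, mirroring the documented contract.
     * The grouping rule is an assumption made for illustration only.
     */
    static List<List<String>> findMerges(List<String> segments, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<List<String>> spec = new ArrayList<>();
        // Take each complete run of mergeFactor adjacent segments as one merge.
        for (int start = 0; start + mergeFactor <= segments.size(); start += mergeFactor) {
            spec.add(new ArrayList<>(segments.subList(start, start + mergeFactor)));
        }
        // null, not an empty list, signals that there is nothing to do.
        return spec.isEmpty() ? null : spec;
    }

    public static void main(String[] args) {
        List<String> segments = Arrays.asList("_0", "_1", "_2", "_3", "_4");
        // Several merges may be returned at once; the scheduler decides
        // whether they run one after another or concurrently.
        System.out.println(findMerges(segments, 2));
        // Too few segments for a single group: the result is null.
        System.out.println(findMerges(segments, 10));
    }
}
```

Returning several merges in one specification is what lets a concurrent scheduler run them in parallel, while a serial scheduler simply works through the same list in order.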

The default MergePolicy is TieredMergePolicy.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
static class MergePolicy.DocMap
          A map of doc IDs.
static class MergePolicy.MergeAbortedException
          Thrown when a merge was explicitly aborted because IndexWriter.close(boolean) was called with false.
static class MergePolicy.MergeException
          Exception thrown if there are any problems while executing a merge.
static class MergePolicy.MergeSpecification
          A MergeSpecification instance provides the information necessary to perform multiple merges.
static class MergePolicy.MergeTrigger
          MergeTrigger is passed to findMerges(MergeTrigger, SegmentInfos) to indicate the event that triggered the merge.
static class MergePolicy.OneMerge
          OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment.
 
Field Summary
protected static long DEFAULT_MAX_CFS_SEGMENT_SIZE
          Default max segment size in order to use compound file system.
protected static double DEFAULT_NO_CFS_RATIO
          Default ratio for compound file system usage.
protected  long maxCFSSegmentSize
          If the size of the merged segment exceeds this value then it will not use compound file format.
protected  double noCFSRatio
          If the size of the merged segment exceeds this ratio of the total index size then it will remain in non-compound format.
protected  SetOnce<IndexWriter> writer
          IndexWriter that contains this instance.
 
Constructor Summary
  MergePolicy()
          Creates a new merge policy instance.
protected MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize)
          Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize.
 
Method Summary
 MergePolicy clone()
           
abstract  void close()
          Release all resources for the policy.
abstract  MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos)
          Determine what set of merge operations is necessary in order to expunge all deletes from the index.
abstract  MergePolicy.MergeSpecification findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge)
          Determine what set of merge operations is necessary in order to merge to <= the specified segment count.
abstract  MergePolicy.MergeSpecification findMerges(MergePolicy.MergeTrigger mergeTrigger, SegmentInfos segmentInfos)
          Determine what set of merge operations is now necessary on the index.
 double getMaxCFSSegmentSizeMB()
          Returns the largest size allowed for a compound file segment.
 double getNoCFSRatio()
          Returns current noCFSRatio.
protected  boolean isMerged(SegmentInfos infos, SegmentCommitInfo info)
          Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting).
 void setIndexWriter(IndexWriter writer)
          Sets the IndexWriter to be used by this merge policy.
 void setMaxCFSSegmentSizeMB(double v)
          If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled.
 void setNoCFSRatio(double noCFSRatio)
          If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled.
protected  long size(SegmentCommitInfo info)
          Returns the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
 boolean useCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo)
          Returns true if a new segment (regardless of its origin) should use the compound file format.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_NO_CFS_RATIO

protected static final double DEFAULT_NO_CFS_RATIO
Default ratio for compound file system usage. Set to 1.0, which means the compound file system is always used.

See Also:
Constant Field Values

DEFAULT_MAX_CFS_SEGMENT_SIZE

protected static final long DEFAULT_MAX_CFS_SEGMENT_SIZE
Default max segment size in order to use compound file system. Set to Long.MAX_VALUE.

See Also:
Constant Field Values

writer

protected SetOnce<IndexWriter> writer
IndexWriter that contains this instance.


noCFSRatio

protected double noCFSRatio
If the size of the merged segment exceeds this ratio of the total index size then it will remain in non-compound format.


maxCFSSegmentSize

protected long maxCFSSegmentSize
If the size of the merged segment exceeds this value then it will not use compound file format.

Constructor Detail

MergePolicy

public MergePolicy()
Creates a new merge policy instance. Note that if you intend to use it without passing it to IndexWriter, you should call setIndexWriter(IndexWriter).


MergePolicy

protected MergePolicy(double defaultNoCFSRatio,
                      long defaultMaxCFSSegmentSize)
Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize. This constructor should be used by subclasses whose defaults differ from those of MergePolicy.

Method Detail

clone

public MergePolicy clone()
Overrides:
clone in class Object

setIndexWriter

public void setIndexWriter(IndexWriter writer)
Sets the IndexWriter to be used by this merge policy. This method may be called only once, and it is usually called by IndexWriter. If it is called more than once, SetOnce.AlreadySetException is thrown.

See Also:
SetOnce
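
The call-once rule can be illustrated with a small, Lucene-free holder. This is a sketch under stated assumptions, not Lucene's SetOnce implementation: SetOnceSketch is a hypothetical class, and IllegalStateException is used here as a stand-in for SetOnce.AlreadySetException.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a set-once holder; not Lucene's SetOnce class. */
final class SetOnceSketch<T> {
    private final AtomicBoolean assigned = new AtomicBoolean(false);
    private volatile T value;

    /** Stores the value on the first call; any later call fails. */
    void set(T newValue) {
        // compareAndSet lets exactly one caller win, even under contention.
        if (!assigned.compareAndSet(false, true)) {
            throw new IllegalStateException("value was already set");
        }
        value = newValue;
    }

    /** Returns the stored value, or null if nothing has been stored yet. */
    T get() {
        return value;
    }

    public static void main(String[] args) {
        SetOnceSketch<String> writer = new SetOnceSketch<>();
        writer.set("writer-1");
        try {
            writer.set("writer-2"); // second assignment is rejected
        } catch (IllegalStateException e) {
            System.out.println("second set rejected: " + e.getMessage());
        }
        System.out.println(writer.get());
    }
}
```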

findMerges

public abstract MergePolicy.MergeSpecification findMerges(MergePolicy.MergeTrigger mergeTrigger,
                                                          SegmentInfos segmentInfos)
                                                   throws IOException
Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.

Parameters:
mergeTrigger - the event that triggered the merge
segmentInfos - the total set of segments in the index
Throws:
IOException

findForcedMerges

public abstract MergePolicy.MergeSpecification findForcedMerges(SegmentInfos segmentInfos,
                                                                int maxSegmentCount,
                                                                Map<SegmentCommitInfo,Boolean> segmentsToMerge)
                                                         throws IOException
Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge(int) method is called. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.

Parameters:
segmentInfos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge - contains the specific SegmentCommitInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentCommitInfo, that means this segment was an original segment present in the to-be-merged index; else, it was a segment produced by a cascaded merge.
Throws:
IOException

findForcedDeletesMerges

public abstract MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos)
                                                                throws IOException
Determine what set of merge operations is necessary in order to expunge all deletes from the index.

Parameters:
segmentInfos - the total set of segments in the index
Throws:
IOException

close

public abstract void close()
Release all resources for the policy.

Specified by:
close in interface Closeable

useCompoundFile

public boolean useCompoundFile(SegmentInfos infos,
                               SegmentCommitInfo mergedInfo)
                        throws IOException
Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true iff the size of the given mergedInfo is less than or equal to getMaxCFSSegmentSizeMB() and less than or equal to TotalIndexSize * getNoCFSRatio(); otherwise it returns false.

Throws:
IOException
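
The documented default rule combines an absolute size limit with a limit relative to the total index size. A minimal, Lucene-free sketch of that rule follows. CompoundFileDecisionSketch and its parameter names are hypothetical, and the sketch assumes all sizes are compared in bytes; it restates the description above rather than reproducing Lucene's source.

```java
/** Hypothetical, Lucene-free model of the documented default decision. */
final class CompoundFileDecisionSketch {

    /**
     * @param mergedSize        byte size of the new segment
     * @param totalIndexSize    total byte size of the index
     * @param maxCfsSegmentSize upper bound in bytes for a compound-file segment
     * @param noCfsRatio        largest fraction of the index that a
     *                          compound-file segment may occupy
     * @return true if both limits allow the compound file format
     */
    static boolean useCompoundFile(long mergedSize, long totalIndexSize,
                                   long maxCfsSegmentSize, double noCfsRatio) {
        boolean withinAbsoluteLimit = mergedSize <= maxCfsSegmentSize;
        boolean withinRelativeLimit = mergedSize <= totalIndexSize * noCfsRatio;
        return withinAbsoluteLimit && withinRelativeLimit;
    }

    public static void main(String[] args) {
        long total = 1_000L;
        // Small segment with both limits effectively disabled.
        System.out.println(useCompoundFile(100L, total, Long.MAX_VALUE, 1.0));
        // Segment is 60% of the index but the ratio allows at most 50%.
        System.out.println(useCompoundFile(600L, total, Long.MAX_VALUE, 0.5));
    }
}
```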

size

protected long size(SegmentCommitInfo info)
             throws IOException
Returns the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.

Throws:
IOException
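
Pro-rating means that a segment's on-disk size is scaled down by the share of its documents that are still live. The arithmetic can be sketched as follows. ProratedSizeSketch and its parameters are hypothetical, and treating a segment with no documents as unscaled is an assumption of this sketch.

```java
/** Hypothetical, Lucene-free model of a pro-rated segment size. */
final class ProratedSizeSketch {

    /**
     * Scales byteSize by the fraction of documents that are not deleted.
     * A segment with no documents is returned unscaled (an assumption).
     */
    static long proratedSize(long byteSize, int docCount, int deletedDocs) {
        if (docCount <= 0) {
            return byteSize;
        }
        double liveRatio = 1.0 - (double) deletedDocs / docCount;
        return (long) (byteSize * liveRatio);
    }

    public static void main(String[] args) {
        // 25 of 100 documents are deleted, so 75% of the bytes count.
        System.out.println(proratedSize(1_000L, 100, 25));
    }
}
```

Scaling by live documents lets a policy treat a heavily deleted segment as smaller than its file size suggests, since merging it will reclaim the space held by deleted documents.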

isMerged

protected final boolean isMerged(SegmentInfos infos,
                                 SegmentCommitInfo info)
                          throws IOException
Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting).

Throws:
IOException
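
The three conditions listed above must all hold at once. A Lucene-free sketch of that conjunction is shown below. IsMergedSketch and its parameters are hypothetical, and directories are modeled as plain strings purely for illustration.

```java
/** Hypothetical, Lucene-free model of the "already fully merged" check. */
final class IsMergedSketch {

    /**
     * @param hasPendingDeletes   whether the segment still carries deletes
     * @param segmentDir          directory holding the segment
     * @param writerDir           directory the writer operates on
     * @param segmentIsCompound   whether the segment uses the compound format
     * @param policyWantsCompound whether the policy would choose that format
     */
    static boolean isMerged(boolean hasPendingDeletes,
                            String segmentDir, String writerDir,
                            boolean segmentIsCompound, boolean policyWantsCompound) {
        boolean sameDirectory = segmentDir.equals(writerDir);
        boolean formatMatches = segmentIsCompound == policyWantsCompound;
        return !hasPendingDeletes && sameDirectory && formatMatches;
    }

    public static void main(String[] args) {
        // No deletes, same directory, matching format: nothing left to merge.
        System.out.println(isMerged(false, "/index", "/index", true, true));
        // Pending deletes alone are enough to require more work.
        System.out.println(isMerged(true, "/index", "/index", true, true));
    }
}
```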

getNoCFSRatio

public final double getNoCFSRatio()
Returns current noCFSRatio.

See Also:
setNoCFSRatio(double)

setNoCFSRatio

public final void setNoCFSRatio(double noCFSRatio)
If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled. Set to 1.0 to always use CFS regardless of merge size.


getMaxCFSSegmentSizeMB

public final double getMaxCFSSegmentSizeMB()
Returns the largest size allowed for a compound file segment.


setMaxCFSSegmentSizeMB

public final void setMaxCFSSegmentSizeMB(double v)
If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled. Set this to Double.POSITIVE_INFINITY (default) and noCFSRatio to 1.0 to always use CFS regardless of merge size.
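
The setter takes megabytes as a double while the maxCFSSegmentSize field is a long, so a conversion with clamping is implied. The sketch below shows one way to do it. MaxCfsSizeSketch is hypothetical, and both the 1024 * 1024 bytes-per-megabyte factor and the rejection of negative input are assumptions of this sketch.

```java
/** Hypothetical, Lucene-free model of the MB/byte conversion. */
final class MaxCfsSizeSketch {
    private static final double BYTES_PER_MB = 1024.0 * 1024.0; // assumed factor

    /** Converts megabytes to bytes, clamping very large values to Long.MAX_VALUE. */
    static long megabytesToBytes(double megabytes) {
        if (Double.isNaN(megabytes) || megabytes < 0.0) {
            throw new IllegalArgumentException("size must be >= 0, got " + megabytes);
        }
        double bytes = megabytes * BYTES_PER_MB;
        // Double.POSITIVE_INFINITY and anything else too large for a long
        // map to "no limit".
        return bytes >= Long.MAX_VALUE ? Long.MAX_VALUE : (long) bytes;
    }

    /** Converts bytes back to megabytes for reporting. */
    static double bytesToMegabytes(long bytes) {
        return bytes / BYTES_PER_MB;
    }

    public static void main(String[] args) {
        System.out.println(megabytesToBytes(1.5));
        System.out.println(megabytesToBytes(Double.POSITIVE_INFINITY) == Long.MAX_VALUE);
    }
}
```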



Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.