Class ConcurrentMergeScheduler
- java.lang.Object
-
- org.apache.lucene.index.MergeScheduler
-
- org.apache.lucene.index.ConcurrentMergeScheduler
-
- All Implemented Interfaces:
Closeable
,AutoCloseable
public class ConcurrentMergeScheduler extends MergeScheduler
AMergeScheduler
that runs each merge using a separate thread.Specify the max number of threads that may run at once, and the maximum number of simultaneous merges with
setMaxMergesAndThreads(int, int)
.If the number of merges exceeds the max number of threads then the largest merges are paused until one of the smaller merges completes.
If more than
getMaxMergeCount()
merges are requested then this class will forcefully throttle the incoming threads by pausing until one more merges complete.This class sets defaults based on Java's view of the cpu count, and it assumes a solid state disk (or similar). If you have a spinning disk and want to maximize performance, use
setDefaultMaxMergesAndThreads(boolean)
.
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description protected class
ConcurrentMergeScheduler.MergeThread
Runs a merge thread to execute a single merge, then exits.-
Nested classes/interfaces inherited from class org.apache.lucene.index.MergeScheduler
MergeScheduler.MergeSource
-
-
Field Summary
Fields Modifier and Type Field Description static int
AUTO_DETECT_MERGES_AND_THREADS
Dynamic default formaxThreadCount
andmaxMergeCount
, based on CPU core count.static String
DEFAULT_CPU_CORE_COUNT_PROPERTY
Used for testing.protected org.apache.lucene.index.ConcurrentMergeScheduler.CachedExecutor
intraMergeExecutor
The executor provided for intra-merge parallelizationprotected int
mergeThreadCount
How manyConcurrentMergeScheduler.MergeThread
s have kicked off (this is use to name them).protected List<ConcurrentMergeScheduler.MergeThread>
mergeThreads
List of currently activeConcurrentMergeScheduler.MergeThread
s.protected double
targetMBPerSec
Current IO writes throttle rate-
Fields inherited from class org.apache.lucene.index.MergeScheduler
infoStream
-
-
Constructor Summary
Constructors Constructor Description ConcurrentMergeScheduler()
Sole constructor, with all settings set to default values.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description void
close()
Close this MergeScheduler.void
disableAutoIOThrottle()
Turn off auto IO throttling.protected void
doMerge(MergeScheduler.MergeSource mergeSource, MergePolicy.OneMerge merge)
Does the actual merge, by callingMergeScheduler.MergeSource.merge(org.apache.lucene.index.MergePolicy.OneMerge)
protected void
doStall()
Called frommaybeStall(org.apache.lucene.index.MergeScheduler.MergeSource)
to pause the calling thread for a bit.void
enableAutoIOThrottle()
Turn on dynamic IO throttling, to adaptively rate limit writes bytes/sec to the minimal rate necessary so merges do not fall behind.boolean
getAutoIOThrottle()
Returns true if auto IO throttling is currently enabled.double
getForceMergeMBPerSec()
Get the per-merge IO throttle rate for forced merges.Executor
getIntraMergeExecutor(MergePolicy.OneMerge merge)
Provides an executor for parallelism during a single merge operation.double
getIORateLimitMBPerSec()
Returns the currently set per-merge IO writes rate limit, ifenableAutoIOThrottle()
was called, elseDouble.POSITIVE_INFINITY
.int
getMaxMergeCount()
int
getMaxThreadCount()
ReturnsmaxThreadCount
.protected ConcurrentMergeScheduler.MergeThread
getMergeThread(MergeScheduler.MergeSource mergeSource, MergePolicy.OneMerge merge)
Create and return a new MergeThreadprotected void
handleMergeException(Throwable exc)
Called when an exception is hit in a background merge threadprotected boolean
maybeStall(MergeScheduler.MergeSource mergeSource)
This is invoked bymerge(org.apache.lucene.index.MergeScheduler.MergeSource, org.apache.lucene.index.MergeTrigger)
to possibly stall the incoming thread when there are too many merges running or pending.void
merge(MergeScheduler.MergeSource mergeSource, MergeTrigger trigger)
Run the merges provided byMergeScheduler.MergeSource.getNextMerge()
.int
mergeThreadCount()
Returns the number of merge threads that are alive, ignoring the calling thread if it is a merge thread.void
setDefaultMaxMergesAndThreads(boolean spins)
Sets max merges and threads to proper defaults for rotational or non-rotational storage.void
setForceMergeMBPerSec(double v)
Set the per-merge IO throttle rate for forced merges (default:Double.POSITIVE_INFINITY
).void
setMaxMergesAndThreads(int maxMergeCount, int maxThreadCount)
Expert: directly set the maximum number of merge threads and simultaneous merges allowed.void
sync()
Wait for any running merge threads to finish.protected void
targetMBPerSecChanged()
Subclass can override to tweak targetMBPerSec.String
toString()
protected void
updateMergeThreads()
Called whenever the running merges have changed, to set merge IO limits.Directory
wrapForMerge(MergePolicy.OneMerge merge, Directory in)
Wraps the incomingDirectory
so that we can merge-throttle it usingRateLimitedIndexOutput
.-
Methods inherited from class org.apache.lucene.index.MergeScheduler
message, verbose
-
-
-
-
Field Detail
-
AUTO_DETECT_MERGES_AND_THREADS
public static final int AUTO_DETECT_MERGES_AND_THREADS
Dynamic default formaxThreadCount
andmaxMergeCount
, based on CPU core count.maxThreadCount
is set tomax(1, min(4, cpuCoreCount/2))
.maxMergeCount
is set tomaxThreadCount + 5
.- See Also:
- Constant Field Values
-
DEFAULT_CPU_CORE_COUNT_PROPERTY
public static final String DEFAULT_CPU_CORE_COUNT_PROPERTY
Used for testing.- See Also:
- Constant Field Values
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
-
mergeThreads
protected final List<ConcurrentMergeScheduler.MergeThread> mergeThreads
List of currently activeConcurrentMergeScheduler.MergeThread
s.
-
mergeThreadCount
protected int mergeThreadCount
How manyConcurrentMergeScheduler.MergeThread
s have kicked off (this is use to name them).
-
targetMBPerSec
protected double targetMBPerSec
Current IO writes throttle rate
-
intraMergeExecutor
protected org.apache.lucene.index.ConcurrentMergeScheduler.CachedExecutor intraMergeExecutor
The executor provided for intra-merge parallelization
-
-
Method Detail
-
setMaxMergesAndThreads
public void setMaxMergesAndThreads(int maxMergeCount, int maxThreadCount)
Expert: directly set the maximum number of merge threads and simultaneous merges allowed.- Parameters:
maxMergeCount
- the max # simultaneous merges that are allowed. If a merge is necessary yet we already have this many threads running, the incoming thread (that is calling add/updateDocument) will block until a merge thread has completed. Note that we will only run the smallestmaxThreadCount
merges at a time.maxThreadCount
- the max # simultaneous merge threads that should be running at once. This must be <=maxMergeCount
-
setDefaultMaxMergesAndThreads
public void setDefaultMaxMergesAndThreads(boolean spins)
Sets max merges and threads to proper defaults for rotational or non-rotational storage.- Parameters:
spins
- true to set defaults best for traditional rotatational storage (spinning disks), else false (e.g. for solid-state disks)
-
setForceMergeMBPerSec
public void setForceMergeMBPerSec(double v)
Set the per-merge IO throttle rate for forced merges (default:Double.POSITIVE_INFINITY
).
-
getForceMergeMBPerSec
public double getForceMergeMBPerSec()
Get the per-merge IO throttle rate for forced merges.
-
enableAutoIOThrottle
public void enableAutoIOThrottle()
Turn on dynamic IO throttling, to adaptively rate limit writes bytes/sec to the minimal rate necessary so merges do not fall behind. By default this is enabled.
-
disableAutoIOThrottle
public void disableAutoIOThrottle()
Turn off auto IO throttling.- See Also:
enableAutoIOThrottle()
-
getAutoIOThrottle
public boolean getAutoIOThrottle()
Returns true if auto IO throttling is currently enabled.
-
getIORateLimitMBPerSec
public double getIORateLimitMBPerSec()
Returns the currently set per-merge IO writes rate limit, ifenableAutoIOThrottle()
was called, elseDouble.POSITIVE_INFINITY
.
-
getMaxThreadCount
public int getMaxThreadCount()
ReturnsmaxThreadCount
.- See Also:
setMaxMergesAndThreads(int, int)
-
getMaxMergeCount
public int getMaxMergeCount()
-
getIntraMergeExecutor
public Executor getIntraMergeExecutor(MergePolicy.OneMerge merge)
Description copied from class:MergeScheduler
Provides an executor for parallelism during a single merge operation. By default, the method returns aSameThreadExecutorService
where all intra-merge actions occur in their calling thread.- Overrides:
getIntraMergeExecutor
in classMergeScheduler
-
wrapForMerge
public Directory wrapForMerge(MergePolicy.OneMerge merge, Directory in)
Description copied from class:MergeScheduler
Wraps the incomingDirectory
so that we can merge-throttle it usingRateLimitedIndexOutput
.- Overrides:
wrapForMerge
in classMergeScheduler
-
updateMergeThreads
protected void updateMergeThreads()
Called whenever the running merges have changed, to set merge IO limits. This method sorts the merge threads by their merge size in descending order and then pauses/unpauses threads from first to last -- that way, smaller merges are guaranteed to run before larger ones.
-
close
public void close() throws IOException
Description copied from class:MergeScheduler
Close this MergeScheduler.- Specified by:
close
in interfaceAutoCloseable
- Specified by:
close
in interfaceCloseable
- Overrides:
close
in classMergeScheduler
- Throws:
IOException
-
sync
public void sync()
Wait for any running merge threads to finish. This call is not interruptible as used byclose()
.
-
mergeThreadCount
public int mergeThreadCount()
Returns the number of merge threads that are alive, ignoring the calling thread if it is a merge thread. Note that this number is ≤mergeThreads
size.- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
-
merge
public void merge(MergeScheduler.MergeSource mergeSource, MergeTrigger trigger) throws IOException
Description copied from class:MergeScheduler
Run the merges provided byMergeScheduler.MergeSource.getNextMerge()
.- Specified by:
merge
in classMergeScheduler
- Parameters:
mergeSource
- theIndexWriter
to obtain the merges from.trigger
- theMergeTrigger
that caused this merge to happen- Throws:
IOException
-
maybeStall
protected boolean maybeStall(MergeScheduler.MergeSource mergeSource)
This is invoked bymerge(org.apache.lucene.index.MergeScheduler.MergeSource, org.apache.lucene.index.MergeTrigger)
to possibly stall the incoming thread when there are too many merges running or pending. The default behavior is to force this thread, which is producing too many segments for merging to keep up, to wait until merges catch up. Applications that can take other less drastic measures, such as limiting how many threads are allowed to index, can do nothing here and throttle elsewhere.If this method wants to stall but the calling thread is a merge thread, it should return false to tell caller not to kick off any new merges.
-
doStall
protected void doStall()
Called frommaybeStall(org.apache.lucene.index.MergeScheduler.MergeSource)
to pause the calling thread for a bit.
-
doMerge
protected void doMerge(MergeScheduler.MergeSource mergeSource, MergePolicy.OneMerge merge) throws IOException
Does the actual merge, by callingMergeScheduler.MergeSource.merge(org.apache.lucene.index.MergePolicy.OneMerge)
- Throws:
IOException
-
getMergeThread
protected ConcurrentMergeScheduler.MergeThread getMergeThread(MergeScheduler.MergeSource mergeSource, MergePolicy.OneMerge merge) throws IOException
Create and return a new MergeThread- Throws:
IOException
-
handleMergeException
protected void handleMergeException(Throwable exc)
Called when an exception is hit in a background merge thread
-
targetMBPerSecChanged
protected void targetMBPerSecChanged()
Subclass can override to tweak targetMBPerSec.
-
-