Class LiveIndexWriterConfig
- java.lang.Object
  - org.apache.lucene.index.LiveIndexWriterConfig
- Direct Known Subclasses: IndexWriterConfig
public class LiveIndexWriterConfig extends Object
Holds all the configuration used by IndexWriter, with a few setters for settings that can be changed on an IndexWriter instance "live".
- Since: 4.0
Field Summary
- protected boolean checkPendingFlushOnUpdate: Whether an indexing thread should check for pending flushes on update in order to help out on a full flush.
- protected Codec codec: Codec used to write new segments.
- protected IndexCommit commit: IndexCommit that IndexWriter is opened on.
- protected boolean commitOnClose: True if calls to IndexWriter.close() should first do a commit.
- protected int createdVersionMajor: Compatibility version to use for this index.
- protected IndexDeletionPolicy delPolicy: IndexDeletionPolicy controlling when commit points are deleted.
- protected IndexWriterEventListener eventListener: The IndexWriter event listener to record key events.
- protected org.apache.lucene.index.FlushPolicy flushPolicy: FlushPolicy to control when segments are flushed.
- protected Sort indexSort: The sort order to use to write merged segments.
- protected Set<String> indexSortFields: The field names involved in the index sort.
- protected InfoStream infoStream: InfoStream for debugging messages.
- protected Comparator<LeafReader> leafSorter: The comparator for sorting leaf readers.
- protected long maxFullFlushMergeWaitMillis: Amount of time to wait for merges returned by MergePolicy.findFullFlushMerges(...).
- protected MergePolicy mergePolicy: MergePolicy for selecting merges.
- protected MergeScheduler mergeScheduler: MergeScheduler to use for running merges.
- protected IndexWriterConfig.OpenMode openMode: IndexWriterConfig.OpenMode that IndexWriter is opened with.
- protected String parentField: The parent document field name.
- protected int perThreadHardLimitMB: Sets the hard upper bound on RAM usage for a single segment, after which the segment is forced to flush.
- protected boolean readerPooling: True if readers should be pooled.
- protected Similarity similarity: Similarity to use when encoding norms.
- protected String softDeletesField: The soft deletes field.
- protected boolean useCompoundFile: True if segment flushes should use the compound file format.
-
Method Summary
- Analyzer getAnalyzer(): Returns the default analyzer to use for indexing documents.
- Codec getCodec(): Returns the current Codec.
- boolean getCommitOnClose(): Returns true if IndexWriter.close() should first commit before closing.
- IndexCommit getIndexCommit(): Returns the IndexCommit as specified in IndexWriterConfig.setIndexCommit(IndexCommit) or the default, null, which specifies to open the latest index commit point.
- int getIndexCreatedVersionMajor(): Returns the compatibility version to use for this index.
- IndexDeletionPolicy getIndexDeletionPolicy(): Returns the IndexDeletionPolicy specified in IndexWriterConfig.setIndexDeletionPolicy(IndexDeletionPolicy) or the default KeepOnlyLastCommitDeletionPolicy.
- Sort getIndexSort(): Gets the index-time Sort order, applied to all (flushed and merged) segments.
- Set<String> getIndexSortFields(): Returns the field names involved in the index sort.
- IndexWriterEventListener getIndexWriterEventListener(): Returns the IndexWriterEventListener callback that tracks the key IndexWriter operations.
- InfoStream getInfoStream(): Returns the InfoStream used for debugging.
- Comparator<LeafReader> getLeafSorter(): Returns a comparator for sorting leaf readers.
- int getMaxBufferedDocs(): Returns the number of buffered added documents that will trigger a flush if enabled.
- long getMaxFullFlushMergeWaitMillis(): Expert: returns the amount of time to wait for merges returned by MergePolicy.findFullFlushMerges(...).
- IndexWriter.IndexReaderWarmer getMergedSegmentWarmer(): Returns the current merged segment warmer.
- MergePolicy getMergePolicy(): Returns the current MergePolicy in use by this writer.
- MergeScheduler getMergeScheduler(): Returns the MergeScheduler that was set by IndexWriterConfig.setMergeScheduler(MergeScheduler).
- IndexWriterConfig.OpenMode getOpenMode(): Returns the IndexWriterConfig.OpenMode set by IndexWriterConfig.setOpenMode(OpenMode).
- String getParentField(): Returns the parent document field name if configured.
- double getRAMBufferSizeMB(): Returns the value set by setRAMBufferSizeMB(double) if enabled.
- int getRAMPerThreadHardLimitMB(): Returns the max amount of memory each DocumentsWriterPerThread can consume until forcefully flushed.
- boolean getReaderPooling(): Returns true if IndexWriter should pool readers even if DirectoryReader.open(IndexWriter) has not been called.
- Similarity getSimilarity(): Expert: returns the Similarity implementation used by this IndexWriter.
- String getSoftDeletesField(): Returns the soft deletes field or null if soft-deletes are disabled.
- boolean getUseCompoundFile(): Returns true iff the IndexWriter packs newly written segments in a compound file.
- boolean isCheckPendingFlushOnUpdate(): Expert: returns whether indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk.
- LiveIndexWriterConfig setCheckPendingFlushUpdate(boolean checkPendingFlushOnUpdate): Expert: sets whether indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk.
- LiveIndexWriterConfig setMaxBufferedDocs(int maxBufferedDocs): Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment.
- LiveIndexWriterConfig setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer mergeSegmentWarmer): Sets the merged segment warmer.
- LiveIndexWriterConfig setMergePolicy(MergePolicy mergePolicy): Expert: MergePolicy is invoked whenever there are changes to the segments in the index.
- LiveIndexWriterConfig setRAMBufferSizeMB(double ramBufferSizeMB): Determines the amount of RAM that may be used for buffering added documents and deletions before they are flushed to the Directory.
- LiveIndexWriterConfig setUseCompoundFile(boolean useCompoundFile): Sets whether the IndexWriter should pack newly written segments in a compound file.
- String toString()
-
-
-
Field Detail
-
delPolicy
protected volatile IndexDeletionPolicy delPolicy
IndexDeletionPolicy controlling when commit points are deleted.
-
commit
protected volatile IndexCommit commit
IndexCommit that IndexWriter is opened on.
-
openMode
protected volatile IndexWriterConfig.OpenMode openMode
IndexWriterConfig.OpenMode that IndexWriter is opened with.
-
createdVersionMajor
protected int createdVersionMajor
Compatibility version to use for this index.
-
similarity
protected volatile Similarity similarity
Similarity to use when encoding norms.
-
mergeScheduler
protected volatile MergeScheduler mergeScheduler
MergeScheduler to use for running merges.
-
infoStream
protected volatile InfoStream infoStream
InfoStream for debugging messages.
-
mergePolicy
protected volatile MergePolicy mergePolicy
MergePolicy for selecting merges.
-
readerPooling
protected volatile boolean readerPooling
True if readers should be pooled.
-
flushPolicy
protected volatile org.apache.lucene.index.FlushPolicy flushPolicy
FlushPolicy to control when segments are flushed.
-
perThreadHardLimitMB
protected volatile int perThreadHardLimitMB
Sets the hard upper bound on RAM usage for a single segment, after which the segment is forced to flush.
-
useCompoundFile
protected volatile boolean useCompoundFile
True if segment flushes should use the compound file format.
-
commitOnClose
protected boolean commitOnClose
True if calls to IndexWriter.close() should first do a commit.
-
indexSort
protected Sort indexSort
The sort order to use to write merged segments.
-
leafSorter
protected Comparator<LeafReader> leafSorter
The comparator for sorting leaf readers.
-
parentField
protected String parentField
The parent document field name.
-
checkPendingFlushOnUpdate
protected volatile boolean checkPendingFlushOnUpdate
Whether an indexing thread should check for pending flushes on update in order to help out on a full flush.
-
softDeletesField
protected String softDeletesField
The soft deletes field.
-
maxFullFlushMergeWaitMillis
protected volatile long maxFullFlushMergeWaitMillis
The amount of time to wait for merges returned by MergePolicy.findFullFlushMerges(...).
-
eventListener
protected IndexWriterEventListener eventListener
The IndexWriter event listener to record key events.
-
-
Method Detail
-
getAnalyzer
public Analyzer getAnalyzer()
Returns the default analyzer to use for indexing documents.
-
setRAMBufferSizeMB
public LiveIndexWriterConfig setRAMBufferSizeMB(double ramBufferSizeMB)
Determines the amount of RAM that may be used for buffering added documents and deletions before they are flushed to the Directory. Generally, for faster indexing performance it is best to flush by RAM usage instead of document count and to use as large a RAM buffer as you can.
When this is set, the writer will flush whenever buffered documents and deletions use this much RAM. Pass in IndexWriterConfig.DISABLE_AUTO_FLUSH to prevent triggering a flush due to RAM usage. Note that if flushing by document count is also enabled, then the flush will be triggered by whichever comes first.
The maximum RAM limit is inherently determined by the JVM's available memory. Yet an IndexWriter session can consume a significantly larger amount of memory than the given RAM limit, since this limit is only an indicator of when to flush memory-resident documents to the Directory. Flushes are likely to happen concurrently while other threads are adding documents to the writer. For application stability, the available memory in the JVM should be significantly larger than the RAM buffer used for indexing.
NOTE: the accounting of RAM usage for pending deletions is only approximate. Specifically, if you delete by Query, Lucene currently has no way to measure the RAM usage of individual Queries, so the accounting will under-estimate and you should compensate by calling either commit() or refresh() periodically yourself.
NOTE: It is not guaranteed that all memory-resident documents are flushed once this limit is exceeded. Depending on the configured FlushPolicy, only a subset of the buffered documents is flushed, and therefore only part of the RAM buffer is released.
The default value is IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB.
Takes effect immediately, but only the next time a document is added, updated or deleted.
- Throws: IllegalArgumentException - if ramBufferSize is enabled but non-positive, or it disables ramBufferSize when maxBufferedDocs is already disabled
- See Also: IndexWriterConfig.setRAMPerThreadHardLimitMB(int)
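The "whichever comes first" rule for the RAM and document-count triggers can be illustrated with a small, self-contained model. This is only a sketch and not Lucene's flush logic: the class FlushTriggerModel and its method are hypothetical, the comparison operators are an assumption, and the sentinel value -1 is assumed to mirror IndexWriterConfig.DISABLE_AUTO_FLUSH.

```java
/** Simplified, hypothetical model of the two flush triggers; not Lucene's FlushPolicy. */
final class FlushTriggerModel {
  /** Sentinel meaning "this trigger is off" (assumed to mirror IndexWriterConfig.DISABLE_AUTO_FLUSH). */
  static final int DISABLE_AUTO_FLUSH = -1;

  private final double ramBufferSizeMB;
  private final int maxBufferedDocs;

  FlushTriggerModel(double ramBufferSizeMB, int maxBufferedDocs) {
    this.ramBufferSizeMB = ramBufferSizeMB;
    this.maxBufferedDocs = maxBufferedDocs;
  }

  /** Returns true as soon as either enabled limit is reached, whichever comes first. */
  boolean shouldFlush(double bufferedMB, int bufferedDocs) {
    boolean ramTriggered =
        ramBufferSizeMB != DISABLE_AUTO_FLUSH && bufferedMB >= ramBufferSizeMB;
    boolean docsTriggered =
        maxBufferedDocs != DISABLE_AUTO_FLUSH && bufferedDocs >= maxBufferedDocs;
    return ramTriggered || docsTriggered;
  }
}
```

With both limits enabled, a handful of very large documents trips the RAM trigger, while many tiny documents trip the count trigger. Disabling one limit leaves only the other in play.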
-
getRAMBufferSizeMB
public double getRAMBufferSizeMB()
Returns the value set by setRAMBufferSizeMB(double) if enabled.
-
setMaxBufferedDocs
public LiveIndexWriterConfig setMaxBufferedDocs(int maxBufferedDocs)
Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment. Large values generally give faster indexing.
When this is set, the writer will flush every maxBufferedDocs added documents. Pass in IndexWriterConfig.DISABLE_AUTO_FLUSH to prevent triggering a flush due to the number of buffered documents. Note that if flushing by RAM usage is also enabled, then the flush will be triggered by whichever comes first.
Disabled by default (the writer flushes by RAM usage).
Takes effect immediately, but only the next time a document is added, updated or deleted.
- Throws: IllegalArgumentException - if maxBufferedDocs is enabled but smaller than 2, or it disables maxBufferedDocs when ramBufferSize is already disabled
- See Also: setRAMBufferSizeMB(double)
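The documented argument checks for setRAMBufferSizeMB(double) and setMaxBufferedDocs(int) depend on each other, because at least one flush trigger has to stay enabled. The sketch below mirrors only those documented rules in a hypothetical stand-in class, FlushSettingsModel. It is not Lucene's code: the sentinel -1 is assumed to match IndexWriterConfig.DISABLE_AUTO_FLUSH, and the initial RAM buffer of 16.0 MB is an assumed stand-in for IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB.

```java
/** Hypothetical stand-in that mirrors only the documented argument checks. */
final class FlushSettingsModel {
  // Assumed to match IndexWriterConfig.DISABLE_AUTO_FLUSH.
  static final int DISABLE_AUTO_FLUSH = -1;

  // Assumed default; the documentation only names DEFAULT_RAM_BUFFER_SIZE_MB.
  private double ramBufferSizeMB = 16.0;
  // Documented: flushing by document count is disabled by default.
  private int maxBufferedDocs = DISABLE_AUTO_FLUSH;

  FlushSettingsModel setRAMBufferSizeMB(double ramBufferSizeMB) {
    if (ramBufferSizeMB != DISABLE_AUTO_FLUSH && ramBufferSizeMB <= 0.0) {
      throw new IllegalArgumentException("ramBufferSize must be positive when enabled");
    }
    if (ramBufferSizeMB == DISABLE_AUTO_FLUSH && maxBufferedDocs == DISABLE_AUTO_FLUSH) {
      throw new IllegalArgumentException("ramBufferSize and maxBufferedDocs cannot both be disabled");
    }
    this.ramBufferSizeMB = ramBufferSizeMB;
    return this;
  }

  FlushSettingsModel setMaxBufferedDocs(int maxBufferedDocs) {
    if (maxBufferedDocs != DISABLE_AUTO_FLUSH && maxBufferedDocs < 2) {
      throw new IllegalArgumentException("maxBufferedDocs must be at least 2 when enabled");
    }
    if (maxBufferedDocs == DISABLE_AUTO_FLUSH && ramBufferSizeMB == DISABLE_AUTO_FLUSH) {
      throw new IllegalArgumentException("ramBufferSize and maxBufferedDocs cannot both be disabled");
    }
    this.maxBufferedDocs = maxBufferedDocs;
    return this;
  }

  double getRAMBufferSizeMB() {
    return ramBufferSizeMB;
  }

  int getMaxBufferedDocs() {
    return maxBufferedDocs;
  }
}
```

In this model, switching from RAM-based to count-based flushing has to enable the document count first and disable the RAM trigger second. The reverse order is rejected because it would briefly leave both triggers off.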
-
getMaxBufferedDocs
public int getMaxBufferedDocs()
Returns the number of buffered added documents that will trigger a flush if enabled.
- See Also: setMaxBufferedDocs(int)
-
setMergePolicy
public LiveIndexWriterConfig setMergePolicy(MergePolicy mergePolicy)
Expert: MergePolicy is invoked whenever there are changes to the segments in the index. Its role is to select which merges to do, if any, and return a MergePolicy.MergeSpecification describing the merges. It also selects merges to do for forceMerge.
Takes effect on subsequent merge selections. Any merges in flight or any merges already registered by the previous MergePolicy are not affected.
-
setMergedSegmentWarmer
public LiveIndexWriterConfig setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer mergeSegmentWarmer)
Sets the merged segment warmer. See IndexWriter.IndexReaderWarmer.
Takes effect on the next merge.
-
getMergedSegmentWarmer
public IndexWriter.IndexReaderWarmer getMergedSegmentWarmer()
Returns the current merged segment warmer. See IndexWriter.IndexReaderWarmer.
-
getOpenMode
public IndexWriterConfig.OpenMode getOpenMode()
Returns the IndexWriterConfig.OpenMode set by IndexWriterConfig.setOpenMode(OpenMode).
-
getIndexCreatedVersionMajor
public int getIndexCreatedVersionMajor()
Return the compatibility version to use for this index.
-
getIndexDeletionPolicy
public IndexDeletionPolicy getIndexDeletionPolicy()
Returns the IndexDeletionPolicy specified in IndexWriterConfig.setIndexDeletionPolicy(IndexDeletionPolicy) or the default KeepOnlyLastCommitDeletionPolicy.
-
getIndexCommit
public IndexCommit getIndexCommit()
Returns the IndexCommit as specified in IndexWriterConfig.setIndexCommit(IndexCommit) or the default, null, which specifies to open the latest index commit point.
-
getSimilarity
public Similarity getSimilarity()
Expert: returns the Similarity implementation used by this IndexWriter.
-
getMergeScheduler
public MergeScheduler getMergeScheduler()
Returns the MergeScheduler that was set by IndexWriterConfig.setMergeScheduler(MergeScheduler).
-
getMergePolicy
public MergePolicy getMergePolicy()
Returns the current MergePolicy in use by this writer.
-
getReaderPooling
public boolean getReaderPooling()
Returns true if IndexWriter should pool readers even if DirectoryReader.open(IndexWriter) has not been called.
-
getRAMPerThreadHardLimitMB
public int getRAMPerThreadHardLimitMB()
Returns the max amount of memory each DocumentsWriterPerThread can consume until forcefully flushed.
-
getInfoStream
public InfoStream getInfoStream()
Returns the InfoStream used for debugging.
-
setUseCompoundFile
public LiveIndexWriterConfig setUseCompoundFile(boolean useCompoundFile)
Sets whether the IndexWriter should pack newly written segments in a compound file. Default is true.
Use false for batch indexing with very large RAM buffer settings.
Note: To control compound file usage during segment merges see MergePolicy.setNoCFSRatio(double) and MergePolicy.setMaxCFSSegmentSizeMB(double). This setting only applies to newly created segments.
-
getUseCompoundFile
public boolean getUseCompoundFile()
Returns true iff the IndexWriter packs newly written segments in a compound file.
-
getCommitOnClose
public boolean getCommitOnClose()
Returns true if IndexWriter.close() should first commit before closing.
-
getIndexSort
public Sort getIndexSort()
Gets the index-time Sort order, applied to all (flushed and merged) segments.
-
getIndexSortFields
public Set<String> getIndexSortFields()
Returns the field names involved in the index sort.
-
getLeafSorter
public Comparator<LeafReader> getLeafSorter()
Returns a comparator for sorting leaf readers. If not null, this comparator is used to sort leaf readers within a DirectoryReader opened from the IndexWriter of this configuration.
- Returns: a comparator for sorting leaf readers
-
isCheckPendingFlushOnUpdate
public boolean isCheckPendingFlushOnUpdate()
Expert: returns whether indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
setCheckPendingFlushUpdate
public LiveIndexWriterConfig setCheckPendingFlushUpdate(boolean checkPendingFlushOnUpdate)
Expert: sets whether indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk. If set to false, threads calling DirectoryReader.openIfChanged(DirectoryReader, IndexWriter) or IndexWriter.flush() will be the only threads writing segments to disk unless flushes are falling behind. If indexing is stalled due to too many pending flushes, indexing threads will help out writing pending segment flushes to disk.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
getSoftDeletesField
public String getSoftDeletesField()
Returns the soft deletes field or null if soft-deletes are disabled. See IndexWriterConfig.setSoftDeletesField(String) for details.
-
getMaxFullFlushMergeWaitMillis
public long getMaxFullFlushMergeWaitMillis()
Expert: returns the amount of time to wait for merges returned by MergePolicy.findFullFlushMerges(...). If this time is reached, we proceed with the commit based on segments merged up to that point. The merges are not cancelled, and may still run to completion independent of the commit.
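The bounded wait described here (wait up to a limit, then proceed without cancelling the merge) can be sketched with plain java.util.concurrent types. This is an illustration of the general pattern only, not Lucene's commit code: the class BoundedMergeWait and the method awaitMerge are hypothetical, and a merge is represented by an ordinary Future.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper illustrating a bounded wait; not Lucene's commit logic. */
final class BoundedMergeWait {
  /**
   * Waits up to maxWaitMillis for the merge to finish.
   * Returns true if it finished in time, false if the caller should proceed without it.
   * The merge is never cancelled here, so it may still complete later.
   */
  static boolean awaitMerge(Future<?> merge, long maxWaitMillis) {
    try {
      merge.get(maxWaitMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      // Time is up: the caller commits what has been merged so far.
      return false;
    } catch (InterruptedException e) {
      // Preserve the interrupt for the caller and stop waiting.
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      throw new IllegalStateException("merge failed", e.getCause());
    }
  }
}
```

A caller in this sketch would commit as soon as awaitMerge returns, whatever the result. A false result only means the merge's output is not part of that commit.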
-
getIndexWriterEventListener
public IndexWriterEventListener getIndexWriterEventListener()
Returns the IndexWriterEventListener callback that tracks the key IndexWriter operations.
-
getParentField
public String getParentField()
Returns the parent document field name if configured.
-
-