Class LiveIndexWriterConfig

java.lang.Object
org.apache.lucene.index.LiveIndexWriterConfig
Direct Known Subclasses:
IndexWriterConfig

public class LiveIndexWriterConfig extends Object
Holds all the configuration used by IndexWriter, with a few setters for settings that can be changed "live" on an IndexWriter instance.
Since:
4.0
  • Field Details

    • delPolicy

      protected volatile IndexDeletionPolicy delPolicy
      IndexDeletionPolicy controlling when commit points are deleted.
    • commit

      protected volatile IndexCommit commit
      IndexCommit that IndexWriter is opened on.
    • openMode

      protected volatile IndexWriterConfig.OpenMode openMode
    • createdVersionMajor

      protected int createdVersionMajor
      Compatibility version to use for this index.
    • similarity

      protected volatile Similarity similarity
      Similarity to use when encoding norms.
    • mergeScheduler

      protected volatile MergeScheduler mergeScheduler
      MergeScheduler to use for running merges.
    • codec

      protected volatile Codec codec
      Codec used to write new segments.
    • infoStream

      protected volatile InfoStream infoStream
      InfoStream for debugging messages.
    • mergePolicy

      protected volatile MergePolicy mergePolicy
      MergePolicy for selecting merges.
    • readerPooling

      protected volatile boolean readerPooling
      True if readers should be pooled.
    • flushPolicy

      protected volatile org.apache.lucene.index.FlushPolicy flushPolicy
      FlushPolicy to control when segments are flushed.
    • perThreadHardLimitMB

      protected volatile int perThreadHardLimitMB
      Sets the hard upper bound on RAM usage for a single segment, after which the segment is forced to flush.
    • useCompoundFile

      protected volatile boolean useCompoundFile
      True if segment flushes should use the compound file format.
    • commitOnClose

      protected boolean commitOnClose
      True if calls to IndexWriter.close() should first do a commit.
    • indexSort

      protected Sort indexSort
      The sort order to use to write merged segments.
    • leafSorter

      protected Comparator<LeafReader> leafSorter
      The comparator for sorting leaf readers.
    • indexSortFields

      protected Set<String> indexSortFields
      The field names involved in the index sort.
    • checkPendingFlushOnUpdate

      protected volatile boolean checkPendingFlushOnUpdate
      True if an indexing thread should check for pending flushes on update in order to help out on a full flush.
    • softDeletesField

      protected String softDeletesField
      The soft deletes field, or null if soft deletes are disabled.
    • maxFullFlushMergeWaitMillis

      protected volatile long maxFullFlushMergeWaitMillis
      Amount of time, in milliseconds, to wait for merges returned by MergePolicy.findFullFlushMerges(...).
    • eventListener

      protected IndexWriterEventListener eventListener
      The IndexWriter event listener to record key events.
  • Method Details

    • getAnalyzer

      public Analyzer getAnalyzer()
      Returns the default analyzer to use for indexing documents.
    • setRAMBufferSizeMB

      public LiveIndexWriterConfig setRAMBufferSizeMB(double ramBufferSizeMB)
      Determines the amount of RAM that may be used for buffering added documents and deletions before they are flushed to the Directory. Generally for faster indexing performance it's best to flush by RAM usage instead of document count and use as large a RAM buffer as you can.

      When this is set, the writer will flush whenever buffered documents and deletions use this much RAM. Pass in IndexWriterConfig.DISABLE_AUTO_FLUSH to prevent triggering a flush due to RAM usage. Note that if flushing by document count is also enabled, then the flush will be triggered by whichever comes first.

      The maximum RAM limit is inherently determined by the JVM's available memory. However, an IndexWriter session can consume significantly more memory than the given RAM limit, since this limit is only an indicator of when to flush memory-resident documents to the Directory. Flushes are likely to happen concurrently while other threads are adding documents to the writer. For application stability, the available memory in the JVM should be significantly larger than the RAM buffer used for indexing.

      NOTE: the accounting of RAM usage for pending deletions is only approximate. Specifically, if you delete by Query, Lucene currently has no way to measure the RAM usage of individual Queries, so the accounting will underestimate; you should compensate by calling commit() or refresh() periodically yourself.

      NOTE: It is not guaranteed that all memory-resident documents are flushed once this limit is exceeded. Depending on the configured FlushPolicy, only a subset of the buffered documents may be flushed, and therefore only part of the RAM buffer is released.

      The default value is IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB.

      Takes effect immediately, but only the next time a document is added, updated or deleted.

      Throws:
      IllegalArgumentException - if ramBufferSize is enabled but non-positive, or it disables ramBufferSize when maxBufferedDocs is already disabled
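      The "whichever comes first" rule described above can be modeled in a few lines of plain Java. This is an illustrative sketch only, not Lucene's FlushPolicy implementation: the class FlushTriggerSketch, its method names, and the use of -1 as the DISABLE_AUTO_FLUSH sentinel are assumptions made for the example. The fields are volatile to mirror how a "live" change becomes visible the next time a document is added, updated or deleted.

```java
/** Illustrative model only; NOT Lucene's FlushPolicy. All names here are hypothetical. */
final class FlushTriggerSketch {
    /** Stand-in for IndexWriterConfig.DISABLE_AUTO_FLUSH (assumed to be -1 for this sketch). */
    static final int DISABLE_AUTO_FLUSH = -1;

    // volatile: a live change is seen by indexing threads on their next update.
    private volatile double ramBufferSizeMB;
    private volatile int maxBufferedDocs;

    FlushTriggerSketch(double ramBufferSizeMB, int maxBufferedDocs) {
        this.ramBufferSizeMB = ramBufferSizeMB;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    void setRamBufferSizeMB(double ramBufferSizeMB) {
        this.ramBufferSizeMB = ramBufferSizeMB;
    }

    /**
     * Checked after each add/update/delete in this model: flush as soon as
     * either enabled limit is reached, whichever comes first.
     */
    boolean shouldFlush(double bufferedMB, int bufferedDocs) {
        double ramLimit = ramBufferSizeMB;   // read each volatile once
        int docLimit = maxBufferedDocs;
        boolean byRam = ramLimit != DISABLE_AUTO_FLUSH && bufferedMB >= ramLimit;
        boolean byDocs = docLimit != DISABLE_AUTO_FLUSH && bufferedDocs >= docLimit;
        return byRam || byDocs;
    }
}
```

      With only the RAM trigger enabled, the document count is ignored; with both enabled, reaching either limit is enough to request a flush.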
    • getRAMBufferSizeMB

      public double getRAMBufferSizeMB()
      Returns the value set by setRAMBufferSizeMB(double) if enabled.
    • setMaxBufferedDocs

      public LiveIndexWriterConfig setMaxBufferedDocs(int maxBufferedDocs)
      Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment. Large values generally give faster indexing.

      When this is set, the writer will flush every maxBufferedDocs added documents. Pass in IndexWriterConfig.DISABLE_AUTO_FLUSH to prevent triggering a flush due to number of buffered documents. Note that if flushing by RAM usage is also enabled, then the flush will be triggered by whichever comes first.

      Disabled by default (writer flushes by RAM usage).

      Takes effect immediately, but only the next time a document is added, updated or deleted.

      Throws:
      IllegalArgumentException - if maxBufferedDocs is enabled but smaller than 2, or it disables maxBufferedDocs when ramBufferSize is already disabled
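      The argument checks documented for setRAMBufferSizeMB and setMaxBufferedDocs can be sketched as follows. This is a minimal stand-alone model, not the library's source: FlushSettingsSketch and its exception messages are hypothetical, and the values -1 for DISABLE_AUTO_FLUSH and 16.0 for the default RAM buffer are assumptions made for the example.

```java
/** Illustrative model of the documented validation rules; names and messages are hypothetical. */
final class FlushSettingsSketch {
    /** Stand-in for IndexWriterConfig.DISABLE_AUTO_FLUSH (assumed to be -1 for this sketch). */
    static final int DISABLE_AUTO_FLUSH = -1;

    private volatile double ramBufferSizeMB = 16.0;            // assumed default
    private volatile int maxBufferedDocs = DISABLE_AUTO_FLUSH; // disabled by default

    FlushSettingsSketch setRAMBufferSizeMB(double mb) {
        if (mb != DISABLE_AUTO_FLUSH && mb <= 0.0) {
            throw new IllegalArgumentException("ramBufferSize must be > 0.0 MB when enabled");
        }
        if (mb == DISABLE_AUTO_FLUSH && maxBufferedDocs == DISABLE_AUTO_FLUSH) {
            throw new IllegalArgumentException("ramBufferSize and maxBufferedDocs cannot both be disabled");
        }
        this.ramBufferSizeMB = mb;
        return this; // fluent, like the documented setters
    }

    FlushSettingsSketch setMaxBufferedDocs(int docs) {
        if (docs != DISABLE_AUTO_FLUSH && docs < 2) {
            throw new IllegalArgumentException("maxBufferedDocs must be at least 2 when enabled");
        }
        if (docs == DISABLE_AUTO_FLUSH && ramBufferSizeMB == DISABLE_AUTO_FLUSH) {
            throw new IllegalArgumentException("ramBufferSize and maxBufferedDocs cannot both be disabled");
        }
        this.maxBufferedDocs = docs;
        return this;
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }

    int getMaxBufferedDocs() { return maxBufferedDocs; }
}
```

      In this model, switching from RAM-based to count-based flushing requires enabling the document count first, for example setMaxBufferedDocs(1000) followed by setRAMBufferSizeMB(DISABLE_AUTO_FLUSH); the reverse order is rejected because it would leave both triggers disabled.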
    • getMaxBufferedDocs

      public int getMaxBufferedDocs()
      Returns the number of buffered added documents that will trigger a flush if enabled.
    • setMergePolicy

      public LiveIndexWriterConfig setMergePolicy(MergePolicy mergePolicy)
      Expert: MergePolicy is invoked whenever there are changes to the segments in the index. Its role is to select which merges to do, if any, and return a MergePolicy.MergeSpecification describing the merges. It also selects merges to do for forceMerge.

      Takes effect on subsequent merge selections. Any merges in flight or any merges already registered by the previous MergePolicy are not affected.

    • setMergedSegmentWarmer

      public LiveIndexWriterConfig setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer mergeSegmentWarmer)
      Set the merged segment warmer. See IndexWriter.IndexReaderWarmer.

      Takes effect on the next merge.

    • getMergedSegmentWarmer

      public IndexWriter.IndexReaderWarmer getMergedSegmentWarmer()
      Returns the current merged segment warmer. See IndexWriter.IndexReaderWarmer.
    • getOpenMode

      public IndexWriterConfig.OpenMode getOpenMode()
    • getIndexCreatedVersionMajor

      public int getIndexCreatedVersionMajor()
      Return the compatibility version to use for this index.
    • getIndexDeletionPolicy

      public IndexDeletionPolicy getIndexDeletionPolicy()
    • getIndexCommit

      public IndexCommit getIndexCommit()
      Returns the IndexCommit as specified in IndexWriterConfig.setIndexCommit(IndexCommit), or the default, null, which specifies opening the latest index commit point.
    • getSimilarity

      public Similarity getSimilarity()
      Expert: returns the Similarity implementation used by this IndexWriter.
    • getMergeScheduler

      public MergeScheduler getMergeScheduler()
    • getCodec

      public Codec getCodec()
      Returns the current Codec.
    • getMergePolicy

      public MergePolicy getMergePolicy()
      Returns the current MergePolicy in use by this writer.
    • getReaderPooling

      public boolean getReaderPooling()
      Returns true if IndexWriter should pool readers even if DirectoryReader.open(IndexWriter) has not been called.
    • getRAMPerThreadHardLimitMB

      public int getRAMPerThreadHardLimitMB()
      Returns the max amount of memory each DocumentsWriterPerThread can consume until forcefully flushed.
    • getInfoStream

      public InfoStream getInfoStream()
      Returns InfoStream used for debugging.
    • setUseCompoundFile

      public LiveIndexWriterConfig setUseCompoundFile(boolean useCompoundFile)
      Sets if the IndexWriter should pack newly written segments in a compound file. Default is true.

      Use false for batch indexing with very large RAM buffer settings.

      Note: To control compound file usage during segment merges see MergePolicy.setNoCFSRatio(double) and MergePolicy.setMaxCFSSegmentSizeMB(double). This setting only applies to newly created segments.

    • getUseCompoundFile

      public boolean getUseCompoundFile()
      Returns true iff the IndexWriter packs newly written segments in a compound file. Default is true.
    • getCommitOnClose

      public boolean getCommitOnClose()
      Returns true if IndexWriter.close() should first commit before closing.
    • getIndexSort

      public Sort getIndexSort()
      Get the index-time Sort order, applied to all (flushed and merged) segments.
    • getIndexSortFields

      public Set<String> getIndexSortFields()
      Returns the field names involved in the index sort
    • getLeafSorter

      public Comparator<LeafReader> getLeafSorter()
      Returns a comparator for sorting leaf readers. If not null, this comparator is used to sort leaf readers within a DirectoryReader opened from the IndexWriter of this configuration.
      Returns:
      a comparator for sorting leaf readers
    • isCheckPendingFlushOnUpdate

      public boolean isCheckPendingFlushOnUpdate()
      Expert: Returns true if indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • setCheckPendingFlushUpdate

      public LiveIndexWriterConfig setCheckPendingFlushUpdate(boolean checkPendingFlushOnUpdate)
      Expert: sets whether indexing threads check for pending flushes on update in order to help out flushing indexing buffers to disk. When this is set to false, threads calling DirectoryReader.openIfChanged(DirectoryReader, IndexWriter) or IndexWriter.flush() will be the only threads writing segments to disk unless flushes are falling behind. If indexing is stalled due to too many pending flushes, indexing threads will help out writing pending segment flushes to disk.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • getSoftDeletesField

      public String getSoftDeletesField()
      Returns the soft deletes field or null if soft-deletes are disabled. See IndexWriterConfig.setSoftDeletesField(String) for details.
    • getMaxFullFlushMergeWaitMillis

      public long getMaxFullFlushMergeWaitMillis()
      Expert: returns the amount of time to wait for merges returned by MergePolicy.findFullFlushMerges(...). If this time is reached, the commit proceeds based on the segments merged up to that point. The merges are not cancelled, and may still run to completion independently of the commit.
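      The bounded wait described above (wait up to a limit, then proceed without cancelling the merge) can be illustrated with standard java.util.concurrent classes. This is a sketch of the general pattern under stated assumptions, not IndexWriter's actual code: FullFlushMergeWaitSketch and awaitMerge are hypothetical names, and a CompletableFuture stands in for a running merge.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative pattern only; NOT IndexWriter's implementation. Names are hypothetical. */
final class FullFlushMergeWaitSketch {

    /**
     * Waits up to maxWaitMillis for the merge to finish.
     * Returns true if the merge completed in time (its result can be part of the commit),
     * false if the wait timed out. On timeout the merge is left running, not cancelled.
     */
    static boolean awaitMerge(CompletableFuture<?> merge, long maxWaitMillis) {
        try {
            merge.get(maxWaitMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // proceed with whatever has been merged so far
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("merge failed", e.getCause());
        }
    }
}
```

      A caller in this model would commit immediately when awaitMerge returns false, while the future keeps running in the background and completes on its own schedule.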
    • getIndexWriterEventListener

      public IndexWriterEventListener getIndexWriterEventListener()
      Returns the IndexWriterEventListener callback that tracks the key IndexWriter operations.
    • toString

      public String toString()
      Overrides:
      toString in class Object