java.lang.Object

org.apache.lucene.codecs.StoredFieldsFormat

org.apache.lucene.backward_codecs.lucene50.compressing.Lucene50CompressingStoredFieldsFormat

public class Lucene50CompressingStoredFieldsFormat extends StoredFieldsFormat

A StoredFieldsFormat that compresses documents in chunks in order to improve the compression ratio.

For a chunk size of chunkSize bytes, this StoredFieldsFormat does not support documents larger than (2³¹ - chunkSize) bytes.
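
As a rough illustration of the limit above, the following sketch computes the largest supported document size for a given chunk size. The class and method names (`ChunkLimits`, `maxDocumentBytes`) are hypothetical helpers, not part of Lucene, and the rejection of chunk sizes below 1 is an assumption made for the example.

```java
// Hypothetical helper, not part of Lucene: evaluates the documented
// (2^31 - chunkSize) bound on document size.
final class ChunkLimits {

    /** Returns the largest document size, in bytes, allowed for the given chunk size. */
    static long maxDocumentBytes(int chunkSize) {
        // Assumption for this sketch: a chunk size below 1 is treated as invalid.
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1, got " + chunkSize);
        }
        // Use long arithmetic: 2^31 does not fit in an int.
        return (1L << 31) - chunkSize;
    }

    public static void main(String[] args) {
        int chunkSize = 1 << 14; // 16 KiB, an arbitrary example value
        System.out.println("Max document bytes: " + maxDocumentBytes(chunkSize));
    }
}
```

Larger chunk sizes therefore lower the ceiling on individual document size, although for typical chunk sizes the difference is negligible.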

For optimal performance, you should use a MergePolicy that returns segments that have the biggest byte size first.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| protected final int | blockShift | block shift |
| protected final int | chunkSize | chunk size |
| protected final CompressionMode | compressionMode | compression mode |
| protected final String | formatName | format name |
| protected final int | maxDocsPerChunk | max docs per chunk |
| protected final String | segmentSuffix | segment suffix |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| Lucene50CompressingStoredFieldsFormat(String formatName, String segmentSuffix, CompressionMode compressionMode, int chunkSize, int maxDocsPerChunk, int blockShift) | Create a new Lucene50CompressingStoredFieldsFormat. |
| Lucene50CompressingStoredFieldsFormat(String formatName, CompressionMode compressionMode, int chunkSize, int maxDocsPerChunk, int blockShift) | Create a new Lucene50CompressingStoredFieldsFormat with an empty segment suffix. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| StoredFieldsReader | fieldsReader(Directory directory, SegmentInfo si, FieldInfos fn, IOContext context) | |
| StoredFieldsWriter | fieldsWriter(Directory directory, SegmentInfo si, IOContext context) | |
| String | toString() | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Details
- formatName
  
  protected final String formatName
  
  format name
- segmentSuffix
  
  protected final String segmentSuffix
  
  segment suffix
- compressionMode
  
  protected final CompressionMode compressionMode
  
  compression mode
- chunkSize
  
  protected final int chunkSize
  
  chunk size
- maxDocsPerChunk
  
  protected final int maxDocsPerChunk
  
  max docs per chunk
- blockShift
  
  protected final int blockShift
  
  block shift
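
The constructor documentation describes blockShift as the base-2 logarithm of the number of chunks stored in an index block. A minimal sketch of that relationship follows; `BlockShiftMath` and its methods are hypothetical names introduced for illustration, and the accepted range of 0 to 30 is an assumption chosen so that the result fits in an int.

```java
// Hypothetical helper, not part of Lucene: converts between blockShift
// and the number of chunks per index block.
final class BlockShiftMath {

    /** Number of chunks per index block for a given blockShift (2^blockShift). */
    static int chunksPerBlock(int blockShift) {
        // Assumption for this sketch: keep the shift within the positive int range.
        if (blockShift < 0 || blockShift > 30) {
            throw new IllegalArgumentException("blockShift out of range: " + blockShift);
        }
        return 1 << blockShift;
    }

    /** Inverse mapping: the blockShift for a power-of-two chunk count. */
    static int blockShiftFor(int chunksPerBlock) {
        if (chunksPerBlock <= 0 || Integer.bitCount(chunksPerBlock) != 1) {
            throw new IllegalArgumentException("chunksPerBlock must be a power of two");
        }
        return Integer.numberOfTrailingZeros(chunksPerBlock);
    }

    public static void main(String[] args) {
        System.out.println(chunksPerBlock(10));  // chunks per block for blockShift = 10
        System.out.println(blockShiftFor(1024)); // shift that yields 1024 chunks per block
    }
}
```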
Constructor Details
- Lucene50CompressingStoredFieldsFormat
  
  public Lucene50CompressingStoredFieldsFormat(String formatName, CompressionMode compressionMode, int chunkSize, int maxDocsPerChunk, int blockShift)
  
  Create a new Lucene50CompressingStoredFieldsFormat with an empty segment suffix.
  See Also:
  
  Lucene50CompressingStoredFieldsFormat(String, String, CompressionMode, int, int, int)
- Lucene50CompressingStoredFieldsFormat
  
  public Lucene50CompressingStoredFieldsFormat(String formatName, String segmentSuffix, CompressionMode compressionMode, int chunkSize, int maxDocsPerChunk, int blockShift)
  
  Create a new Lucene50CompressingStoredFieldsFormat.
  formatName is the name of the format. This name is used in the file formats to perform codec header checks.
  segmentSuffix is the segment suffix. This suffix is added to the resulting file name only if it is not the empty string.
  The compressionMode parameter lets you choose between compression algorithms with different compression and decompression speeds, so that you can pick the one that best fits your indexing and searching throughput. Never instantiate two Lucene50CompressingStoredFieldsFormats that have the same name but different CompressionModes.
  chunkSize is the minimum byte size of a chunk of documents. A value of 1 can make sense if there is redundancy across fields. Higher values of chunkSize should improve the compression ratio but will require more memory at indexing time and might make document loading a little slower (depending on the size of your OS cache compared to the size of your index).
  maxDocsPerChunk is an upper bound on how many documents may be stored in a single chunk. It bounds the CPU cost for highly compressible data.
  Parameters:
  
  formatName - the name of the StoredFieldsFormat
  
  segmentSuffix - the segment suffix, added to the resulting file name only if it is not empty
  
  compressionMode - the CompressionMode to use
  
  chunkSize - the minimum number of bytes of a single chunk of stored documents
  
  maxDocsPerChunk - the maximum number of documents in a single chunk
  
  blockShift - the base-2 logarithm of the number of chunks to store in an index block
  
  See Also:
  
  CompressionMode
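
To make the interplay of chunkSize and maxDocsPerChunk more concrete, the sketch below simulates how documents might be grouped into chunks. It assumes a chunk is closed as soon as its buffered bytes reach chunkSize or its document count reaches maxDocsPerChunk, which is one reading of the parameter descriptions above rather than a statement about the actual writer. `ChunkingSketch` and `docsPerChunk` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation, not Lucene code: groups documents into chunks
// using a minimum byte size and a maximum document count per chunk.
final class ChunkingSketch {

    /** Returns the number of documents that end up in each chunk, in order. */
    static List<Integer> docsPerChunk(int[] docSizes, int chunkSize, int maxDocsPerChunk) {
        List<Integer> chunks = new ArrayList<>();
        long bufferedBytes = 0;
        int bufferedDocs = 0;
        for (int size : docSizes) {
            bufferedBytes += size;
            bufferedDocs++;
            // Assumed rule: close the chunk once either threshold is reached.
            if (bufferedBytes >= chunkSize || bufferedDocs >= maxDocsPerChunk) {
                chunks.add(bufferedDocs);
                bufferedBytes = 0;
                bufferedDocs = 0;
            }
        }
        // Any remaining documents form a final, possibly undersized, chunk.
        if (bufferedDocs > 0) {
            chunks.add(bufferedDocs);
        }
        return chunks;
    }

    public static void main(String[] args) {
        int[] docSizes = {100, 100, 100, 100, 100};
        System.out.println(docsPerChunk(docSizes, 250, 10));       // byte threshold decides
        System.out.println(docsPerChunk(docSizes, 1_000_000, 2));  // document cap decides
    }
}
```

Under this assumed rule, a chunkSize of 1 puts every document in its own chunk, while a very large chunkSize leaves maxDocsPerChunk as the only limit on chunk length.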
Method Details
- fieldsReader
  
  public StoredFieldsReader fieldsReader(Directory directory, SegmentInfo si, FieldInfos fn, IOContext context) throws IOException
  
  Specified by:
  
  fieldsReader in class StoredFieldsFormat
  
  Throws:
  
  IOException
- fieldsWriter
  
  public StoredFieldsWriter fieldsWriter(Directory directory, SegmentInfo si, IOContext context) throws IOException
  
  Specified by:
  
  fieldsWriter in class StoredFieldsFormat
  
  Throws:
  
  IOException
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
