Class Lucene50CompressingTermVectorsFormat

  • Direct Known Subclasses:
    Lucene50TermVectorsFormat

    public class Lucene50CompressingTermVectorsFormat
    extends TermVectorsFormat
    A TermVectorsFormat that compresses chunks of documents together in order to improve the compression ratio.
    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • formatName

        protected final String formatName
        format name
      • segmentSuffix

        protected final String segmentSuffix
        segment suffix
      • compressionMode

        protected final CompressionMode compressionMode
        compression mode
      • chunkSize

        protected final int chunkSize
        chunk size
      • blockSize

        protected final int blockSize
        block size
      • maxDocsPerChunk

        protected final int maxDocsPerChunk
        max docs per chunk
    • Constructor Detail

      • Lucene50CompressingTermVectorsFormat

        public Lucene50CompressingTermVectorsFormat(String formatName,
                                                    String segmentSuffix,
                                                    CompressionMode compressionMode,
                                                    int chunkSize,
                                                    int maxDocsPerChunk,
                                                    int blockSize)
        Create a new Lucene50CompressingTermVectorsFormat.

        formatName is the name of the format. This name will be used in the file formats to perform codec header checks.

        The compressionMode parameter allows you to choose between compression algorithms that have various compression and decompression speeds so that you can pick the one that best fits your indexing and searching throughput. You should never instantiate two Lucene50CompressingTermVectorsFormats that have the same name but different CompressionModes.

        chunkSize is the minimum byte size of a chunk of documents. Higher values of chunkSize should improve the compression ratio but will require more memory at indexing time and might make document loading a little slower (depending on the size of your OS cache compared to the size of your index).
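
        The interaction between chunkSize and maxDocsPerChunk can be illustrated with a small standalone sketch. It is a simplified model written for this page, not Lucene's writer code: it assumes a chunk is closed as soon as the buffered bytes reach chunkSize or the buffered document count reaches maxDocsPerChunk. The class ChunkingSketch, the method docsPerChunk, and the per-document byte sizes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of chunk formation. This is an illustration only,
// NOT the code Lucene uses to write term vectors.
final class ChunkingSketch {

    /**
     * Groups documents into chunks. A chunk is closed as soon as its buffered
     * size reaches chunkSize bytes or it holds maxDocsPerChunk documents
     * (assumed flush rule).
     *
     * @param docSizes hypothetical per-document term vector sizes, in bytes
     * @return the number of documents in each chunk, in order
     */
    static List<Integer> docsPerChunk(int[] docSizes, int chunkSize, int maxDocsPerChunk) {
        List<Integer> chunks = new ArrayList<>();
        int bufferedBytes = 0;
        int bufferedDocs = 0;
        for (int size : docSizes) {
            bufferedBytes += size;
            bufferedDocs++;
            if (bufferedBytes >= chunkSize || bufferedDocs >= maxDocsPerChunk) {
                chunks.add(bufferedDocs);
                bufferedBytes = 0;
                bufferedDocs = 0;
            }
        }
        if (bufferedDocs > 0) {
            chunks.add(bufferedDocs); // flush the trailing partial chunk
        }
        return chunks;
    }

    public static void main(String[] args) {
        int[] docSizes = {3000, 2000, 300, 300, 300, 300};
        // First chunk closes on the byte threshold, second on the document cap,
        // third is the trailing partial chunk.
        System.out.println(docsPerChunk(docSizes, 4096, 3)); // prints [2, 3, 1]
    }
}
```

        Under this model, a larger chunkSize packs more documents into each chunk when documents are small, while maxDocsPerChunk bounds how many documents a single chunk can hold.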

        Parameters:
        formatName - the name of the TermVectorsFormat
        segmentSuffix - a suffix to append to files created by this format
        compressionMode - the CompressionMode to use
        chunkSize - the minimum number of bytes in a single chunk of documents
        maxDocsPerChunk - the maximum number of documents in a single chunk
        blockSize - the number of chunks to store in an index block
        See Also:
        CompressionMode