org.apache.lucene.codecs.compressing
Class CompressingTermVectorsFormat

java.lang.Object
  extended by org.apache.lucene.codecs.TermVectorsFormat
      extended by org.apache.lucene.codecs.compressing.CompressingTermVectorsFormat
Direct Known Subclasses:
Lucene42TermVectorsFormat

public class CompressingTermVectorsFormat
extends TermVectorsFormat

A TermVectorsFormat that compresses chunks of documents together in order to improve the compression ratio.
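
The effect of compressing documents together can be seen with a minimal sketch. It is not Lucene code: it uses the JDK's java.util.zip.Deflater as a stand-in for a CompressionMode, and the class name, method names, and sample documents are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Illustration only: java.util.zip.Deflater stands in for a CompressionMode.
final class ChunkedCompressionSketch {

    // Returns the number of bytes DEFLATE produces for the given input.
    static int compressedLength(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[256];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    // Sum of the compressed sizes when every document is compressed on its own.
    static int separately(String[] docs) {
        int total = 0;
        for (String doc : docs) {
            total += compressedLength(doc.getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    // Compressed size when all documents are concatenated into one chunk first.
    static int together(String[] docs) {
        StringBuilder chunk = new StringBuilder();
        for (String doc : docs) {
            chunk.append(doc);
        }
        return compressedLength(chunk.toString().getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical sample data: short documents that share most of their vocabulary.
    static String[] sampleDocs() {
        String[] docs = new String[50];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = "term vectors store terms, positions and offsets for document " + i;
        }
        return docs;
    }

    public static void main(String[] args) {
        String[] docs = sampleDocs();
        System.out.println("separately: " + separately(docs) + " bytes");
        System.out.println("together:   " + together(docs) + " bytes");
    }
}
```

Short documents that repeat the same vocabulary give the compressor little to work with individually; concatenated into one chunk, the repetition across documents becomes visible to it.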

WARNING: This API is experimental and might change in incompatible ways in the next release.

Constructor Summary
CompressingTermVectorsFormat(String formatName, String segmentSuffix, CompressionMode compressionMode, int chunkSize)
          Create a new CompressingTermVectorsFormat.
 
Method Summary
 String toString()
           
 TermVectorsReader vectorsReader(Directory directory, SegmentInfo segmentInfo, FieldInfos fieldInfos, IOContext context)
          Returns a TermVectorsReader to read term vectors.
 TermVectorsWriter vectorsWriter(Directory directory, SegmentInfo segmentInfo, IOContext context)
          Returns a TermVectorsWriter to write term vectors.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CompressingTermVectorsFormat

public CompressingTermVectorsFormat(String formatName,
                                    String segmentSuffix,
                                    CompressionMode compressionMode,
                                    int chunkSize)
Create a new CompressingTermVectorsFormat.

formatName is the name of the format. This name will be used in the file formats to perform codec header checks.

The compressionMode parameter selects the compression algorithm. The available modes differ in compression and decompression speed, so pick the one that best fits your indexing and searching throughput. Never instantiate two CompressingTermVectorsFormats that have the same name but different CompressionModes.
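
A simplified sketch of a name-based header check may help explain the rule about names and modes. It is not Lucene's actual on-disk layout: the class, its methods, and the header shape (just the name in front of the payload) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustration only: a hypothetical header, not Lucene's real file format.
final class FormatHeaderSketch {

    // Writes the format name in front of the (already compressed) payload.
    static byte[] write(String formatName, byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(formatName);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true when the stored name equals the name the reader expects.
    static boolean headerMatches(String expectedName, byte[] file) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            return expectedName.equals(in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the check sees only the name. A reader configured with the same name but a different compression mode would pass the check and then try to decode the payload with the wrong algorithm, which is presumably the situation the rule above is meant to prevent.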

chunkSize is the minimum byte size of a chunk of documents. Higher values of chunkSize should improve the compression ratio but will require more memory at indexing time and might make document loading a little slower (depending on the size of your OS cache compared to the size of your index).
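
Because chunkSize is a minimum, a chunk is closed only once the buffered documents reach that many bytes, so chunks can end up larger than chunkSize. The sketch below models that threshold with a hypothetical buffer class; it is not Lucene's writer, which may apply additional flush conditions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hypothetical buffer modelling a minimum chunk size.
final class ChunkBufferSketch {
    private final int chunkSize;
    private int bufferedBytes = 0;
    private int bufferedDocs = 0;
    private final List<Integer> flushedDocCounts = new ArrayList<>();

    ChunkBufferSketch(int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        this.chunkSize = chunkSize;
    }

    // Buffers one document; closes the chunk once at least chunkSize bytes are pending.
    void addDocument(int docBytes) {
        bufferedBytes += docBytes;
        bufferedDocs++;
        if (bufferedBytes >= chunkSize) {
            flush();
        }
    }

    // Flushes whatever is still buffered, even if it is smaller than chunkSize.
    void finish() {
        if (bufferedDocs > 0) {
            flush();
        }
    }

    // Records how many documents went into the chunk and resets the buffer.
    private void flush() {
        flushedDocCounts.add(bufferedDocs);
        bufferedBytes = 0;
        bufferedDocs = 0;
    }

    // Feeds docCount documents of docBytes each and returns the documents per chunk.
    static List<Integer> simulate(int chunkSize, int docCount, int docBytes) {
        ChunkBufferSketch buffer = new ChunkBufferSketch(chunkSize);
        for (int i = 0; i < docCount; i++) {
            buffer.addDocument(docBytes);
        }
        buffer.finish();
        return buffer.flushedDocCounts;
    }
}
```

With ten documents of 1000 bytes each, simulate(1024, 10, 1000) yields five chunks of two documents, while simulate(4096, 10, 1000) yields two chunks of five. Larger chunks hold more pending bytes in memory and mean more data to decompress when a single document is read.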

Parameters:
formatName - the name of the TermVectorsFormat
segmentSuffix - a suffix to append to files created by this format
compressionMode - the CompressionMode to use
chunkSize - the minimum number of bytes of a single chunk of documents
See Also:
CompressionMode
Method Detail

vectorsReader

public final TermVectorsReader vectorsReader(Directory directory,
                                             SegmentInfo segmentInfo,
                                             FieldInfos fieldInfos,
                                             IOContext context)
                                      throws IOException
Description copied from class: TermVectorsFormat
Returns a TermVectorsReader to read term vectors.

Specified by:
vectorsReader in class TermVectorsFormat
Throws:
IOException

vectorsWriter

public final TermVectorsWriter vectorsWriter(Directory directory,
                                             SegmentInfo segmentInfo,
                                             IOContext context)
                                      throws IOException
Description copied from class: TermVectorsFormat
Returns a TermVectorsWriter to write term vectors.

Specified by:
vectorsWriter in class TermVectorsFormat
Throws:
IOException

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.