Class ChunksIntEncoder

  extended by org.apache.lucene.util.encoding.IntEncoder
      extended by org.apache.lucene.util.encoding.ChunksIntEncoder
Direct Known Subclasses:
EightFlagsIntEncoder, FourFlagsIntEncoder

public abstract class ChunksIntEncoder
extends IntEncoder

An IntEncoder that encodes values in chunks. Implementations of this class assume that the data to encode consists of small, consecutive values, which lets the encoder compress them better. For details, see the two implementations, FourFlagsIntEncoder and EightFlagsIntEncoder.

Extensions of this class need to implement IntEncoder.encode(int) in order to build the proper indicator (flags). Once enough values have been accumulated (typically the batch size), extensions can call encodeChunk() to flush the indicator and the rest of the values.
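
The division of labor described above can be illustrated with a stand-alone sketch. It does not use the Lucene classes and is not the FourFlagsIntEncoder implementation: the class name ChunkSketchEncoder, the 2-bit flag layout, the chunk size of 4, and the simple variable-length format are all assumptions made for illustration. Values 1 to 3 are stored directly in the indicator; larger values leave a 0 flag and are queued to be written after the indicator.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-alone sketch of a chunked flags encoder; not Lucene code. */
final class ChunkSketchEncoder {
    private static final int CHUNK_SIZE = 4;               // assumed: four 2-bit flags per indicator byte
    private final OutputStream out;
    private final int[] encodeQueue = new int[CHUNK_SIZE]; // values written outside the indicator
    private int encodeQueueSize;
    private int indicator;                                 // flag bits accumulated for the current chunk
    private int ordinal;                                   // position of the next value within the chunk

    ChunkSketchEncoder(OutputStream out) {
        this.out = out;
    }

    /** Expects value > 0 and, like the flags encoders, does not check it. */
    void encode(int value) throws IOException {
        if (value <= 3) {
            indicator |= value << (ordinal * 2);    // small value is stored in its 2-bit flag
        } else {
            encodeQueue[encodeQueueSize++] = value; // flag stays 0: value follows the indicator
        }
        if (++ordinal == CHUNK_SIZE) {
            encodeChunk();                          // batch is full, flush it
        }
    }

    /** Writes the indicator first, then the queued values, then resets the chunk state. */
    void encodeChunk() throws IOException {
        out.write(indicator);
        for (int i = 0; i < encodeQueueSize; i++) {
            writeVInt(encodeQueue[i]);
        }
        encodeQueueSize = 0;
        indicator = 0;
        ordinal = 0;
    }

    /** Flushes a partially filled chunk, if any, and closes the stream. */
    void close() throws IOException {
        if (ordinal != 0) {
            encodeChunk();
        }
        out.close();
    }

    /** Assumed format: 7 data bits per byte, low bits first, high bit set on all but the last byte. */
    private void writeVInt(int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Convenience helper for trying the sketch on an in-memory buffer. */
    static byte[] encodeAll(int... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ChunkSketchEncoder encoder = new ChunkSketchEncoder(bytes);
        try {
            for (int value : values) {
                encoder.encode(value);
            }
            encoder.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Under these assumptions, encodeAll(1, 2, 3, 1) fits entirely in one indicator byte, while encodeAll(1, 300, 2, 3) needs the indicator plus two bytes for 300.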

NOTE: flags encoders do not accept zero or negative values (≤ 0) in their IntEncoder.encode(int). For performance reasons they do not check this condition; if such a value is passed, the resulting stream may be corrupt or an exception may be thrown. These encoders also perform best when there are many consecutive small values (how small depends on the encoder implementation). When that is not the case, the encoder occupies 1 more byte per batch of integers than VInt8IntEncoder would have occupied. Therefore, make sure your data fits the conditions of the specific encoder.
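
To put a number on the overhead mentioned in the note, the following sketch computes the extra indicator bytes in the unfavorable case where every value ends up outside the indicator. The class and method names are hypothetical, the chunk sizes 4 and 8 are illustrative assumptions, and the sketch assumes that a partially filled trailing chunk still costs one indicator byte.

```java
/** Hypothetical helper: extra bytes spent on indicators when no value fits in the flags. */
final class IndicatorOverhead {
    /** One indicator byte per chunk; a partial trailing chunk is assumed to cost a full byte. */
    static int worstCaseExtraBytes(int valueCount, int chunkSize) {
        return (valueCount + chunkSize - 1) / chunkSize; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(worstCaseExtraBytes(1000, 4)); // prints 250
        System.out.println(worstCaseExtraBytes(1000, 8)); // prints 125
    }
}
```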

For the reasons mentioned above, these encoders are usually chained with UniqueValuesIntEncoder and DGapIntEncoder in the following manner:

 IntEncoder fourFlags = 
         new SortingEncoderFilter(new UniqueValuesIntEncoder(new DGapIntEncoder(new FlagsIntEncoderImpl())));
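
The effect this chain is meant to have on the data can be sketched in plain Java, without the Lucene classes. The class and method names below are hypothetical, and the sketch only mimics the transformation (sort, drop duplicates, replace each value with its gap from the previous one); it is not how the encoders are implemented. Removing duplicates matters because a repeated value would produce a gap of 0, which the flags encoders do not accept.

```java
import java.util.Arrays;

/** Hypothetical illustration of sort + unique + d-gap preprocessing; not Lucene code. */
final class DGapSketch {
    /** Sorts, removes duplicates, then replaces each value with its distance from the previous one. */
    static int[] sortUniqueDGap(int[] values) {
        int[] sorted = Arrays.stream(values).sorted().distinct().toArray();
        int[] gaps = new int[sorted.length];
        int previous = 0; // the first value is kept as its gap from 0
        for (int i = 0; i < sorted.length; i++) {
            gaps[i] = sorted[i] - previous;
            previous = sorted[i];
        }
        return gaps;
    }

    public static void main(String[] args) {
        // prints [5, 1, 3, 8]
        System.out.println(Arrays.toString(sortUniqueDGap(new int[] {17, 5, 9, 5, 6})));
    }
}
```

As long as the input values are positive, every gap is strictly positive and typically much smaller than the original values, which is the kind of input the flags encoders expect.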

WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary
protected  int[] encodeQueue
          Holds the values which must be encoded, outside the indicator.
protected  int encodeQueueSize
protected  IntEncoder encoder
          Encoder used to encode values outside the indicator.
protected  int indicator
          Holds the flag bits of the indicator byte.
protected  byte ordinal
          Counts the current ordinal of the encoded value.
Fields inherited from class org.apache.lucene.util.encoding.IntEncoder
Constructor Summary
protected ChunksIntEncoder(int chunkSize)
Method Summary
 void close()
          Instructs the encoder to finish the encoding process.
protected  void encodeChunk()
          Encodes the values of the current chunk.
 void reInit(OutputStream out)
          Reinitializes the encoder with the given OutputStream.
Methods inherited from class org.apache.lucene.util.encoding.IntEncoder
createMatchingDecoder, encode
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail


protected final int[] encodeQueue
Holds the values which must be encoded, outside the indicator.


protected int encodeQueueSize


protected final IntEncoder encoder
Encoder used to encode values outside the indicator.


protected int indicator
Holds the flag bits of the indicator byte.


protected byte ordinal
Counts the current ordinal of the encoded value.

Constructor Detail


protected ChunksIntEncoder(int chunkSize)
Method Detail


protected void encodeChunk()
                    throws IOException
Encodes the values of the current chunk. First it writes the indicator, and then it encodes the values outside the indicator.
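
Writing the indicator first means that a reader knows, before it reads anything else, which values of the chunk live in the flags and which follow as separate bytes. The sketch below reads back a stream under an assumed layout that is not taken from the Lucene implementations: one indicator byte holding four 2-bit flags (lowest bits first), a nonzero flag being the value itself, and a 0 flag meaning the value follows as a variable-length integer with 7 data bits per byte, low bits first. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for an assumed "indicator, then outside values" chunk layout; not Lucene code. */
final class ChunkLayoutSketch {
    static List<Integer> decode(byte[] data) {
        List<Integer> values = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            int indicator = data[pos++] & 0xFF;        // the indicator comes first in every chunk
            for (int ordinal = 0; ordinal < 4; ordinal++) {
                int flag = (indicator >>> (ordinal * 2)) & 0x3;
                if (flag != 0) {
                    values.add(flag);                  // value was stored inside the indicator
                    continue;
                }
                if (pos >= data.length) {
                    return values;                     // unused flags of a partial last chunk
                }
                int value = 0;
                int shift = 0;
                int b;
                do {                                   // value was stored after the indicator
                    b = data[pos++] & 0xFF;
                    value |= (b & 0x7F) << shift;
                    shift += 7;
                } while ((b & 0x80) != 0);
                values.add(value);
            }
        }
        return values;
    }
}
```

With this layout, the three bytes 0xE1, 0xAC, 0x02 decode to the values 1, 300, 2, 3: the indicator 0xE1 carries 1, 2 and 3 in its flags, and the 0 flag in the second position points at the two bytes that encode 300.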



public void close()
           throws IOException
Description copied from class: IntEncoder
Instructs the encoder to finish the encoding process. This method closes the output stream which was specified by reInit. An implementation may perform additional cleanup here to complete the encoding, such as flushing internal buffers.
Once this method has been called, no further calls to encode should be made without first calling reInit.

NOTE: overriding classes should make sure they either call super.close() or close the output stream themselves.

close in class IntEncoder


public void reInit(OutputStream out)
Description copied from class: IntEncoder
Reinitializes the encoder with the given OutputStream. For re-usability it can be changed without the need to reconstruct a new object.

NOTE: after calling IntEncoder.close(), one must call this method even if the output stream itself hasn't changed. An example case is that the output stream wraps a byte[], and the output stream itself is reset, but its instance hasn't changed. Some implementations of IntEncoder may write some metadata about themselves to the output stream, and therefore it is imperative that one calls this method before encoding any data.

reInit in class IntEncoder
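
The reuse pattern from the note above (the stream is reset, its instance is unchanged, and reInit is still called) can be sketched as follows. OneByteEncoder is a hypothetical stand-in that only mirrors the reInit/encode/close contract; it is not an IntEncoder, and the IllegalStateException it throws is an assumption of the sketch, not documented behavior of the real classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration of the reInit / encode / close life cycle; not Lucene code. */
final class ReInitSketch {
    /** Stand-in encoder that writes each value as a single byte. */
    static final class OneByteEncoder {
        private OutputStream out; // null until reInit is called, and again after close

        void reInit(OutputStream out) {
            this.out = out;
        }

        void encode(int value) throws IOException {
            if (out == null) {
                throw new IllegalStateException("call reInit before encode");
            }
            out.write(value);
        }

        void close() throws IOException {
            if (out != null) {
                out.close();
                out = null;
            }
        }
    }

    /** Encodes two batches into the same buffer instance and returns the size of each batch. */
    static int[] encodeTwoBatches() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        OneByteEncoder encoder = new OneByteEncoder();
        try {
            encoder.reInit(bytes);
            encoder.encode(1);
            encoder.encode(2);
            encoder.close();
            int firstBatchSize = bytes.size();

            bytes.reset();         // same stream instance, contents discarded
            encoder.reInit(bytes); // still required after close()
            encoder.encode(3);
            encoder.close();
            return new int[] {firstBatchSize, bytes.size()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```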

Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.