java.lang.Object

org.apache.lucene.codecs.CodecUtil

public final class CodecUtil extends Object

Utility class for reading and writing versioned headers.

Writing codec headers is useful to ensure that a file is in the format you think it is.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary

Fields

Modifier and Type

Field

Description

static final int

CODEC_MAGIC

Constant to identify the start of a codec header.

static final int

FOOTER_MAGIC

Constant to identify the start of a codec footer.
Method Summary

Modifier and Type

Method

Description

static long

checkFooter(ChecksumIndexInput in)

Validates the codec footer previously written by writeFooter(org.apache.lucene.store.IndexOutput).

static void

checkFooter(ChecksumIndexInput in, Throwable priorException)

Validates the codec footer previously written by writeFooter(org.apache.lucene.store.IndexOutput), optionally passing an unexpected exception that has already occurred.

static int

checkHeader(DataInput in, String codec, int minVersion, int maxVersion)

Reads and validates a header previously written with writeHeader(DataOutput, String, int).

static int

checkHeaderNoMagic(DataInput in, String codec, int minVersion, int maxVersion)

Like checkHeader(DataInput,String,int,int) except this version assumes the first int has already been read and validated from the input.

static int

checkIndexHeader(DataInput in, String codec, int minVersion, int maxVersion, byte[] expectedID, String expectedSuffix)

Reads and validates a header previously written with writeIndexHeader(DataOutput, String, int, byte[], String).

static byte[]

checkIndexHeaderID(DataInput in, byte[] expectedID)

Expert: just reads and verifies the object ID of an index header.

static String

checkIndexHeaderSuffix(DataInput in, String expectedSuffix)

Expert: just reads and verifies the suffix of an index header.

static long

checksumEntireFile(IndexInput input)

Clones the provided input, reads all bytes from the file, and calls checkFooter(org.apache.lucene.store.ChecksumIndexInput).

static int

footerLength()

Computes the length of a codec footer.

static int

headerLength(String codec)

Computes the length of a codec header.

static int

indexHeaderLength(String codec, String suffix)

Computes the length of an index header.

static int

readBEInt(DataInput in)

Reads an int value from a header or footer in big-endian order.

static long

readBELong(DataInput in)

Reads a long value from a header or footer in big-endian order.

static byte[]

readFooter(IndexInput in)

Retrieves the full footer from the provided IndexInput.

static byte[]

readIndexHeader(IndexInput in)

Retrieves the full index header from the provided IndexInput.

static long

retrieveChecksum(IndexInput in)

Returns (but does not validate) the checksum previously written by writeFooter(org.apache.lucene.store.IndexOutput).

static long

retrieveChecksum(IndexInput in, long expectedLength)

Returns (but does not validate) the checksum previously written by writeFooter(org.apache.lucene.store.IndexOutput).

static void

verifyAndCopyIndexHeader(IndexInput in, DataOutput out, byte[] expectedID)

Expert: verifies the incoming IndexInput has an index header and that its segment ID matches the expected one, and then copies that index header into the provided DataOutput.

static void

writeBEInt(DataOutput out, int i)

Writes an int value to a header or footer in big-endian order.

static void

writeBELong(DataOutput out, long l)

Writes a long value to a header or footer in big-endian order.

static void

writeFooter(IndexOutput out)

Writes a codec footer, which records both a checksum algorithm ID and a checksum.

static void

writeHeader(DataOutput out, String codec, int version)

Writes a codec header, which records both a string to identify the file and a version number.

static void

writeIndexHeader(DataOutput out, String codec, int version, byte[] id, String suffix)

Writes a codec header for an index file, which records a string to identify the format of the file, a version number, and data to identify the file instance (ID and auxiliary suffix such as generation).

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- CODEC_MAGIC
  
  public static final int CODEC_MAGIC
  
  Constant to identify the start of a codec header.
  See Also:
  
  Constant Field Values
- FOOTER_MAGIC
  
  public static final int FOOTER_MAGIC
  
  Constant to identify the start of a codec footer.
  See Also:
  
  Constant Field Values
Method Details
- writeHeader
  
  public static void writeHeader(DataOutput out, String codec, int version) throws IOException
  Writes a codec header, which records both a string to identify the file and a version number. This header can be parsed and validated with checkHeader().
  CodecHeader --> Magic,CodecName,Version
  
  Magic --> Uint32. This identifies the start of the header. It is always 1071082519.
  CodecName --> String. This is a string to identify this file.
  Version --> Uint32. Records the version of the file.
  
  Note that the length of a codec header depends only upon the name of the codec, so this length can be computed at any time with headerLength(String).
  Parameters:
  
  out - Output stream
  
  codec - String to identify this file. It should be simple ASCII, less than 128 characters in length.
  
  version - Version number
  
  Throws:
  
  IOException - If there is an I/O error writing to the underlying medium.
  
  IllegalArgumentException - If the codec name is not simple ASCII, or is more than 127 characters in length
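The CodecHeader grammar above can be illustrated without Lucene. The following is a minimal sketch, not the library's implementation: the class CodecHeaderSketch is hypothetical, and it assumes that the header integers are stored big-endian and that the String is stored as a one-byte length followed by the ASCII bytes (which holds only because the name is shorter than 128 characters).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the CodecHeader layout; it does not call Lucene. */
final class CodecHeaderSketch {
    // Magic value quoted in the grammar above (0x3fd76c17).
    static final int CODEC_MAGIC = 1071082519;

    /** Builds Magic, CodecName, Version as raw bytes. */
    static byte[] header(String codec, int version) {
        if (codec.length() >= 128 || !codec.chars().allMatch(c -> c < 128)) {
            throw new IllegalArgumentException("codec must be simple ASCII, under 128 chars: " + codec);
        }
        byte[] name = codec.getBytes(StandardCharsets.US_ASCII);
        // ByteBuffer writes multi-byte values big-endian by default.
        return ByteBuffer.allocate(4 + 1 + name.length + 4)
            .putInt(CODEC_MAGIC)       // Magic
            .put((byte) name.length)   // assumed one-byte length prefix of the String
            .put(name)                 // CodecName bytes
            .putInt(version)           // Version
            .array();
    }
}
```

Under these assumptions the result is always 9 bytes longer than the codec name, which is why the header length depends only on the name.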
- writeIndexHeader
  
  public static void writeIndexHeader(DataOutput out, String codec, int version, byte[] id, String suffix) throws IOException
  Writes a codec header for an index file, which records a string to identify the format of the file, a version number, and data to identify the file instance (ID and auxiliary suffix such as generation).
  This header can be parsed and validated with checkIndexHeader().
  IndexHeader --> CodecHeader,ObjectID,ObjectSuffix
  
  CodecHeader --> writeHeader(org.apache.lucene.store.DataOutput, java.lang.String, int)
  ObjectID --> byte¹⁶
  ObjectSuffix --> SuffixLength,SuffixBytes
  SuffixLength --> byte
  SuffixBytes --> byte^SuffixLength
  
  Note that the length of an index header depends only upon the name of the codec and suffix, so this length can be computed at any time with indexHeaderLength(String,String).
  Parameters:
  
  out - Output stream
  
  codec - String to identify the format of this file. It should be simple ASCII, less than 128 characters in length.
  
  id - Unique identifier for this particular file instance.
  
  suffix - auxiliary suffix information for the file. It should be simple ASCII, less than 256 characters in length.
  
  version - Version number
  
  Throws:
  
  IOException - If there is an I/O error writing to the underlying medium.
  
  IllegalArgumentException - If the codec name is not simple ASCII, or is more than 127 characters in length, or if id is invalid, or if the suffix is not simple ASCII, or more than 255 characters in length.
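The IndexHeader grammar above appends an object ID and a suffix to the codec header. Here is a rough sketch of that layout in plain Java. IndexHeaderSketch and its helper requireAscii are hypothetical names, and the one-byte string length prefix and big-endian integers in the CodecHeader part are assumptions rather than something this page states.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the IndexHeader layout; it does not call Lucene. */
final class IndexHeaderSketch {
    static final int CODEC_MAGIC = 1071082519;
    static final int ID_LENGTH = 16; // ObjectID is 16 bytes in the grammar above

    /** Rejects strings that are not simple ASCII or that reach maxLength characters. */
    static byte[] requireAscii(String s, int maxLength) {
        if (s.length() >= maxLength || !s.chars().allMatch(c -> c < 128)) {
            throw new IllegalArgumentException("must be simple ASCII, under " + maxLength + " chars: " + s);
        }
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** Builds CodecHeader, ObjectID, ObjectSuffix as raw bytes. */
    static byte[] indexHeader(String codec, int version, byte[] id, String suffix) {
        if (id.length != ID_LENGTH) {
            throw new IllegalArgumentException("id must be " + ID_LENGTH + " bytes");
        }
        byte[] name = requireAscii(codec, 128);
        byte[] sfx = requireAscii(suffix, 256);
        return ByteBuffer.allocate(9 + name.length + ID_LENGTH + 1 + sfx.length)
            .putInt(CODEC_MAGIC).put((byte) name.length).put(name).putInt(version) // CodecHeader
            .put(id)                   // ObjectID
            .put((byte) sfx.length)    // SuffixLength, a single unsigned byte
            .put(sfx)                  // SuffixBytes
            .array();
    }
}
```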
- headerLength
  
  public static int headerLength(String codec)
  
  Computes the length of a codec header.
  Parameters:
  
  codec - Codec name.
  
  Returns:
  
  length of the entire codec header.
  
  See Also:
  
  writeHeader(DataOutput, String, int)
- indexHeaderLength
  
  public static int indexHeaderLength(String codec, String suffix)
  
  Computes the length of an index header.
  Parameters:
  
  codec - Codec name.
  
  suffix - Auxiliary suffix, as passed to writeIndexHeader.
  
  Returns:
  
  length of the entire index header.
  
  See Also:
  
  writeIndexHeader(DataOutput, String, int, byte[], String)
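Both length computations follow from the header grammars. The arithmetic below is a sketch under the assumption that each string length prefix occupies a single byte; HeaderLengthSketch is a hypothetical class, not part of Lucene.

```java
/** Hypothetical arithmetic derived from the CodecHeader and IndexHeader grammars. */
final class HeaderLengthSketch {
    /** Magic (4) + name length prefix (1, assumed) + name bytes + Version (4). */
    static int headerLength(String codec) {
        return 4 + 1 + codec.length() + 4;
    }

    /** CodecHeader + ObjectID (16) + SuffixLength (1) + SuffixBytes. */
    static int indexHeaderLength(String codec, String suffix) {
        return headerLength(codec) + 16 + 1 + suffix.length();
    }
}
```

For a four-character codec name and a four-character suffix this gives 13 and 34 bytes respectively.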
- checkHeader
  
  public static int checkHeader(DataInput in, String codec, int minVersion, int maxVersion) throws IOException
  
  Reads and validates a header previously written with writeHeader(DataOutput, String, int).
  When reading a file, supply the expected codec and an expected version range (minVersion to maxVersion).
  Parameters:
  
  in - Input stream, positioned at the point where the header was previously written. Typically this is located at the beginning of the file.
  
  codec - The expected codec name.
  
  minVersion - The minimum supported expected version number.
  
  maxVersion - The maximum supported expected version number.
  
  Returns:
  
  The actual version found, when a valid header is found that matches codec, with an actual version where minVersion <= actual <= maxVersion. Otherwise an exception is thrown.
  
  Throws:
  
  CorruptIndexException - If the first four bytes are not CODEC_MAGIC, or if the actual codec found is not codec.
  
  IndexFormatTooOldException - If the actual version is less than minVersion.
  
  IndexFormatTooNewException - If the actual version is greater than maxVersion.
  
  IOException - If there is an I/O error reading from the underlying medium.
  
  See Also:
  
  writeHeader(DataOutput, String, int)
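The validation order described above (magic, then codec name, then version range) can be sketched over a java.nio.ByteBuffer. This is an illustration only: CheckHeaderSketch is hypothetical, the one-byte name length prefix is an assumption, and unchecked exceptions stand in for CorruptIndexException, IndexFormatTooOldException and IndexFormatTooNewException.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical re-creation of the header checks; it does not call Lucene. */
final class CheckHeaderSketch {
    static final int CODEC_MAGIC = 1071082519;

    /** Returns the version found in the header, or throws if any check fails. */
    static int checkHeader(ByteBuffer in, String codec, int minVersion, int maxVersion) {
        int magic = in.getInt(); // big-endian by default
        if (magic != CODEC_MAGIC) {
            throw new IllegalStateException("bad magic: " + magic); // stands in for CorruptIndexException
        }
        byte[] name = new byte[in.get() & 0xff]; // assumed one-byte length prefix
        in.get(name);
        String actual = new String(name, StandardCharsets.US_ASCII);
        if (!actual.equals(codec)) {
            throw new IllegalStateException("codec mismatch: " + actual); // stands in for CorruptIndexException
        }
        int version = in.getInt();
        if (version < minVersion) {
            throw new IllegalStateException("version too old: " + version); // IndexFormatTooOldException
        }
        if (version > maxVersion) {
            throw new IllegalStateException("version too new: " + version); // IndexFormatTooNewException
        }
        return version;
    }
}
```

After a successful call the buffer is positioned just past the header, so a caller can continue reading the file body from there.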
- checkHeaderNoMagic
  
  public static int checkHeaderNoMagic(DataInput in, String codec, int minVersion, int maxVersion) throws IOException
  
  Like checkHeader(DataInput,String,int,int) except this version assumes the first int has already been read and validated from the input.
  
  Throws:
  
  IOException
- checkIndexHeader
  
  public static int checkIndexHeader(DataInput in, String codec, int minVersion, int maxVersion, byte[] expectedID, String expectedSuffix) throws IOException
  
  Reads and validates a header previously written with writeIndexHeader(DataOutput, String, int, byte[], String).
  When reading a file, supply the expected codec, expected version range (minVersion to maxVersion), and object ID and suffix.
  Parameters:
  
  in - Input stream, positioned at the point where the header was previously written. Typically this is located at the beginning of the file.
  
  codec - The expected codec name.
  
  minVersion - The minimum supported expected version number.
  
  maxVersion - The maximum supported expected version number.
  
  expectedID - The expected object identifier for this file.
  
  expectedSuffix - The expected auxiliary suffix for this file.
  
  Returns:
  
  The actual version found, when a valid header is found that matches codec, with an actual version where minVersion <= actual <= maxVersion, and matching expectedID and expectedSuffix. Otherwise an exception is thrown.
  
  Throws:
  
  CorruptIndexException - If the first four bytes are not CODEC_MAGIC, or if the actual codec found is not codec, or if the expectedID or expectedSuffix do not match.
  
  IndexFormatTooOldException - If the actual version is less than minVersion.
  
  IndexFormatTooNewException - If the actual version is greater than maxVersion.
  
  IOException - If there is an I/O error reading from the underlying medium.
  
  See Also:
  
  writeIndexHeader(DataOutput, String, int, byte[], String)
- verifyAndCopyIndexHeader
  
  public static void verifyAndCopyIndexHeader(IndexInput in, DataOutput out, byte[] expectedID) throws IOException
  
  Expert: verifies the incoming IndexInput has an index header and that its segment ID matches the expected one, and then copies that index header into the provided DataOutput. This is useful when building compound files.
  
  Parameters:
  
  in - Input stream, positioned at the point where the index header was previously written. Typically this is located at the beginning of the file.
  
  out - Output stream, where the header will be copied to.
  
  expectedID - Expected segment ID
  
  Throws:
  
  CorruptIndexException - If the first four bytes are not CODEC_MAGIC, or if the expectedID does not match.
  
  IOException - If there is an I/O error reading from the underlying medium.
  
  NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
- readIndexHeader
  
  public static byte[] readIndexHeader(IndexInput in) throws IOException
  
  Retrieves the full index header from the provided IndexInput. This throws CorruptIndexException if this file does not appear to be an index file.
  
  Throws:
  
  IOException
- readFooter
  
  public static byte[] readFooter(IndexInput in) throws IOException
  
  Retrieves the full footer from the provided IndexInput. This throws CorruptIndexException if this file does not have a valid footer.
  
  Throws:
  
  IOException
- checkIndexHeaderID
  
  public static byte[] checkIndexHeaderID(DataInput in, byte[] expectedID) throws IOException
  
  Expert: just reads and verifies the object ID of an index header.
  
  Throws:
  
  IOException
- checkIndexHeaderSuffix
  
  public static String checkIndexHeaderSuffix(DataInput in, String expectedSuffix) throws IOException
  
  Expert: just reads and verifies the suffix of an index header.
  
  Throws:
  
  IOException
- writeFooter
  
  public static void writeFooter(IndexOutput out) throws IOException
  Writes a codec footer, which records both a checksum algorithm ID and a checksum. This footer can be parsed and validated with checkFooter().
  CodecFooter --> Magic,AlgorithmID,Checksum
  
  Magic --> Uint32. This identifies the start of the footer. It is always -1071082520.
  AlgorithmID --> Uint32. This indicates the checksum algorithm used. Currently this is always 0, for zlib-crc32.
  Checksum --> Uint64. The actual checksum value for all previous bytes in the stream, including the bytes from Magic and AlgorithmID.
  Parameters:
  
  out - Output stream
  
  Throws:
  
  IOException - If there is an I/O error writing to the underlying medium.
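The CodecFooter grammar above can be reproduced with java.util.zip.CRC32. The sketch below is not Lucene's code: FooterWriteSketch is a hypothetical class, and big-endian field order is an assumption. It appends Magic and AlgorithmID first, then checksums everything written so far, as the Checksum rule describes.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical stand-in for the CodecFooter layout; it does not call Lucene. */
final class FooterWriteSketch {
    // Magic value quoted in the grammar above; it is the bitwise complement of 1071082519.
    static final int FOOTER_MAGIC = -1071082520;

    /** Returns body followed by Magic, AlgorithmID and Checksum (16 footer bytes). */
    static byte[] withFooter(byte[] body) {
        ByteBuffer out = ByteBuffer.allocate(body.length + 16); // big-endian by default
        out.put(body)
           .putInt(FOOTER_MAGIC)  // Magic
           .putInt(0);            // AlgorithmID: 0 means zlib-crc32
        CRC32 crc = new CRC32();
        // The checksum covers the body plus the Magic and AlgorithmID just written.
        crc.update(out.array(), 0, out.position());
        out.putLong(crc.getValue()); // Checksum stored in 64 bits
        return out.array();
    }
}
```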
- footerLength
  
  public static int footerLength()
  
  Computes the length of a codec footer.
  Returns:
  
  length of the entire codec footer.
  
  See Also:
  
  writeFooter(IndexOutput)
- checkFooter
  
  public static long checkFooter(ChecksumIndexInput in) throws IOException
  
  Validates the codec footer previously written by writeFooter(org.apache.lucene.store.IndexOutput).
  
  Returns:
  
  actual checksum value
  
  Throws:
  
  IOException - if the footer is invalid, if the checksum does not match, or if in is not properly positioned before the footer at the end of the stream.
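To make the three failure cases concrete, here is a minimal sketch of footer validation over an in-memory byte array. FooterCheckSketch is hypothetical, it assumes the 16-byte big-endian footer layout described under writeFooter, and it throws an unchecked exception where the real method reports an IOException.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical re-creation of footer validation; it does not call Lucene. */
final class FooterCheckSketch {
    static final int FOOTER_MAGIC = -1071082520;
    static final int FOOTER_LENGTH = 16;

    /** Returns the checksum if the footer is well formed and matches the file contents. */
    static long checkFooter(byte[] file) {
        if (file.length < FOOTER_LENGTH) {
            throw new IllegalStateException("file too short to hold a footer");
        }
        ByteBuffer footer = ByteBuffer.wrap(file, file.length - FOOTER_LENGTH, FOOTER_LENGTH);
        if (footer.getInt() != FOOTER_MAGIC) {
            throw new IllegalStateException("footer magic mismatch");
        }
        if (footer.getInt() != 0) {
            throw new IllegalStateException("unknown checksum algorithm");
        }
        long stored = footer.getLong();
        CRC32 crc = new CRC32();
        // Everything before the 8 checksum bytes is covered, including Magic and AlgorithmID.
        crc.update(file, 0, file.length - 8);
        if (crc.getValue() != stored) {
            throw new IllegalStateException("checksum failed");
        }
        return stored;
    }
}
```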
- checkFooter
  
  public static void checkFooter(ChecksumIndexInput in, Throwable priorException) throws IOException
  Validates the codec footer previously written by writeFooter(org.apache.lucene.store.IndexOutput), optionally passing an unexpected exception that has already occurred.
  When a priorException is provided, this method adds a suppressed exception to it indicating whether the checksum for the stream passes, fails, or cannot be computed, and then rethrows it. Otherwise it behaves the same as checkFooter(ChecksumIndexInput).
  Example usage:
  
      try (ChecksumIndexInput input = ...) {
        Throwable priorE = null;
        try {
          // ... read a bunch of stuff ...
        } catch (Throwable exception) {
          priorE = exception;
        } finally {
          CodecUtil.checkFooter(input, priorE);
        }
      }
  
  Throws:
  
  IOException
- retrieveChecksum
  
  public static long retrieveChecksum(IndexInput in) throws IOException
  
  Returns (but does not validate) the checksum previously written by writeFooter(org.apache.lucene.store.IndexOutput).
  
  Returns:
  
  actual checksum value
  
  Throws:
  
  IOException - if the footer is invalid
- retrieveChecksum
  
  public static long retrieveChecksum(IndexInput in, long expectedLength) throws IOException
  
  Returns (but does not validate) the checksum previously written by writeFooter(org.apache.lucene.store.IndexOutput).
  
  Returns:
  
  actual checksum value
  
  Throws:
  
  IOException - if the footer is invalid
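The difference between retrieving and validating a checksum can be sketched as follows. RetrieveChecksumSketch is a hypothetical class working on a byte array; it assumes the 16-byte big-endian footer layout described under writeFooter, and it assumes that expectedLength means the expected total file length, which this page does not spell out.

```java
import java.nio.ByteBuffer;

/** Hypothetical re-creation of checksum retrieval; it does not call Lucene. */
final class RetrieveChecksumSketch {
    static final int FOOTER_MAGIC = -1071082520;
    static final int FOOTER_LENGTH = 16;

    /** Checks the footer structure and returns the stored checksum without recomputing it. */
    static long retrieveChecksum(byte[] file) {
        if (file.length < FOOTER_LENGTH) {
            throw new IllegalStateException("file too short to hold a footer");
        }
        ByteBuffer footer = ByteBuffer.wrap(file, file.length - FOOTER_LENGTH, FOOTER_LENGTH);
        if (footer.getInt() != FOOTER_MAGIC) {
            throw new IllegalStateException("footer magic mismatch");
        }
        if (footer.getInt() != 0) {
            throw new IllegalStateException("unknown checksum algorithm");
        }
        return footer.getLong(); // returned as stored; the file body is never read
    }

    /** Same as above, but first compares the file length with the expected one (assumed meaning). */
    static long retrieveChecksum(byte[] file, long expectedLength) {
        if (file.length != expectedLength) {
            throw new IllegalStateException("unexpected length " + file.length + ", expected " + expectedLength);
        }
        return retrieveChecksum(file);
    }
}
```

Because the body is never read, this costs a fixed amount of work regardless of file size, but it cannot detect corruption in the body.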
- checksumEntireFile
  
  public static long checksumEntireFile(IndexInput input) throws IOException
  
  Clones the provided input, reads all bytes from the file, and calls checkFooter(org.apache.lucene.store.ChecksumIndexInput).
  Note that this method may be slow, as it must process the entire file. If you just need to extract the checksum value, call retrieveChecksum(org.apache.lucene.store.IndexInput).
  
  Throws:
  
  IOException
- writeBEInt
  
  public static void writeBEInt(DataOutput out, int i) throws IOException
  
  Writes an int value to a header or footer in big-endian order.
  
  Throws:
  
  IOException
- writeBELong
  
  public static void writeBELong(DataOutput out, long l) throws IOException
  
  Writes a long value to a header or footer in big-endian order.
  
  Throws:
  
  IOException
- readBEInt
  
  public static int readBEInt(DataInput in) throws IOException
  
  Reads an int value from a header or footer in big-endian order.
  
  Throws:
  
  IOException
- readBELong
  
  public static long readBELong(DataInput in) throws IOException
  
  Reads a long value from a header or footer in big-endian order.
  
  Throws:
  
  IOException
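Big-endian order means the most significant byte comes first. The four helpers above can be pictured with the plain-Java sketch below. BigEndianSketch is a hypothetical class operating on byte arrays rather than on DataInput and DataOutput.

```java
/** Hypothetical byte-array versions of big-endian int and long encoding. */
final class BigEndianSketch {
    /** Encodes an int with the most significant byte first. */
    static byte[] beInt(int i) {
        return new byte[] {(byte) (i >>> 24), (byte) (i >>> 16), (byte) (i >>> 8), (byte) i};
    }

    /** Decodes four bytes starting at off, most significant byte first. */
    static int readBEInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
            | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
    }

    /** Encodes a long with the most significant byte first. */
    static byte[] beLong(long l) {
        byte[] b = new byte[8];
        for (int k = 0; k < 8; k++) {
            b[k] = (byte) (l >>> (56 - 8 * k));
        }
        return b;
    }

    /** Decodes eight bytes starting at off, most significant byte first. */
    static long readBELong(byte[] b, int off) {
        long v = 0;
        for (int k = 0; k < 8; k++) {
            v = (v << 8) | (b[off + k] & 0xff);
        }
        return v;
    }
}
```

With this encoding the header magic 1071082519 is laid out as the bytes 3f d7 6c 17.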
