Class IDVersionPostingsFormat

  • All Implemented Interfaces:
    NamedSPILoader.NamedSPI

    public class IDVersionPostingsFormat
    extends PostingsFormat
    A PostingsFormat optimized for primary-key (ID) fields that also record a version (long) for each ID. The version is delivered as a payload, created by longToBytes(long, org.apache.lucene.util.BytesRef) during indexing. At search time, the TermsEnum implementation IDVersionSegmentTermsEnum enables a fast lookup (using only the terms index when possible) of whether a given ID was previously indexed with version > N (see IDVersionSegmentTermsEnum.seekExact(BytesRef,long)).
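
    To illustrate what "a version (long) delivered as a payload" can look like on the wire, here is a minimal sketch that packs a long into a fixed 8-byte, big-endian array and reads it back. The class VersionPayloadSketch and its methods encode and decode are hypothetical names used only for illustration; this is not the library's longToBytes implementation, and both the byte layout and the non-negative check are assumptions of the sketch.

```java
import java.util.Arrays;

// Hypothetical illustration only: not the library's longToBytes implementation.
final class VersionPayloadSketch {

    // Packs a version into 8 bytes, most significant byte first.
    // Assumption of this sketch: versions are non-negative.
    static byte[] encode(long version) {
        if (version < 0) {
            throw new IllegalArgumentException("version must be non-negative: " + version);
        }
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (version >>> (56 - 8 * i));
        }
        return out;
    }

    // Reverses encode(): rebuilds the long from 8 big-endian bytes.
    static long decode(byte[] payload) {
        if (payload.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + payload.length);
        }
        long version = 0;
        for (byte b : payload) {
            version = (version << 8) | (b & 0xFFL);
        }
        return version;
    }

    public static void main(String[] args) {
        byte[] payload = encode(42L);
        System.out.println(Arrays.toString(payload));
        System.out.println(decode(payload));
    }
}
```

    A fixed-width encoding keeps every payload the same size, so a reader can decode it without any length prefix.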

    This is most effective if the app assigns a monotonically increasing global version to each indexed doc. Then, during indexing, use IDVersionSegmentTermsEnum.seekExact(BytesRef,long) (along with LiveFieldValues) to decide whether the document you are about to index was already indexed with a higher version, and skip it if so.
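
    The skip-if-newer decision described above can be sketched without any Lucene classes. In the sketch below, an in-memory map stands in for the combination of IDVersionSegmentTermsEnum.seekExact(BytesRef,long) and LiveFieldValues; the class VersionGateSketch and its methods are hypothetical names, and a real application would consult the index rather than a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: a HashMap stands in for the index lookup.
final class VersionGateSketch {

    // Highest version accepted so far for each ID.
    private final Map<String, Long> latest = new HashMap<>();

    // Returns true if the document should be indexed, i.e. the ID has not
    // already been indexed with a higher version.
    boolean shouldIndex(String id, long version) {
        Long existing = latest.get(id);
        return existing == null || existing <= version;
    }

    // Records the document if it passes the check; returns false when it is
    // skipped because a higher version of the same ID was already accepted.
    boolean offer(String id, long version) {
        if (!shouldIndex(id, version)) {
            return false;
        }
        latest.put(id, version);
        return true;
    }

    // Returns the highest accepted version for the ID, or null if unseen.
    Long latestVersion(String id) {
        return latest.get(id);
    }

    public static void main(String[] args) {
        VersionGateSketch gate = new VersionGateSketch();
        System.out.println(gate.offer("doc-1", 5L)); // first time this ID is seen
        System.out.println(gate.offer("doc-1", 3L)); // stale update
        System.out.println(gate.offer("doc-1", 7L)); // newer update
    }
}
```

    Because the versions are assumed to increase monotonically, a stale update can be rejected by comparing two numbers, with no need to inspect the stored document.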

    The field is effectively indexed as DOCS_ONLY and the docID is pulsed (inlined) into the terms dictionary, but the user must feed in the version as a payload on the first token.

    NOTE: term vectors cannot be indexed with this field (not that you should really ever want to do this).

    WARNING: This API is experimental and might change in incompatible ways in the next release.