Class BlockTreeTermsReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Iterable<String>, Accountable

    public final class BlockTreeTermsReader
    extends FieldsProducer
    A block-based terms index and dictionary that assigns terms to variable-length blocks according to how they share prefixes. The terms index is a prefix trie whose leaves are term blocks. The advantage of this approach is that seekExact is often able to determine that a term cannot exist without doing any I/O, and intersection with Automata is very fast. Note that this terms dictionary has its own fixed terms index (i.e., it does not support a pluggable terms index implementation).
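The early-rejection behavior described above can be illustrated with a toy model. The sketch below is not Lucene's implementation: the class `PrefixBlockIndexSketch`, its methods, and the `blockReads` counter are hypothetical names introduced only for illustration, and in-memory sets stand in for on-disk term blocks. It assumes each block is registered under the prefix shared by all of its terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: an in-memory prefix trie whose leaves point at term blocks. */
final class PrefixBlockIndexSketch {
    /** Trie node; a non-null blockId means a term block is registered under this prefix. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        Integer blockId;
    }

    private final Node root = new Node();
    private final List<Set<String>> blocks = new ArrayList<>(); // stands in for on-disk blocks
    int blockReads = 0; // counts simulated block reads (the "I/O" in this model)

    /** Registers a block holding terms that all start with the given prefix. */
    void addBlock(String prefix, String... terms) {
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.blockId = blocks.size();
        blocks.add(new TreeSet<>(Arrays.asList(terms)));
    }

    /** Returns true if the term exists; reads a block only when the trie cannot rule the term out. */
    boolean seekExact(String term) {
        Node node = root;
        Integer candidate = root.blockId;
        for (char c : term.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                break; // no longer prefix of the term is present in the index
            }
            if (node.blockId != null) {
                candidate = node.blockId; // deepest block whose prefix matches so far
            }
        }
        if (candidate == null) {
            return false; // rejected from the index alone, without touching any block
        }
        blockReads++;
        return blocks.get(candidate).contains(term);
    }

    public static void main(String[] args) {
        PrefixBlockIndexSketch index = new PrefixBlockIndexSketch();
        index.addBlock("ab", "abc", "abd");
        index.addBlock("ca", "car", "cat");
        System.out.println(index.seekExact("abc") + " after " + index.blockReads + " block read(s)");
        System.out.println(index.seekExact("xyz") + " after " + index.blockReads + " block read(s)");
    }
}
```

In this model a lookup for a term with no registered prefix returns without incrementing `blockReads`, which mirrors the "no I/O" case; a term whose prefix is registered still requires one block read to confirm or deny its presence.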

    NOTE: this terms dictionary supports min/maxItemsPerBlock during indexing to control how much memory the terms index uses.

    The data structure used by this implementation is very similar to a burst trie, but with added logic to break up too-large blocks of all terms sharing a given prefix into smaller ones.
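As a rough illustration of breaking up a too-large block, the sketch below splits a sorted run of terms that share a prefix into smaller chunks, cutting only where the character following the prefix changes. This is a simplified assumption about how such splitting could work, not the actual BlockTreeTermsWriter algorithm; `BlockSplitSketch`, `split`, and the single `maxItemsPerBlock` threshold are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: splits terms sharing a prefix into smaller blocks. */
final class BlockSplitSketch {
    /**
     * Splits sortedTerms (all sharing a prefix of length prefixLen) into blocks.
     * A new block starts only when the character right after the prefix changes
     * and the current block already holds at least maxItemsPerBlock terms, so
     * terms with the same next character always stay together.
     */
    static List<List<String>> split(List<String> sortedTerms, int prefixLen, int maxItemsPerBlock) {
        List<List<String>> blocks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int lastLead = -2; // sentinel: no term seen yet
        for (String term : sortedTerms) {
            // -1 marks the term that is exactly the prefix (nothing follows it)
            int lead = term.length() > prefixLen ? term.charAt(prefixLen) : -1;
            if (lead != lastLead && current.size() >= maxItemsPerBlock) {
                blocks.add(current);
                current = new ArrayList<>();
            }
            current.add(term);
            lastLead = lead;
        }
        if (!current.isEmpty()) {
            blocks.add(current);
        }
        return blocks;
    }
}
```

For example, calling `split` on the sorted terms `foo, fooa, fooab, foob, fooc, food` with `prefixLen = 3` and `maxItemsPerBlock = 2` keeps `fooa` and `fooab` in the same block because they share the character after the prefix, so a block may exceed the threshold in this simplified scheme.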

    Use CheckIndex with the -verbose option to see summary statistics on the blocks in the dictionary. See BlockTreeTermsWriter.

    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail


        public static final int VERSION_START
        Initial terms format.
        See Also:
        Constant Field Values

        public static final int VERSION_META_LONGS_REMOVED
        The long[] + byte[] metadata has been replaced with a single byte[].
        See Also:
        Constant Field Values

        public static final int VERSION_COMPRESSED_SUFFIXES
        Suffixes are compressed to save space.
        See Also:
        Constant Field Values

        public static final int VERSION_META_FILE
        Metadata is written to its own file.
        See Also:
        Constant Field Values

        public static final int VERSION_CURRENT
        Current terms format.
        See Also:
        Constant Field Values
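The version constants above suggest a reader that accepts a range of format versions and branches on each format change. The following sketch shows that general pattern under stated assumptions: the integer values are placeholders rather than the real constants (see Constant Field Values for those), the equality of `VERSION_CURRENT` and `VERSION_META_FILE` is assumed, and `VersionGateSketch` and `describe` are hypothetical names.

```java
/** Toy model only: version-gated interpretation of a terms format. */
final class VersionGateSketch {
    // Placeholder values for illustration; NOT the real Lucene constants.
    static final int VERSION_START = 0;
    static final int VERSION_META_LONGS_REMOVED = 1;
    static final int VERSION_COMPRESSED_SUFFIXES = 2;
    static final int VERSION_META_FILE = 3;
    static final int VERSION_CURRENT = VERSION_META_FILE; // assumption for this sketch

    /** Describes which format features a given version has, rejecting versions outside the range. */
    static String describe(int version) {
        if (version < VERSION_START || version > VERSION_CURRENT) {
            throw new IllegalArgumentException("unsupported terms format version: " + version);
        }
        StringBuilder sb = new StringBuilder();
        sb.append(version >= VERSION_META_LONGS_REMOVED
            ? "byte[] metadata" : "long[] + byte[] metadata");
        sb.append(version >= VERSION_COMPRESSED_SUFFIXES
            ? ", compressed suffixes" : ", suffixes not compressed");
        sb.append(version >= VERSION_META_FILE
            ? ", separate metadata file" : ", no separate metadata file");
        return sb.toString();
    }
}
```

Because each constant marks the first version that includes a change, a `>=` comparison against it is enough to decide which layout to expect.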