public final class BlockTreeTermsReader extends FieldsProducer
NOTE: this terms dictionary supports min/maxItemsPerBlock during indexing to control how much memory the terms index uses.
If auto-prefix terms were indexed (see BlockTreeTermsWriter), then the Terms.intersect(org.apache.lucene.util.automaton.CompiledAutomaton, org.apache.lucene.util.BytesRef) implementation here will make use of these terms only if the automaton has a binary sink state, i.e. an accept state which has a transition to itself accepting all byte values. For example, both PrefixQuery and TermRangeQuery pass such automata to Terms.intersect(org.apache.lucene.util.automaton.CompiledAutomaton, org.apache.lucene.util.BytesRef).
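To make the binary sink condition concrete, the following stand-alone sketch models a byte-level automaton as a plain transition table and checks for an accept state that loops to itself on all 256 byte values. It does not use Lucene's automaton classes: TinyAutomaton, its table layout, and all of its methods are hypothetical names chosen for illustration only.

```java
import java.util.Arrays;

/** Hypothetical byte-level DFA for illustration; this is NOT Lucene's Automaton API. */
final class TinyAutomaton {
    private final int[][] next;      // next[state][byteValue] = target state, or -1 if none
    private final boolean[] accept;  // accept[state] is true for accept states

    TinyAutomaton(int numStates) {
        next = new int[numStates][256];
        for (int[] row : next) {
            Arrays.fill(row, -1);
        }
        accept = new boolean[numStates];
    }

    void setAccept(int state) {
        accept[state] = true;
    }

    /** Adds a transition from -> to for every byte value in [min, max]. */
    void addTransition(int from, int to, int min, int max) {
        for (int b = min; b <= max; b++) {
            next[from][b] = to;
        }
    }

    /** True if state is an accept state whose transitions on all byte values lead back to itself. */
    boolean isBinarySink(int state) {
        if (!accept[state]) {
            return false;
        }
        for (int b = 0; b < 256; b++) {
            if (next[state][b] != state) {
                return false;
            }
        }
        return true;
    }

    /** Returns the lowest-numbered binary sink state, or -1 if there is none. */
    int findBinarySink() {
        for (int s = 0; s < accept.length; s++) {
            if (isBinarySink(s)) {
                return s;
            }
        }
        return -1;
    }

    /** Builds an automaton accepting every byte string that starts with the given prefix. */
    static TinyAutomaton prefix(byte[] prefix) {
        TinyAutomaton a = new TinyAutomaton(prefix.length + 1);
        for (int i = 0; i < prefix.length; i++) {
            int b = prefix[i] & 0xFF;  // treat the byte as unsigned
            a.addTransition(i, i + 1, b, b);
        }
        int last = prefix.length;
        a.setAccept(last);
        a.addTransition(last, last, 0, 255);  // once the prefix matched, accept any suffix
        return a;
    }
}
```

In this sketch a prefix automaton always ends in a binary sink, while an automaton that accepts exactly one term does not, which mirrors why prefix-style queries are the ones that can benefit.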
The data structure used by this implementation is very similar to a burst trie (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499), but with added logic to break up too-large blocks of all terms sharing a given prefix into smaller ones.
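The idea of grouping terms under a shared prefix and breaking up groups that grow too large can be sketched as follows. This is a simplified illustration, not the algorithm BlockTreeTermsWriter uses: the PrefixBlocks class, its Block type, and the rule of splitting on the next character are assumptions made for the example, and the input is assumed to be a sorted list of distinct terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of prefix-based blocks; not Lucene's on-disk format or writer logic. */
final class PrefixBlocks {

    /** One block: the prefix shared by its terms, and the terms themselves. */
    static final class Block {
        final String prefix;
        final List<String> terms;

        Block(String prefix, List<String> terms) {
            this.prefix = prefix;
            this.terms = terms;
        }
    }

    /** Partitions sorted, distinct terms into blocks of at most maxItemsPerBlock terms. */
    static List<Block> build(List<String> sortedTerms, int maxItemsPerBlock) {
        List<Block> out = new ArrayList<>();
        split("", sortedTerms, maxItemsPerBlock, out);
        return out;
    }

    private static void split(String prefix, List<String> terms, int max, List<Block> out) {
        if (terms.size() <= max) {
            out.add(new Block(prefix, terms));  // small enough: keep as a single block
            return;
        }
        // Too large: split the group by the character that follows the shared prefix.
        List<String> exact = new ArrayList<>();  // terms equal to the prefix itself
        TreeMap<Character, List<String>> byNext = new TreeMap<>();
        for (String t : terms) {
            if (t.length() == prefix.length()) {
                exact.add(t);
            } else {
                byNext.computeIfAbsent(t.charAt(prefix.length()), k -> new ArrayList<>()).add(t);
            }
        }
        if (!exact.isEmpty()) {
            out.add(new Block(prefix, exact));
        }
        for (Map.Entry<Character, List<String>> e : byNext.entrySet()) {
            split(prefix + e.getKey(), e.getValue(), max, out);  // recurse with a longer prefix
        }
    }
}
```

A smaller maxItemsPerBlock in this sketch yields more, deeper blocks, which gives an intuition for why the block size limits affect how large the terms index becomes.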
Use CheckIndex with the -verbose option to see summary statistics on the blocks in the dictionary.
See BlockTreeTermsWriter.
Modifier and Type | Field and Description |
---|---|
static int | VERSION_AUTO_PREFIX_TERMS: Auto-prefix terms. |
static int | VERSION_AUTO_PREFIX_TERMS_COND: Conditional auto-prefix terms: we record at write time whether this field did write any auto-prefix terms. |
static int | VERSION_CURRENT: Current terms format. |
static int | VERSION_START: Initial terms format. |
Fields inherited from class Fields: EMPTY_ARRAY
Constructor and Description |
---|
BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state): Sole constructor. |
Modifier and Type | Method and Description |
---|---|
void | checkIntegrity(): Checks consistency of this reader. |
void | close() |
Collection<Accountable> | getChildResources(): Returns nested resources of this class. |
Iterator<String> | iterator(): Returns an iterator that will step through all field names. |
long | ramBytesUsed(): Return the memory usage of this object in bytes. |
int | size(): Returns the number of fields or -1 if the number of distinct field names is unknown. |
Terms | terms(String field): Get the Terms for this field. |
String | toString() |
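Methods inherited from class FieldsProducer: getMergeInstance
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Iterable: forEach, spliterator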
public static final int VERSION_START
public static final int VERSION_AUTO_PREFIX_TERMS
public static final int VERSION_AUTO_PREFIX_TERMS_COND
public static final int VERSION_CURRENT
public BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state) throws IOException
Sole constructor.
Throws: IOException
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Specified by: close in class FieldsProducer
Throws: IOException
public Iterator<String> iterator()
Description copied from class: Fields
Returns an iterator that will step through all field names.
public Terms terms(String field) throws IOException
Description copied from class: Fields
Get the Terms for this field. This will return null if the field does not exist.
Specified by: terms in class Fields
Throws: IOException
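public int size()
Description copied from class: Fields
Returns the number of fields or -1 if the number of distinct field names is unknown. If >= 0, Fields.iterator() will return as many field names.
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes.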
public Collection<Accountable> getChildResources()
Description copied from interface: Accountable
Returns nested resources of this class.
See Also: Accountables
public void checkIntegrity() throws IOException
Description copied from class: FieldsProducer
Checks consistency of this reader. Note that this may be costly in terms of I/O, e.g. may involve computing a checksum value against large data files.
Specified by: checkIntegrity in class FieldsProducer
Throws: IOException
Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.