org.apache.solr.request
Class UnInvertedField
java.lang.Object
org.apache.lucene.index.DocTermOrds
org.apache.solr.request.UnInvertedField
public class UnInvertedField
extends DocTermOrds
Final form of the un-inverted field:
Each document points to a list of term numbers that are contained in that document.
Term numbers are in sorted order, and are encoded as variable-length deltas from the
previous term number. Real term numbers start at 2 since 0 and 1 are reserved. A
term number of 0 signals the end of the termNumber list.
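The delta encoding described above can be sketched as follows. This is a minimal illustration, not the class's actual code: the class name `TermNumberDeltaSketch`, the choice to shift every stored delta by 2 (so the values 0 and 1 never appear as a delta), and the byte order of the variable-length integers are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class TermNumberDeltaSketch {
    // Assumption: each stored delta is shifted by 2 so that 0 and 1 stay reserved.
    static final int TNUM_OFFSET = 2;

    /** Encodes ascending, distinct term numbers as shifted deltas, then a 0 terminator. */
    static byte[] encode(int[] sortedTermNumbers) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int termNumber : sortedTermNumbers) {
            writeVInt(out, termNumber - previous + TNUM_OFFSET);
            previous = termNumber;
        }
        out.write(0); // end-of-list marker
        return out.toByteArray();
    }

    /** Decodes a list produced by encode(), stopping at the 0 terminator. */
    static List<Integer> decode(byte[] bytes) {
        List<Integer> termNumbers = new ArrayList<>();
        int pos = 0;
        int termNumber = 0;
        while (pos < bytes.length) {
            int delta = 0;
            int b;
            do {
                b = bytes[pos++] & 0xff;
                delta = (delta << 7) | (b & 0x7f);
            } while ((b & 0x80) != 0); // high bit set: more bytes follow
            if (delta == 0) {
                break; // terminator reached
            }
            termNumber += delta - TNUM_OFFSET;
            termNumbers.add(termNumber);
        }
        return termNumbers;
    }

    /** Writes 7-bit groups, most significant first; the high bit flags a continuation. */
    private static void writeVInt(ByteArrayOutputStream out, int value) {
        boolean started = false;
        for (int shift = 28; shift > 0; shift -= 7) {
            int group = (value >>> shift) & 0x7f;
            if (group != 0 || started) {
                out.write(0x80 | group);
                started = true;
            }
        }
        out.write(value & 0x7f);
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new int[] {0, 5, 300});
        System.out.println(encoded.length + " bytes -> " + decode(encoded));
    }
}
```

Because small gaps between consecutive term numbers fit in a single byte, documents whose terms are close together in sort order take very little space.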
There is a single int[maxDoc()] which either contains a pointer into a byte[] for
the termNumber lists, or directly contains the termNumber list if it fits in the 4
bytes of an integer. If the first byte in the integer is 1, the next 3 bytes
are a pointer into a byte[] where the termNumber list starts.
There are actually 256 byte arrays, to compensate for the fact that the pointers
into the byte arrays are only 3 bytes long. The correct byte array for a document
is a function of its id.
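A reader of the per-document index might interpret one entry roughly as in the sketch below. Several details are assumptions made for illustration only: that the "first byte" is the low-order byte of the int, that the 3-byte pointer occupies the remaining high bits, and that the byte array is chosen from bits 16 to 23 of the document id (the text says only that it is a function of the id). The class name `DocIndexEntrySketch` is hypothetical.

```java
final class DocIndexEntrySketch {
    /** True when the entry is a pointer into a byte[] rather than an inline list. */
    static boolean isPointer(int code) {
        return (code & 0xff) == 1; // assumption: "first byte" means the low-order byte
    }

    /** Extracts the 3-byte offset from a pointer entry. */
    static int pointerOffset(int code) {
        return code >>> 8;
    }

    /** Builds a pointer entry for an offset that fits in 3 bytes. */
    static int makePointer(int offset) {
        if (offset < 0 || offset >= (1 << 24)) {
            throw new IllegalArgumentException("offset does not fit in 3 bytes: " + offset);
        }
        return (offset << 8) | 1;
    }

    /** Assumption: one of 256 byte arrays is selected from bits 16..23 of the doc id. */
    static int whichArray(int docId) {
        return (docId >>> 16) & 0xff;
    }

    /** Unpacks an inline entry into its 4 bytes, low-order byte first. */
    static byte[] inlineBytes(int code) {
        return new byte[] {
            (byte) code, (byte) (code >>> 8), (byte) (code >>> 16), (byte) (code >>> 24)
        };
    }

    public static void main(String[] args) {
        int code = makePointer(0x123456);
        System.out.println(isPointer(code) + " " + Integer.toHexString(pointerOffset(code)));
    }
}
```

A 3-byte offset addresses at most 16 MB, which is why a single byte array would not be enough and the pointers are spread across 256 of them.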
To save space and speed up faceting, any term that matches enough documents is
not un-inverted: it is skipped while building the un-inverted field structure,
and its count is computed with a set intersection during faceting.
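The set-intersection path for such frequent terms can be illustrated with plain `java.util.BitSet` objects standing in for Solr's `DocSet`. This is a sketch under assumptions: the class name `BigTermSketch` is hypothetical, and the strict comparison against `maxTermDocFreq` is a guess at how "enough documents" is decided.

```java
import java.util.BitSet;

final class BigTermSketch {
    /** Assumption: a term is treated as "big" when its docFreq exceeds the threshold. */
    static boolean isBigTerm(int docFreq, int maxTermDocFreq) {
        return docFreq > maxTermDocFreq;
    }

    /** Facet count for a big term: documents that are in both the term's set and the base set. */
    static int intersectionCount(BitSet termDocs, BitSet baseDocs) {
        BitSet both = (BitSet) termDocs.clone(); // leave the inputs untouched
        both.and(baseDocs);
        return both.cardinality();
    }

    public static void main(String[] args) {
        BitSet termDocs = new BitSet();
        termDocs.set(1);
        termDocs.set(2);
        termDocs.set(3);
        BitSet baseDocs = new BitSet();
        baseDocs.set(2);
        baseDocs.set(3);
        baseDocs.set(4);
        System.out.println(intersectionCount(termDocs, baseDocs));
    }
}
```

For a term that occurs in a large share of documents, one intersection is cheaper than storing that term number in nearly every document's list.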
To further save memory, the terms (the actual string values) are not all stored in
memory, but a TermIndex is used to convert term numbers to term values only
for the terms needed after faceting has completed. Only every 128th term value
is stored, along with its corresponding term number, and this is used as an
index to find the closest term and iterate until the desired number is hit (very
much like Lucene's own internal term index).
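The sparse lookup can be sketched as below. The example is an illustration only: a sorted in-memory set stands in for the on-disk term dictionary, the class name `SparseTermIndexSketch` is hypothetical, and the term number of each indexed term is implied by its position (every 128th term) rather than stored next to it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class SparseTermIndexSketch {
    static final int INTERVAL_BITS = 7;            // 2^7 = 128
    static final int INTERVAL = 1 << INTERVAL_BITS;

    private final NavigableSet<String> dictionary; // stand-in for the full term dictionary
    private final List<String> indexedTerms = new ArrayList<>();

    SparseTermIndexSketch(NavigableSet<String> sortedTerms) {
        this.dictionary = sortedTerms;
        int termNumber = 0;
        for (String term : sortedTerms) {
            if ((termNumber & (INTERVAL - 1)) == 0) {
                indexedTerms.add(term);            // keep only every 128th term in memory
            }
            termNumber++;
        }
    }

    /** Finds the closest indexed term at or before termNumber, then steps forward. */
    String lookup(int termNumber) {
        String start = indexedTerms.get(termNumber >>> INTERVAL_BITS);
        int steps = termNumber & (INTERVAL - 1);
        Iterator<String> it = dictionary.tailSet(start, true).iterator();
        String term = it.next();
        for (int i = 0; i < steps; i++) {
            term = it.next();
        }
        return term;
    }

    int indexedTermCount() {
        return indexedTerms.size();
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 300; i++) {
            terms.add(String.format("term%04d", i));
        }
        SparseTermIndexSketch index = new SparseTermIndexSketch(terms);
        System.out.println(index.indexedTermCount() + " indexed; #129 = " + index.lookup(129));
    }
}
```

A lookup costs at most 127 iteration steps, in exchange for holding roughly 1/128 of the term strings in memory.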
Fields inherited from class org.apache.lucene.index.DocTermOrds
DEFAULT_INDEX_INTERVAL_BITS, docsEnum, field, index, indexedTermsArray, maxTermDocFreq, numTermsInField, ordBase, phase1_time, prefix, sizeOfIndexedStrings, termInstances, tnums, total_time
Method Summary
NamedList<Integer> getCounts(SolrIndexSearcher searcher, DocSet baseDocs, int offset, int limit, Integer mincount, boolean missing, String sort, String prefix)
int getNumTerms()
StatsValues getStats(SolrIndexSearcher searcher, DocSet baseDocs, String[] facet)
- Collect statistics about the UnInvertedField.
static UnInvertedField getUnInvertedField(String field, SolrIndexSearcher searcher)
long memSize()
protected void setActualDocFreq(int termNum, int docFreq)
String toString()
protected void visitTerm(TermsEnum te, int termNum)
UnInvertedField
public UnInvertedField(String field,
SolrIndexSearcher searcher)
throws IOException
- Throws:
IOException
visitTerm
protected void visitTerm(TermsEnum te,
int termNum)
throws IOException
- Overrides:
visitTerm
in class DocTermOrds
- Throws:
IOException
setActualDocFreq
protected void setActualDocFreq(int termNum,
int docFreq)
- Overrides:
setActualDocFreq
in class DocTermOrds
memSize
public long memSize()
getNumTerms
public int getNumTerms()
getCounts
public NamedList<Integer> getCounts(SolrIndexSearcher searcher,
DocSet baseDocs,
int offset,
int limit,
Integer mincount,
boolean missing,
String sort,
String prefix)
throws IOException
- Throws:
IOException
getStats
public StatsValues getStats(SolrIndexSearcher searcher,
DocSet baseDocs,
String[] facet)
throws IOException
- Collect statistics about the UnInvertedField. The code is very similar to
getCounts(org.apache.solr.search.SolrIndexSearcher, org.apache.solr.search.DocSet, int, int, Integer, boolean, String, String).
It can be used to calculate stats on multivalued fields.
This method is mainly used by the StatsComponent.
- Parameters:
searcher - The Searcher to use to gather the statistics
baseDocs - The DocSet to gather the stats on
facet - One or more fields to facet on.
- Returns:
The StatsValues collected
- Throws:
IOException - If there is a low-level I/O error.
toString
public String toString()
- Overrides:
toString
in class Object
getUnInvertedField
public static UnInvertedField getUnInvertedField(String field,
SolrIndexSearcher searcher)
throws IOException
- Throws:
IOException
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.