Class WeightedSpanTermExtractor
java.lang.Object
org.apache.lucene.search.highlight.WeightedSpanTermExtractor
Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
To support additional queries that are unsupported by default, subclasses can override extract(Query, float, Map) to extract wrapped or delegate queries and extractUnknownQuery(Query, Map) to process custom leaf queries:
WeightedSpanTermExtractor extractor = new WeightedSpanTermExtractor() {
  @Override
  protected void extract(Query query, float boost, Map<String, WeightedSpanTerm> terms) throws IOException {
    // Unwrap a wrapper query and extract from its delegate; otherwise use the default handling.
    if (query instanceof QueryWrapper) {
      extract(((QueryWrapper) query).getQuery(), boost, terms);
    } else {
      super.extract(query, boost, terms);
    }
  }

  @Override
  protected void extractUnknownQuery(Query query, Map<String, WeightedSpanTerm> terms) throws IOException {
    // Handle a custom leaf query that the base class does not recognize.
    if (query instanceof CustomTermQuery) {
      Term term = ((CustomTermQuery) query).getTerm();
      terms.put(term.field(), new WeightedSpanTerm(1, term.text()));
    }
  }
};
-
Nested Class Summary
Modifier and Type | Class | Description
protected static class | | This class makes sure that if both position sensitive and insensitive versions of the same term are added, the position insensitive one wins.
-
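The "position insensitive one wins" rule can be illustrated with a small plain-Java model. This is only a sketch: SketchTerm and PositionCheckingSketchMap are hypothetical stand-ins invented for illustration, not the Lucene classes, and the real nested class may differ in detail.

```java
import java.util.HashMap;

/** Hypothetical stand-in for a weighted term; not Lucene's WeightedSpanTerm. */
final class SketchTerm {
    final String text;
    final float weight;
    boolean positionSensitive;

    SketchTerm(String text, float weight, boolean positionSensitive) {
        this.text = text;
        this.weight = weight;
        this.positionSensitive = positionSensitive;
    }
}

/** Hypothetical map in which a position-insensitive entry is never replaced by a sensitive one. */
final class PositionCheckingSketchMap extends HashMap<String, SketchTerm> {
    @Override
    public SketchTerm put(String key, SketchTerm value) {
        SketchTerm previous = super.put(key, value);
        if (previous != null && !previous.positionSensitive) {
            // The earlier entry was position insensitive, so the replacement
            // is forced to be position insensitive as well.
            value.positionSensitive = false;
        }
        return previous;
    }
}
```

With this model, the order of insertion does not matter: whichever way the two versions of a term arrive, the stored entry ends up position insensitive.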
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
protected void | collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames) |
protected void | extract(Query query, float boost, Map<String, WeightedSpanTerm> terms) |
protected void | extractUnknownQuery(Query query, Map<String, WeightedSpanTerm> terms) |
protected void | extractWeightedSpanTerms(Map<String, WeightedSpanTerm> terms, SpanQuery spanQuery, float boost) |
protected void | extractWeightedTerms(Map<String, WeightedSpanTerm> terms, Query query, float boost) |
protected boolean | fieldNameComparator(String fieldNameToCheck) | Necessary to implement matches for queries against defaultField
boolean | getExpandMultiTermQuery() |
protected LeafReaderContext | getLeafContext() |
TokenStream | getTokenStream() | Returns the tokenStream which may have been wrapped in a CachingTokenFilter.
Map<String,WeightedSpanTerm> | getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream) | Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Map<String,WeightedSpanTerm> | getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, String fieldName) | Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Map<String,WeightedSpanTerm> | getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, String fieldName, IndexReader reader) | Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
boolean | isCachedTokenStream() |
protected boolean | isQueryUnsupported(Class<? extends Query> clazz) |
boolean | isUsePayloads() |
protected boolean | mustRewriteQuery(SpanQuery spanQuery) |
void | setExpandMultiTermQuery(boolean expandMultiTermQuery) |
protected final void | setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze) | A threshold of number of characters to analyze.
void | setUsePayloads(boolean usePayloads) |
void | setWrapIfNotCachingTokenFilter(boolean wrap) | By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream impl and you don't want it to be wrapped, set this to false.
-
Constructor Details
-
WeightedSpanTermExtractor
public WeightedSpanTermExtractor()
-
WeightedSpanTermExtractor
-
-
Method Details
-
extract
protected void extract(Query query, float boost, Map<String, WeightedSpanTerm> terms) throws IOException
Parameters:
query - Query to extract Terms from
terms - Map to place created WeightedSpanTerms in
Throws:
IOException - If there is a low-level I/O error
-
isQueryUnsupported
protected boolean isQueryUnsupported(Class<? extends Query> clazz)
-
extractUnknownQuery
protected void extractUnknownQuery(Query query, Map<String, WeightedSpanTerm> terms) throws IOException
Throws:
IOException
-
extractWeightedSpanTerms
protected void extractWeightedSpanTerms(Map<String, WeightedSpanTerm> terms, SpanQuery spanQuery, float boost) throws IOException
Parameters:
terms - Map to place created WeightedSpanTerms in
spanQuery - SpanQuery to extract Terms from
Throws:
IOException - If there is a low-level I/O error
-
extractWeightedTerms
protected void extractWeightedTerms(Map<String, WeightedSpanTerm> terms, Query query, float boost) throws IOException
Parameters:
terms - Map to place created WeightedSpanTerms in
query - Query to extract Terms from
Throws:
IOException - If there is a low-level I/O error
-
fieldNameComparator
protected boolean fieldNameComparator(String fieldNameToCheck)
Necessary to implement matches for queries against defaultField
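One way to picture such a field check is the plain-Java sketch below. It assumes that a null field name means "accept any field" and that terms from a configured default field are also accepted. The class FieldNameCheckSketch and its accepts method are hypothetical names used for illustration only, and the actual logic in the library may differ.

```java
/** Hypothetical model of a field-name check; not the Lucene implementation. */
final class FieldNameCheckSketch {
    private final String fieldName;    // field being highlighted; null is assumed to mean "any field"
    private final String defaultField; // assumed optional default field; may be null

    FieldNameCheckSketch(String fieldName, String defaultField) {
        this.fieldName = fieldName;
        this.defaultField = defaultField;
    }

    /** Returns true when a term from fieldNameToCheck should be kept. */
    boolean accepts(String fieldNameToCheck) {
        return fieldName == null
            || fieldName.equals(fieldNameToCheck)
            || (defaultField != null && defaultField.equals(fieldNameToCheck));
    }
}
```

Under these assumptions, a query term written against the default field still contributes to highlighting of the requested field, while terms from unrelated fields are dropped.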
-
getLeafContext
protected LeafReaderContext getLeafContext() throws IOException
Throws:
IOException
-
getWeightedSpanTerms
public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Parameters:
query - that caused hit
tokenStream - of text to be highlighted
Returns:
Map containing WeightedSpanTerms
Throws:
IOException - If there is a low-level I/O error
-
getWeightedSpanTerms
public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, String fieldName) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Parameters:
query - that caused hit
tokenStream - of text to be highlighted
fieldName - restricts Terms used based on field name
Returns:
Map containing WeightedSpanTerms
Throws:
IOException - If there is a low-level I/O error
-
getWeightedSpanTermsWithScores
public Map<String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, String fieldName, IndexReader reader) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).
Parameters:
query - that caused hit
tokenStream - of text to be highlighted
fieldName - restricts Terms used based on field name
reader - to use for scoring
Returns:
Map of WeightedSpanTerms with quasi tf/idf scores
Throws:
IOException - If there is a low-level I/O error
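To give a feel for what a "quasi tf/idf" weight could look like, the sketch below scales a term weight by a classic inverse document frequency factor, log(numDocs / (docFreq + 1)) + 1. The formula is an assumption chosen for illustration, since this page does not state the exact computation, and IdfWeightSketch with its methods is a hypothetical helper, not part of the API.

```java
/** Hypothetical helper; the idf formula is an assumption, not quoted from this page. */
final class IdfWeightSketch {
    private IdfWeightSketch() {}

    /** Classic-style idf: rarer terms (small docFreq) get a larger factor. */
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Scales a base term weight by the idf factor. */
    static float scaledWeight(float baseWeight, int docFreq, int numDocs) {
        return baseWeight * idf(docFreq, numDocs);
    }
}
```

In a gradient highlighter, a larger scaled weight would typically translate into a stronger highlight color for rare terms, with docFreq and numDocs taken from the supplied reader.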
-
collectSpanQueryFields
protected void collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames)
-
mustRewriteQuery
protected boolean mustRewriteQuery(SpanQuery spanQuery)
-
getExpandMultiTermQuery
public boolean getExpandMultiTermQuery()
-
setExpandMultiTermQuery
public void setExpandMultiTermQuery(boolean expandMultiTermQuery)
-
isUsePayloads
public boolean isUsePayloads()
-
setUsePayloads
public void setUsePayloads(boolean usePayloads)
-
isCachedTokenStream
public boolean isCachedTokenStream()
-
getTokenStream
public TokenStream getTokenStream()
Returns the tokenStream which may have been wrapped in a CachingTokenFilter. The getWeightedSpanTerms* methods set the tokenStream, so do not call this method before calling one of them.
-
setWrapIfNotCachingTokenFilter
public void setWrapIfNotCachingTokenFilter(boolean wrap)
By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream impl and you don't want it to be wrapped, set this to false. This setting is ignored when a term vector based TokenStream is supplied, since it can be reset efficiently.
-
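The idea behind wrapping a stream in a caching layer "to ensure an efficient reset" can be modeled in plain Java as below. This is a simplified sketch under the assumption that the underlying source can only be read once. CachingTokensSketch is a hypothetical class for illustration, not CachingTokenFilter, which works on Lucene attribute state rather than strings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical caching wrapper around a single-pass token source. */
final class CachingTokensSketch {
    private final Iterator<String> source;               // can only be consumed once
    private final List<String> cache = new ArrayList<>(); // tokens seen so far
    private int position = 0;                             // index of the next token to return

    CachingTokensSketch(Iterator<String> source) {
        this.source = source;
    }

    /** Returns the next token, or null when the stream is exhausted. */
    String next() {
        if (position < cache.size()) {
            return cache.get(position++); // replay from the cache
        }
        if (source.hasNext()) {
            String token = source.next();
            cache.add(token);             // remember the token for later passes
            position++;
            return token;
        }
        return null;
    }

    /** Rewinds to the first token without re-reading the source. */
    void reset() {
        position = 0;
    }
}
```

A second pass after reset() is served from the cache, which is why a stream that already caches, or that can be rebuilt cheaply from term vectors, gains nothing from an extra wrapper.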
setMaxDocCharsToAnalyze
protected final void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
A threshold of number of characters to analyze. When a TokenStream based on term vectors with offsets and positions is supplied, this setting does not apply.
-