Class WeightedSpanTermExtractor
- java.lang.Object
-
- org.apache.lucene.search.highlight.WeightedSpanTermExtractor
-
public class WeightedSpanTermExtractor extends Object
Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
To support additional queries that are unsupported by default, subclasses can override extract(Query, float, Map) for extracting wrapped or delegate queries and extractUnknownQuery(Query, Map) to process custom leaf queries:
WeightedSpanTermExtractor extractor = new WeightedSpanTermExtractor() {
    @Override
    protected void extract(Query query, float boost, Map<String, WeightedSpanTerm> terms) throws IOException {
        if (query instanceof QueryWrapper) {
            // unwrap and delegate to the wrapped query
            extract(((QueryWrapper) query).getQuery(), boost, terms);
        } else {
            super.extract(query, boost, terms);
        }
    }

    @Override
    protected void extractUnknownQuery(Query query, Map<String, WeightedSpanTerm> terms) throws IOException {
        if (query instanceof CustomTermQuery) {
            Term term = ((CustomTermQuery) query).getTerm();
            terms.put(term.field(), new WeightedSpanTerm(1, term.text()));
        }
    }
};
-
-
Nested Class Summary
Modifier and Type | Class | Description
protected static class
WeightedSpanTermExtractor.PositionCheckingMap<K>
This class makes sure that if both position-sensitive and position-insensitive versions of the same term are added, the position-insensitive one wins.
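The rule stated for PositionCheckingMap can be illustrated with a small standalone sketch. The names SketchTerm and InsensitiveWinsMap are hypothetical stand-ins, not Lucene types, and the sketch only assumes that each term carries a flag saying whether it is position sensitive.

```java
import java.util.HashMap;
import java.util.Map;

public class PositionCheckingSketch {

  /** Hypothetical stand-in for a weighted term; not a Lucene class. */
  static final class SketchTerm {
    final String text;
    boolean positionSensitive;

    SketchTerm(String text, boolean positionSensitive) {
      this.text = text;
      this.positionSensitive = positionSensitive;
    }
  }

  /** Map in which a position-insensitive entry for a key stays insensitive. */
  static final class InsensitiveWinsMap<K> extends HashMap<K, SketchTerm> {
    @Override
    public SketchTerm put(K key, SketchTerm value) {
      SketchTerm previous = super.put(key, value);
      // If the replaced entry was position insensitive, the new one becomes insensitive too.
      if (previous != null && !previous.positionSensitive) {
        value.positionSensitive = false;
      }
      return previous;
    }

    @Override
    public void putAll(Map<? extends K, ? extends SketchTerm> m) {
      // Route every entry through put() so the rule above always applies.
      for (Map.Entry<? extends K, ? extends SketchTerm> e : m.entrySet()) {
        put(e.getKey(), e.getValue());
      }
    }
  }

  public static void main(String[] args) {
    InsensitiveWinsMap<String> terms = new InsensitiveWinsMap<>();
    terms.put("fox", new SketchTerm("fox", false)); // e.g. added by a plain term query
    terms.put("fox", new SketchTerm("fox", true));  // e.g. added by a phrase or span query
    System.out.println(terms.get("fox").positionSensitive); // prints false
  }
}
```

Overriding putAll matters in this sketch because HashMap.putAll does not call put, so the rule would otherwise be skipped for bulk additions.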
-
Constructor Summary
Constructor | Description
WeightedSpanTermExtractor()
WeightedSpanTermExtractor(String defaultField)
-
Method Summary
Modifier and Type | Method | Description
protected void
collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames)
protected void
extract(Query query, float boost, Map<String,WeightedSpanTerm> terms)
protected void
extractUnknownQuery(Query query, Map<String,WeightedSpanTerm> terms)
protected void
extractWeightedSpanTerms(Map<String,WeightedSpanTerm> terms, SpanQuery spanQuery, float boost)
protected void
extractWeightedTerms(Map<String,WeightedSpanTerm> terms, Query query, float boost)
protected boolean
fieldNameComparator(String fieldNameToCheck)
Necessary to implement matches for queries against defaultField.
boolean
getExpandMultiTermQuery()
protected LeafReaderContext
getLeafContext()
TokenStream
getTokenStream()
Returns the tokenStream, which may have been wrapped in a CachingTokenFilter.
Map<String,WeightedSpanTerm>
getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream)
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Map<String,WeightedSpanTerm>
getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, String fieldName)
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
Map<String,WeightedSpanTerm>
getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, String fieldName, IndexReader reader)
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
boolean
isCachedTokenStream()
protected boolean
isQueryUnsupported(Class<? extends Query> clazz)
boolean
isUsePayloads()
protected boolean
mustRewriteQuery(SpanQuery spanQuery)
void
setExpandMultiTermQuery(boolean expandMultiTermQuery)
protected void
setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
A threshold on the number of characters to analyze.
void
setUsePayloads(boolean usePayloads)
void
setWrapIfNotCachingTokenFilter(boolean wrap)
By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and you don't want it to be wrapped, set this to false.
-
-
-
Constructor Detail
-
WeightedSpanTermExtractor
public WeightedSpanTermExtractor()
-
WeightedSpanTermExtractor
public WeightedSpanTermExtractor(String defaultField)
-
-
Method Detail
-
extract
protected void extract(Query query, float boost, Map<String,WeightedSpanTerm> terms) throws IOException
- Parameters:
query - Query to extract Terms from
terms - Map to place created WeightedSpanTerms in
- Throws:
IOException - If there is a low-level I/O error
-
extractUnknownQuery
protected void extractUnknownQuery(Query query, Map<String,WeightedSpanTerm> terms) throws IOException
- Throws:
IOException
-
extractWeightedSpanTerms
protected void extractWeightedSpanTerms(Map<String,WeightedSpanTerm> terms, SpanQuery spanQuery, float boost) throws IOException
- Parameters:
terms - Map to place created WeightedSpanTerms in
spanQuery - SpanQuery to extract Terms from
- Throws:
IOException - If there is a low-level I/O error
-
extractWeightedTerms
protected void extractWeightedTerms(Map<String,WeightedSpanTerm> terms, Query query, float boost) throws IOException
- Parameters:
terms - Map to place created WeightedSpanTerms in
query - Query to extract Terms from
- Throws:
IOException - If there is a low-level I/O error
-
fieldNameComparator
protected boolean fieldNameComparator(String fieldNameToCheck)
Necessary to implement matches for queries against defaultField.
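One way such a field check could work is sketched below. This is an assumption for illustration, not the documented behavior of fieldNameComparator: the class FieldMatchSketch, its fields, and the rule that a null field name means "any field" are all hypothetical.

```java
public class FieldMatchSketch {
  private final String fieldName;    // field being highlighted; null is assumed to mean "any field"
  private final String defaultField; // assumed field for queries that name no field; may be null

  public FieldMatchSketch(String fieldName, String defaultField) {
    this.fieldName = fieldName;
    this.defaultField = defaultField;
  }

  /** Returns true if terms from fieldNameToCheck should be kept. */
  public boolean matches(String fieldNameToCheck) {
    return fieldName == null
        || fieldName.equals(fieldNameToCheck)
        || (defaultField != null && defaultField.equals(fieldNameToCheck));
  }

  public static void main(String[] args) {
    FieldMatchSketch matcher = new FieldMatchSketch("title", "body");
    System.out.println(matcher.matches("title"));  // requested field
    System.out.println(matcher.matches("body"));   // default field
    System.out.println(matcher.matches("author")); // neither
  }
}
```

In this sketch, a term from the default field is accepted even when highlighting is restricted to another field, which is the kind of match the description above refers to.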
-
getLeafContext
protected LeafReaderContext getLeafContext() throws IOException
- Throws:
IOException
-
getWeightedSpanTerms
public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
- Parameters:
query - the Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
- Returns:
Map containing WeightedSpanTerms
- Throws:
IOException - If there is a low-level I/O error
-
getWeightedSpanTerms
public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, String fieldName) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
- Parameters:
query - the Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
fieldName - restricts the Terms used based on field name
- Returns:
Map containing WeightedSpanTerms
- Throws:
IOException - If there is a low-level I/O error
-
getWeightedSpanTermsWithScores
public Map<String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, String fieldName, IndexReader reader) throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).
- Parameters:
query - the Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
fieldName - restricts the Terms used based on field name
reader - IndexReader to use for scoring
- Returns:
Map of WeightedSpanTerms with quasi tf/idf scores
- Throws:
IOException - If there is a low-level I/O error
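To give a feel for what a "quasi tf/idf" weight might look like, the sketch below scales a term's base weight by an inverse document frequency factor. The formula idf = ln(N / (df + 1)) + 1 is a classic form assumed here for illustration; the class and method names are hypothetical, and the sketch does not claim to reproduce what this method computes internally.

```java
public class IdfWeightSketch {

  /** Classic idf form (an assumption): rarer terms get a larger factor. */
  static float idf(int totalNumDocs, int docFreq) {
    return (float) (Math.log(totalNumDocs / (double) (docFreq + 1)) + 1.0);
  }

  /** Scales a base weight (for example a query boost) by the idf factor. */
  static float weighted(float baseWeight, int totalNumDocs, int docFreq) {
    return baseWeight * idf(totalNumDocs, docFreq);
  }

  public static void main(String[] args) {
    int totalNumDocs = 1000;
    // A term found in 9 documents versus one found in 499 documents.
    System.out.println("rare:   " + weighted(1.0f, totalNumDocs, 9));
    System.out.println("common: " + weighted(1.0f, totalNumDocs, 499));
  }
}
```

With weights like these, a gradient highlighter can render rare query terms more strongly than common ones.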
-
collectSpanQueryFields
protected void collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames)
-
mustRewriteQuery
protected boolean mustRewriteQuery(SpanQuery spanQuery)
-
getExpandMultiTermQuery
public boolean getExpandMultiTermQuery()
-
setExpandMultiTermQuery
public void setExpandMultiTermQuery(boolean expandMultiTermQuery)
-
isUsePayloads
public boolean isUsePayloads()
-
setUsePayloads
public void setUsePayloads(boolean usePayloads)
-
isCachedTokenStream
public boolean isCachedTokenStream()
-
getTokenStream
public TokenStream getTokenStream()
Returns the tokenStream, which may have been wrapped in a CachingTokenFilter. The getWeightedSpanTerms* methods set the tokenStream, so do not call this method before calling one of them.
-
setWrapIfNotCachingTokenFilter
public void setWrapIfNotCachingTokenFilter(boolean wrap)
By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and you don't want it to be wrapped, set this to false. This setting is ignored when a term vector based TokenStream is supplied, since it can be reset efficiently.
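The idea of wrapping a stream so it can be reset cheaply can be modeled with plain Java iterators. The sketch below is an analogy under stated assumptions, not Lucene code: CachingTokens and wrapIfNeeded are hypothetical names, and a token stream is approximated by an Iterator<String>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class WrapIfNotCachingSketch {

  /** Hypothetical caching wrapper: records tokens on the first pass, replays them after reset(). */
  static final class CachingTokens implements Iterator<String> {
    private final Iterator<String> source;
    private final List<String> cache = new ArrayList<>();
    private int replayPos = 0;

    CachingTokens(Iterator<String> source) {
      this.source = source;
    }

    @Override
    public boolean hasNext() {
      return replayPos < cache.size() || source.hasNext();
    }

    @Override
    public String next() {
      if (replayPos == cache.size()) {
        cache.add(source.next()); // first pass: pull from the source and remember the token
      }
      return cache.get(replayPos++);
    }

    /** Cheap reset: later passes are served from the cache, not the source. */
    void reset() {
      replayPos = 0;
    }
  }

  /** Wraps only when wrapping is enabled and the input is not already a caching wrapper. */
  static Iterator<String> wrapIfNeeded(Iterator<String> tokens, boolean wrap) {
    if (!wrap || tokens instanceof CachingTokens) {
      return tokens;
    }
    return new CachingTokens(tokens);
  }

  public static void main(String[] args) {
    Iterator<String> raw = Arrays.asList("quick", "brown", "fox").iterator();
    CachingTokens tokens = (CachingTokens) wrapIfNeeded(raw, true);
    while (tokens.hasNext()) {
      tokens.next(); // first pass consumes the underlying iterator
    }
    tokens.reset();
    while (tokens.hasNext()) {
      System.out.println(tokens.next()); // second pass is replayed from the cache
    }
  }
}
```

Passing false for wrap in this sketch corresponds to the case described above, where the caller already supplies a stream that handles its own caching.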
-
setMaxDocCharsToAnalyze
protected final void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
A threshold on the number of characters to analyze. When a TokenStream based on term vectors with offsets and positions is supplied, this setting does not apply.
-
-