org.apache.lucene.search.uhighlight.UnifiedHighlighter

public class UnifiedHighlighter extends Object

A Highlighter that can get offsets from either postings (IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS), term vectors (FieldType.setStoreTermVectorOffsets(boolean)), or via re-analyzing text.

This highlighter treats the single original document as the whole corpus, and then scores individual passages as if they were documents in this corpus. It uses a BreakIterator to find passages in the text; by default it breaks using getSentenceInstance(Locale.ROOT). It then iterates in parallel (merge sorting by offset) through the positions of all terms from the query, coalescing those hits that occur in a single passage into a Passage, and then scores each Passage using a separate PassageScorer. Passages are finally formatted into highlighted snippets with a PassageFormatter.
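
The flow described above can be approximated with the JDK alone. The following is a minimal sketch, not Lucene's implementation: `PassageSketch` and `SentencePassage` are hypothetical names, terms are matched by plain case-insensitive substring search rather than through postings or analysis, and the raw hit count is only a crude stand-in for what a PassageScorer would compute.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class PassageSketch {
  /** Hypothetical stand-in for a passage: [start, end) offsets plus a hit count. */
  static final class SentencePassage {
    final int start, end, hits;
    SentencePassage(int start, int end, int hits) {
      this.start = start;
      this.end = end;
      this.hits = hits;
    }
  }

  /** Splits text into sentences and keeps those that contain at least one term. */
  static List<SentencePassage> passages(String text, List<String> terms) {
    BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
    bi.setText(text);
    List<SentencePassage> out = new ArrayList<>();
    int start = bi.first();
    for (int end = bi.next(); end != BreakIterator.DONE; start = end, end = bi.next()) {
      int hits = 0;
      for (String term : terms) {
        if (term.isEmpty()) continue;
        // Count case-insensitive occurrences that lie fully inside this sentence.
        for (int i = start; i + term.length() <= end; i++) {
          if (text.regionMatches(true, i, term, 0, term.length())) hits++;
        }
      }
      if (hits > 0) out.add(new SentencePassage(start, end, hits));
    }
    return out;
  }

  public static void main(String[] args) {
    String text = "Lucene is a search library. It is fast. Lucene highlights Lucene terms.";
    for (SentencePassage p : passages(text, List.of("lucene"))) {
      System.out.println(p.hits + " hit(s): " + text.substring(p.start, p.end).trim());
    }
  }
}
```

Sorting the returned passages by `hits` in descending order and keeping the first few would mimic, very roughly, the ranking step.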

You can customize the behavior by calling some of the setters, or by subclassing and overriding some methods. Some important hooks:

getBreakIterator(String): Customize how the text is divided into passages.
getScorer(String): Customize how passages are ranked.
getFormatter(String): Customize how snippets are formatted.

This is thread-safe, notwithstanding the setters.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary

Modifier and Type | Class | Description
static class | UnifiedHighlighter.Builder | Builder for UnifiedHighlighter.
static enum | UnifiedHighlighter.HighlightFlag | Flags for controlling highlighting behavior.
protected static class | UnifiedHighlighter.LimitedStoredFieldVisitor | Fetches stored fields for highlighting.
static enum | UnifiedHighlighter.OffsetSource | Source of term offsets; essential for highlighting.

Field Summary

Modifier and Type | Field | Description
static final int | DEFAULT_CACHE_CHARS_THRESHOLD |
static final int | DEFAULT_MAX_LENGTH |
protected FieldInfos | fieldInfos |
protected final Analyzer | indexAnalyzer |
protected static final char | MULTIVAL_SEP_CHAR |
protected final IndexSearcher | searcher |
protected static final LabelledCharArrayMatcher[] | ZERO_LEN_AUTOMATA_ARRAY |

Constructor Summary

Constructor | Description
UnifiedHighlighter(IndexSearcher indexSearcher, Analyzer indexAnalyzer) | Deprecated.
UnifiedHighlighter(UnifiedHighlighter.Builder builder) | Constructs the highlighter with the given UnifiedHighlighter.Builder.

Method Summary

Modifier and Type | Method | Description
static UnifiedHighlighter.Builder | builder(IndexSearcher searcher, Analyzer indexAnalyzer) | Creates a UnifiedHighlighter.Builder object where IndexSearcher and Analyzer are not null.
static UnifiedHighlighter.Builder | builderWithoutSearcher(Analyzer indexAnalyzer) | Creates a UnifiedHighlighter.Builder object in which you can only use highlightWithoutSearcher(String, Query, String, int) for highlighting.
protected Set<UnifiedHighlighter.HighlightFlag> | evaluateFlags(boolean shouldHandleMultiTermQuery, boolean shouldHighlightPhrasesStrictly, boolean shouldPassageRelevancyOverSpeed, boolean shouldEnableWeightMatches) | This method returns the set of UnifiedHighlighter.HighlightFlags, which will be applied to the UH object.
protected Set<UnifiedHighlighter.HighlightFlag> | evaluateFlags(UnifiedHighlighter uh) | Deprecated.
protected Set<UnifiedHighlighter.HighlightFlag> | evaluateFlags(UnifiedHighlighter.Builder uhBuilder) | Evaluate the highlight flags and set the flags variable.
protected static Set<Term> | extractTerms(Query query) | Extracts matching terms after rewriting against an empty index.
protected static BytesRef[] | filterExtractedTerms(Predicate<String> fieldMatcher, Set<Term> queryTerms) |
protected LabelledCharArrayMatcher[] | getAutomata(String field, Query query, Set<UnifiedHighlighter.HighlightFlag> highlightFlags) |
protected BreakIterator | getBreakIterator(String field) | Returns the BreakIterator to use for dividing text into passages.
int | getCacheFieldValCharsThreshold() | Limits the amount of field value pre-fetching until this threshold is passed.
protected FieldHighlighter | getFieldHighlighter(String field, Query query, Set<Term> allTerms, int maxPassages) |
protected FieldInfo | getFieldInfo(String field) | Called by the default implementation of getOffsetSource(String).
protected Predicate<String> | getFieldMatcher(String field) | Returns the predicate to use for extracting the query part that must be highlighted.
protected Set<UnifiedHighlighter.HighlightFlag> | getFlags(String field) | Returns the UnifiedHighlighter.HighlightFlags applicable for the current UH instance.
protected PassageFormatter | getFormatter(String field) | Returns the PassageFormatter to use for formatting passages into highlighted snippets.
protected UHComponents | getHighlightComponents(String field, Query query, Set<Term> allTerms) |
Analyzer | getIndexAnalyzer() | ...
IndexSearcher | getIndexSearcher() | ...
int | getMaxLength() | The maximum content size to process.
protected int | getMaxNoHighlightPassages(String field) | Returns the number of leading passages (as delineated by the BreakIterator) when no highlights could be found.
protected UnifiedHighlighter.OffsetSource | getOffsetSource(String field) | Determine the offset source for the specified field.
protected FieldOffsetStrategy | getOffsetStrategy(UnifiedHighlighter.OffsetSource offsetSource, UHComponents components) |
protected UnifiedHighlighter.OffsetSource | getOptimizedOffsetSource(UHComponents components) |
protected PhraseHelper | getPhraseHelper(String field, Query query, Set<UnifiedHighlighter.HighlightFlag> highlightFlags) |
protected PassageScorer | getScorer(String field) | Returns the PassageScorer to use for ranking passages.
protected boolean | hasUnrecognizedQuery(Predicate<String> fieldMatcher, Query query) |
String[] | highlight(String field, Query query, TopDocs topDocs) | Highlights the top passages from a single field.
String[] | highlight(String field, Query query, TopDocs topDocs, int maxPassages) | Highlights the top-N passages from a single field.
Map<String,String[]> | highlightFields(String[] fieldsIn, Query query, int[] docidsIn, int[] maxPassagesIn) | Highlights the top-N passages from multiple fields, for the provided int[] docids.
Map<String,String[]> | highlightFields(String[] fields, Query query, TopDocs topDocs) | Highlights the top passages from multiple fields.
Map<String,String[]> | highlightFields(String[] fields, Query query, TopDocs topDocs, int[] maxPassages) | Highlights the top-N passages from multiple fields.
protected Map<String,Object[]> | highlightFieldsAsObjects(String[] fieldsIn, Query query, int[] docIdsIn, int[] maxPassagesIn) | Expert: highlights the top-N passages from multiple fields, for the provided int[] docids, to custom Object as returned by the PassageFormatter.
Object | highlightWithoutSearcher(String field, Query query, String content, int maxPassages) | Highlights text passed as a parameter.
protected List<CharSequence[]> | loadFieldValues(String[] fields, DocIdSetIterator docIter, int cacheCharsThreshold) | Loads the String values for each docId by field to be highlighted.
protected UnifiedHighlighter.LimitedStoredFieldVisitor | newLimitedStoredFieldsVisitor(String[] fields) |
protected Collection<Query> | preSpanQueryRewrite(Query query) | When highlighting phrases accurately, we may need to handle custom queries that aren't supported in the WeightedSpanTermExtractor as called by the PhraseHelper.
protected Boolean | requiresRewrite(SpanQuery spanQuery) | When highlighting phrases accurately, we need to know which SpanQuery instances need to have Query.rewrite(IndexReader) called on them.
void | setBreakIterator(Supplier<BreakIterator> breakIterator) | Deprecated.
void | setCacheFieldValCharsThreshold(int cacheFieldValCharsThreshold) | Deprecated.
void | setFieldMatcher(Predicate<String> predicate) | Deprecated.
void | setFormatter(PassageFormatter formatter) | Deprecated.
void | setHandleMultiTermQuery(boolean handleMtq) | Deprecated.
void | setHighlightPhrasesStrictly(boolean highlightPhrasesStrictly) | Deprecated.
void | setMaxLength(int maxLength) | Deprecated.
void | setMaxNoHighlightPassages(int defaultMaxNoHighlightPassages) | Deprecated.
void | setPassageRelevancyOverSpeed(boolean passageRelevancyOverSpeed) | Deprecated.
void | setScorer(PassageScorer scorer) | Deprecated.
void | setWeightMatches(boolean weightMatches) | Deprecated.
protected boolean | shouldHandleMultiTermQuery(String field) | Deprecated.
protected boolean | shouldHighlightPhrasesStrictly(String field) | Deprecated.
protected boolean | shouldPreferPassageRelevancyOverSpeed(String field) | Deprecated.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- MULTIVAL_SEP_CHAR
  
  protected static final char MULTIVAL_SEP_CHAR
  See Also:
  
  Constant Field Values
- DEFAULT_MAX_LENGTH
  
  public static final int DEFAULT_MAX_LENGTH
  See Also:
  
  Constant Field Values
- DEFAULT_CACHE_CHARS_THRESHOLD
  
  public static final int DEFAULT_CACHE_CHARS_THRESHOLD
  See Also:
  
  Constant Field Values
- ZERO_LEN_AUTOMATA_ARRAY
  
  protected static final LabelledCharArrayMatcher[] ZERO_LEN_AUTOMATA_ARRAY
- searcher
  
  protected final IndexSearcher searcher
- indexAnalyzer
  
  protected final Analyzer indexAnalyzer
- fieldInfos
  
  protected volatile FieldInfos fieldInfos
Constructor Details
- UnifiedHighlighter
  
  @Deprecated public UnifiedHighlighter(IndexSearcher indexSearcher, Analyzer indexAnalyzer)
  
  Deprecated.
  
  Constructs the highlighter with the given index searcher and analyzer.
  
  Parameters:
  
  indexSearcher - Usually required, unless highlightWithoutSearcher(String, Query, String, int) is used, in which case this needs to be null.
  
  indexAnalyzer - Required, even if in some circumstances it isn't used.
- UnifiedHighlighter
  
  public UnifiedHighlighter(UnifiedHighlighter.Builder builder)
  
  Constructs the highlighter with the given UnifiedHighlighter.Builder.
  
  Parameters:
  
  builder - a UnifiedHighlighter.Builder object.
Method Details
- setHandleMultiTermQuery
  
  @Deprecated public void setHandleMultiTermQuery(boolean handleMtq)
  
  Deprecated.
- setHighlightPhrasesStrictly
  
  @Deprecated public void setHighlightPhrasesStrictly(boolean highlightPhrasesStrictly)
  
  Deprecated.
- setPassageRelevancyOverSpeed
  
  @Deprecated public void setPassageRelevancyOverSpeed(boolean passageRelevancyOverSpeed)
  
  Deprecated.
- setMaxLength
  
  @Deprecated public void setMaxLength(int maxLength)
  
  Deprecated.
- setBreakIterator
  
  @Deprecated public void setBreakIterator(Supplier<BreakIterator> breakIterator)
  
  Deprecated.
- setScorer
  
  @Deprecated public void setScorer(PassageScorer scorer)
  
  Deprecated.
- setFormatter
  
  @Deprecated public void setFormatter(PassageFormatter formatter)
  
  Deprecated.
- setMaxNoHighlightPassages
  
  @Deprecated public void setMaxNoHighlightPassages(int defaultMaxNoHighlightPassages)
  
  Deprecated.
- setCacheFieldValCharsThreshold
  
  @Deprecated public void setCacheFieldValCharsThreshold(int cacheFieldValCharsThreshold)
  
  Deprecated.
- setFieldMatcher
  
  @Deprecated public void setFieldMatcher(Predicate<String> predicate)
  
  Deprecated.
- setWeightMatches
  
  @Deprecated public void setWeightMatches(boolean weightMatches)
  
  Deprecated.
- shouldHandleMultiTermQuery
  
  @Deprecated protected boolean shouldHandleMultiTermQuery(String field)
  
  Deprecated.
  
  Returns whether MultiTermQuery derivatives will be highlighted. By default it's enabled. MTQ highlighting can be expensive, particularly when using offsets in postings.
- shouldHighlightPhrasesStrictly
  
  @Deprecated protected boolean shouldHighlightPhrasesStrictly(String field)
  
  Deprecated.
  
  Returns whether position-sensitive queries (e.g. phrases and SpanQuery instances) should be highlighted strictly based on query matches (slower) versus any/all occurrences of the underlying terms. By default it's enabled, but there's no overhead if such queries aren't used.
- shouldPreferPassageRelevancyOverSpeed
  
  @Deprecated protected boolean shouldPreferPassageRelevancyOverSpeed(String field)
  
  Deprecated.
- builder
  
  public static UnifiedHighlighter.Builder builder(IndexSearcher searcher, Analyzer indexAnalyzer)
  
  Creates a UnifiedHighlighter.Builder object where IndexSearcher and Analyzer are not null.
  
  Parameters:
  
  searcher - an IndexSearcher object.
  
  indexAnalyzer - an Analyzer object.
  
  Returns:
  
  a UnifiedHighlighter.Builder object
- builderWithoutSearcher
  
  public static UnifiedHighlighter.Builder builderWithoutSearcher(Analyzer indexAnalyzer)
  
  Creates a UnifiedHighlighter.Builder object in which you can only use highlightWithoutSearcher(String, Query, String, int) for highlighting.
  
  Parameters:
  
  indexAnalyzer - an Analyzer object.
  
  Returns:
  
  a UnifiedHighlighter.Builder object
- extractTerms
  
  protected static Set<Term> extractTerms(Query query) throws IOException
  
  Extracts matching terms after rewriting against an empty index.
  
  Throws:
  
  IOException
- evaluateFlags
  
  protected Set<UnifiedHighlighter.HighlightFlag> evaluateFlags(boolean shouldHandleMultiTermQuery, boolean shouldHighlightPhrasesStrictly, boolean shouldPassageRelevancyOverSpeed, boolean shouldEnableWeightMatches)
  
  This method returns the set of UnifiedHighlighter.HighlightFlags, which will be applied to the UH object. The output depends on the values provided to UnifiedHighlighter.Builder.withHandleMultiTermQuery(boolean), UnifiedHighlighter.Builder.withHighlightPhrasesStrictly(boolean), UnifiedHighlighter.Builder.withPassageRelevancyOverSpeed(boolean) and UnifiedHighlighter.Builder.withWeightMatches(boolean), or to setHandleMultiTermQuery(boolean), setHighlightPhrasesStrictly(boolean), setPassageRelevancyOverSpeed(boolean) and setWeightMatches(boolean).
  
  Parameters:
  
  shouldHandleMultiTermQuery - flag for adding multi-term query highlighting
  
  shouldHighlightPhrasesStrictly - flag for adding strict phrase highlighting
  
  shouldPassageRelevancyOverSpeed - flag for preferring passage relevancy over speed
  
  shouldEnableWeightMatches - flag for enabling weight matches
  
  Returns:
  
  a set of UnifiedHighlighter.HighlightFlags.
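
As a rough illustration of turning four booleans into a flag set, here is a minimal sketch. The `Flag` enum is a hypothetical stand-in for UnifiedHighlighter.HighlightFlag, and the one-flag-per-boolean mapping is an assumption; the real method may apply additional rules when combining the inputs.

```java
import java.util.EnumSet;
import java.util.Set;

final class FlagSketch {
  /** Hypothetical stand-in for UnifiedHighlighter.HighlightFlag. */
  enum Flag { PHRASES, MULTI_TERM_QUERY, PASSAGE_RELEVANCY_OVER_SPEED, WEIGHT_MATCHES }

  /** Assumed mapping: each true argument contributes exactly one flag. */
  static Set<Flag> evaluate(boolean handleMtq, boolean phrasesStrictly,
                            boolean relevancyOverSpeed, boolean weightMatches) {
    Set<Flag> flags = EnumSet.noneOf(Flag.class);
    if (handleMtq) flags.add(Flag.MULTI_TERM_QUERY);
    if (phrasesStrictly) flags.add(Flag.PHRASES);
    if (relevancyOverSpeed) flags.add(Flag.PASSAGE_RELEVANCY_OVER_SPEED);
    if (weightMatches) flags.add(Flag.WEIGHT_MATCHES);
    return flags;
  }
}
```

An EnumSet keeps the result compact and makes membership checks such as `flags.contains(Flag.PHRASES)` cheap.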
- evaluateFlags
  
  protected Set<UnifiedHighlighter.HighlightFlag> evaluateFlags(UnifiedHighlighter.Builder uhBuilder)
  
  Evaluate the highlight flags and set the flags variable. This is called only once when the Builder object is used to create a UH object.
  
  Parameters:
  
  uhBuilder - UnifiedHighlighter.Builder object.
  
  Returns:
  
  UnifiedHighlighter.HighlightFlags.
- evaluateFlags
  
  @Deprecated protected Set<UnifiedHighlighter.HighlightFlag> evaluateFlags(UnifiedHighlighter uh)
  
  Deprecated.
  
  Evaluate the highlight flags and set the flags variable. This is called every time the getFlags(String) method is called. This is used in the builder and has been marked deprecated since it is used only for the mutable initialization of a UH object.
  
  Parameters:
  
  uh - UnifiedHighlighter object.
  
  Returns:
  
  UnifiedHighlighter.HighlightFlags.
- getFieldMatcher
  
  protected Predicate<String> getFieldMatcher(String field)
  
  Returns the predicate to use for extracting the query part that must be highlighted. By default only queries that target the current field are kept. (AKA requireFieldMatch)
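
The default behavior described above amounts to an equality test on the field name. A minimal sketch, assuming query parts are represented as plain "field:text" strings (a hypothetical simplification; `FieldMatcherSketch` is not part of Lucene):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class FieldMatcherSketch {
  /** Default-style matcher: accept only query parts that target the highlighted field. */
  static Predicate<String> requireFieldMatch(String field) {
    return field::equals;
  }

  /** Keeps the text of each "field:text" pair whose field passes the matcher. */
  static List<String> filterTerms(List<String> fieldColonText, Predicate<String> fieldMatcher) {
    List<String> kept = new ArrayList<>();
    for (String pair : fieldColonText) {
      int colon = pair.indexOf(':');
      if (colon > 0 && fieldMatcher.test(pair.substring(0, colon))) {
        kept.add(pair.substring(colon + 1));
      }
    }
    return kept;
  }
}
```

Passing a permissive predicate such as `f -> true` instead would keep terms from every field, which corresponds to turning the field-match requirement off.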
- getFlags
  
  protected Set<UnifiedHighlighter.HighlightFlag> getFlags(String field)
  
  Returns the UnifiedHighlighter.HighlightFlags applicable for the current UH instance.
- getMaxLength
  
  public int getMaxLength()
  
  The maximum content size to process. Content will be truncated to this size before highlighting. Typically snippets closer to the beginning of the document better summarize its content.
- getBreakIterator
  
  protected BreakIterator getBreakIterator(String field)
  
  Returns the BreakIterator to use for dividing text into passages. This returns BreakIterator.getSentenceInstance(Locale) by default; subclasses can override to customize.
  Note: this highlighter will call BreakIterator.preceding(int) and BreakIterator.next() many times on it. The default generic JDK implementation of preceding performs poorly.
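
To make the note about preceding(int) and next() concrete, here is a minimal JDK-only sketch of how those two calls can locate the passage around a match offset. `PassageBoundsSketch` is a hypothetical helper, and the exact call pattern inside the highlighter is an assumption.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class PassageBoundsSketch {
  /** Returns {start, end} of the sentence that contains matchOffset. */
  static int[] boundsAround(String text, int matchOffset) {
    if (matchOffset < 0 || matchOffset >= text.length()) {
      throw new IllegalArgumentException("matchOffset out of range: " + matchOffset);
    }
    BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
    bi.setText(text);
    // preceding(n) returns the last boundary strictly before n,
    // so n = matchOffset + 1 yields the last boundary at or before the match.
    int start = bi.preceding(matchOffset + 1);
    if (start == BreakIterator.DONE) start = 0;
    // The iterator is now positioned at start; next() gives the passage end.
    int end = bi.next();
    if (end == BreakIterator.DONE) end = text.length();
    return new int[] {start, end};
  }
}
```

Because this pair of calls would run once per candidate passage, an iterator with a slow preceding implementation can dominate highlighting time on long fields.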
- getScorer
  
  protected PassageScorer getScorer(String field)
  
  Returns the PassageScorer to use for ranking passages. This returns a new PassageScorer by default; subclasses can override to customize.
- getFormatter
  
  protected PassageFormatter getFormatter(String field)
  
  Returns the PassageFormatter to use for formatting passages into highlighted snippets. This returns a new PassageFormatter by default; subclasses can override to customize.
- getMaxNoHighlightPassages
  
  protected int getMaxNoHighlightPassages(String field)
  
  Returns the number of leading passages (as delineated by the BreakIterator) when no highlights could be found. If it's less than 0 (the default) then this defaults to the maxPassages parameter given for each request. If this is 0 then the resulting highlight is null (not formatted).
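
The three cases above (negative, zero, positive) can be sketched as follows. This is a simplified model using sentence breaks from the JDK; `NoHighlightFallbackSketch` and its parameter names are hypothetical, and the real highlighter formats the passages through the PassageFormatter rather than returning a raw substring.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class NoHighlightFallbackSketch {
  /** Returns the leading passages to show when nothing matched, or null if disabled. */
  static String fallback(String content, int maxNoHighlightPassages, int maxPassages) {
    // A negative setting defers to the per-request maxPassages.
    int n = maxNoHighlightPassages < 0 ? maxPassages : maxNoHighlightPassages;
    if (n == 0) return null; // zero means "no summary at all"
    BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
    bi.setText(content);
    int end = bi.first();
    for (int i = 0; i < n; i++) {
      int next = bi.next();
      if (next == BreakIterator.DONE) break; // fewer sentences than requested
      end = next;
    }
    return content.substring(0, end);
  }
}
```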
- getCacheFieldValCharsThreshold
  
  public int getCacheFieldValCharsThreshold()
  
  Limits the amount of field value pre-fetching until this threshold is passed. The highlighter internally highlights in batches of documents sized on the summed field value length (in chars) of the fields to be highlighted (bounded by getMaxLength() for each field). By setting this to 0, you can force documents to be fetched and highlighted one at a time, which you usually shouldn't do. The default is 524288 chars, which translates to about a megabyte. However, note that the highlighter sometimes ignores this and highlights one document at a time (without caching a bunch of documents in advance) when it can detect there's no point in it, for example when all fields will be highlighted via re-analysis.
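
The "about a megabyte" figure follows from Java's two-byte char. A small worked check, assuming UTF-16 storage at two bytes per char (`CacheThresholdMath` is a hypothetical helper, and actual heap use may differ, for instance when the JVM stores Latin-1 strings compactly):

```java
final class CacheThresholdMath {
  /** The default threshold quoted in the description above, in chars. */
  static final int DEFAULT_CACHE_CHARS = 524_288;

  /** Approximate memory for that many chars at two bytes each. */
  static long approxBytes(int chars) {
    return (long) chars * Character.BYTES;
  }

  public static void main(String[] args) {
    long bytes = approxBytes(DEFAULT_CACHE_CHARS);
    // prints "1048576 bytes = 1 MiB"
    System.out.println(bytes + " bytes = " + (bytes >> 20) + " MiB");
  }
}
```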
- getIndexSearcher
  
  public IndexSearcher getIndexSearcher()
  
  ... as passed in from constructor.
- getIndexAnalyzer
  
  public Analyzer getIndexAnalyzer()
  
  ... as passed in from constructor.
- getOffsetSource
  
  protected UnifiedHighlighter.OffsetSource getOffsetSource(String field)
  Determine the offset source for the specified field. The default algorithm is as follows:
  
  This calls getFieldInfo(String). Note this returns null if there is no searcher or if the field isn't found there.
  If there's a field info and it has IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS then UnifiedHighlighter.OffsetSource.POSTINGS is returned.
  If there's a field info and FieldInfo.hasVectors() then UnifiedHighlighter.OffsetSource.TERM_VECTORS is returned (note we can't check here if the TV has offsets; if there isn't then an exception will get thrown down the line).
  Fall-back: UnifiedHighlighter.OffsetSource.ANALYSIS is returned.
  
  Note that the highlighter sometimes switches to something else based on the query, such as if you have UnifiedHighlighter.OffsetSource.POSTINGS_WITH_TERM_VECTORS but in fact don't need term vectors.
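
The listed steps reduce to a short decision chain. The sketch below follows them literally: the `Source` enum and the boolean inputs are hypothetical stand-ins for OffsetSource and FieldInfo, and combined sources such as POSTINGS_WITH_TERM_VECTORS are deliberately left out because the steps above do not say when they are chosen.

```java
final class OffsetSourceSketch {
  /** Hypothetical stand-in for a subset of UnifiedHighlighter.OffsetSource. */
  enum Source { POSTINGS, TERM_VECTORS, ANALYSIS }

  /**
   * fieldInfoPresent = false models "no searcher, or field not found".
   * offsetsInPostings models DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS.
   */
  static Source choose(boolean fieldInfoPresent, boolean offsetsInPostings, boolean hasVectors) {
    if (fieldInfoPresent && offsetsInPostings) return Source.POSTINGS;
    if (fieldInfoPresent && hasVectors) return Source.TERM_VECTORS;
    return Source.ANALYSIS; // fall-back: re-analyze the text
  }
}
```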
- getFieldInfo
  
  protected FieldInfo getFieldInfo(String field)
  
  Called by the default implementation of getOffsetSource(String). If there is no searcher then we simply always return null.
- highlight
  
  public String[] highlight(String field, Query query, TopDocs topDocs) throws IOException
  
  Highlights the top passages from a single field.
  
  Parameters:
  
  field - field name to highlight. Must have a stored string value and also be indexed with offsets.
  
  query - query to highlight.
  
  topDocs - TopDocs containing the summary result documents to highlight.
  
  Returns:
  
  Array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first sentence for the field will be returned.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
  
  IllegalArgumentException - if field was indexed without IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
- highlight
  
  public String[] highlight(String field, Query query, TopDocs topDocs, int maxPassages) throws IOException
  
  Highlights the top-N passages from a single field.
  
  Parameters:
  
  field - field name to highlight. Must have a stored string value.
  
  query - query to highlight.
  
  topDocs - TopDocs containing the summary result documents to highlight.
  
  maxPassages - The maximum number of top-N ranked passages used to form the highlighted snippets.
  
  Returns:
  
  Array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first maxPassages sentences from the field will be returned.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
  
  IllegalArgumentException - if field was indexed without IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
- highlightFields
  
  public Map<String,String[]> highlightFields(String[] fields, Query query, TopDocs topDocs) throws IOException
  Highlights the top passages from multiple fields.
  Conceptually, this behaves as a more efficient form of:
    Map<String, String[]> m = new HashMap<>();
    for (String field : fields) {
      m.put(field, highlight(field, query, topDocs));
    }
    return m;
  Parameters:
  
  fields - field names to highlight. Must have a stored string value.
  
  query - query to highlight.
  
  topDocs - TopDocs containing the summary result documents to highlight.
  
  Returns:
  
  Map keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first sentence from the field will be returned.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
  
  IllegalArgumentException - if field was indexed without IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
- highlightFields
  
  public Map<String,String[]> highlightFields(String[] fields, Query query, TopDocs topDocs, int[] maxPassages) throws IOException
  Highlights the top-N passages from multiple fields.
  Conceptually, this behaves as a more efficient form of:
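    Map<String, String[]> m = new HashMap<>();
    for (int i = 0; i < fields.length; i++) {
      // maxPassages is per-field, so each field uses its own entry
      m.put(fields[i], highlight(fields[i], query, topDocs, maxPassages[i]));
    }
    return m;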
  Parameters:
  
  fields - field names to highlight. Must have a stored string value.
  
  query - query to highlight.
  
  topDocs - TopDocs containing the summary result documents to highlight.
  
  maxPassages - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
  
  Returns:
  
  Map keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first maxPassages sentences from the field will be returned.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
  
  IllegalArgumentException - if field was indexed without IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
- highlightFields
  
  public Map<String,String[]> highlightFields(String[] fieldsIn, Query query, int[] docidsIn, int[] maxPassagesIn) throws IOException
  
  Highlights the top-N passages from multiple fields, for the provided int[] docids.
  
  Parameters:
  
  fieldsIn - field names to highlight. Must have a stored string value.
  
  query - query to highlight.
  
  docidsIn - containing the document IDs to highlight.
  
  maxPassagesIn - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
  
  Returns:
  
  Map keyed on field name, containing the array of formatted snippets corresponding to the documents in docidsIn. If no highlights were found for a document, the first maxPassagesIn sentences from the field will be returned.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
  
  IllegalArgumentException - if field was indexed without IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
- highlightFieldsAsObjects
  
  protected Map<String,Object[]> highlightFieldsAsObjects(String[] fieldsIn, Query query, int[] docIdsIn, int[] maxPassagesIn) throws IOException
  
  Expert: highlights the top-N passages from multiple fields, for the provided int[] docids, to custom Object as returned by the PassageFormatter. Use this API to render to something other than String.
  
  Parameters:
  
  fieldsIn - field names to highlight. Must have a stored string value.
  
  query - query to highlight.
  
  docIdsIn - containing the document IDs to highlight.
  
  maxPassagesIn - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
  
  Returns:
  
  Map keyed on field name, containing the array of formatted snippets corresponding to the documents in docIdsIn. If no highlights were found for a document, the first maxPassagesIn sentences from the field will be returned.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
  
  IllegalArgumentException - if field was indexed without IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
- highlightWithoutSearcher
  
  public Object highlightWithoutSearcher(String field, Query query, String content, int maxPassages) throws IOException
  
  Highlights text passed as a parameter. This requires that the IndexSearcher provided to this highlighter be null. This use case is rarer. Naturally, the mode of operation will be UnifiedHighlighter.OffsetSource.ANALYSIS. The result of this method is whatever the PassageFormatter returns. For the DefaultPassageFormatter and assuming content has non-zero length, the result will be a non-null string -- so it's safe to call Object.toString() on it in that case.
  
  Parameters:
  
  field - field name to highlight (as found in the query).
  
  query - query to highlight.
  
  content - text to highlight.
  
  maxPassages - The maximum number of top-N ranked passages used to form the highlighted snippets.
  
  Returns:
  
  result of the PassageFormatter -- probably a String. Might be null.
  
  Throws:
  
  IOException - if an I/O error occurred during processing
- getFieldHighlighter
  
  protected FieldHighlighter getFieldHighlighter(String field, Query query, Set<Term> allTerms, int maxPassages)
- getHighlightComponents
  
  protected UHComponents getHighlightComponents(String field, Query query, Set<Term> allTerms)
- hasUnrecognizedQuery
  
  protected boolean hasUnrecognizedQuery(Predicate<String> fieldMatcher, Query query)
- filterExtractedTerms
  
  protected static BytesRef[] filterExtractedTerms(Predicate<String> fieldMatcher, Set<Term> queryTerms)
- getPhraseHelper
  
  protected PhraseHelper getPhraseHelper(String field, Query query, Set<UnifiedHighlighter.HighlightFlag> highlightFlags)
- getAutomata
  
  protected LabelledCharArrayMatcher[] getAutomata(String field, Query query, Set<UnifiedHighlighter.HighlightFlag> highlightFlags)
- getOptimizedOffsetSource
  
  protected UnifiedHighlighter.OffsetSource getOptimizedOffsetSource(UHComponents components)
- getOffsetStrategy
  
  protected FieldOffsetStrategy getOffsetStrategy(UnifiedHighlighter.OffsetSource offsetSource, UHComponents components)
- requiresRewrite
  
  protected Boolean requiresRewrite(SpanQuery spanQuery)
  
  When highlighting phrases accurately, we need to know which SpanQuery instances need to have Query.rewrite(IndexReader) called on them. It helps performance to avoid it if it's not needed. This method will be invoked on all SpanQuery instances recursively. If you have custom SpanQuery queries then override this to check instanceof and provide a definitive answer. If the query isn't your custom one, simply return null to have the default rules apply, which govern the ones included in Lucene.
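
The three-valued Boolean contract (true, false, or null for "no opinion") can be sketched without Lucene types. All names below are hypothetical: `SpanLike` stands in for SpanQuery, and `defaultRule` stands in for whatever the built-in rules would decide.

```java
final class RewriteDecisionSketch {
  /** Hypothetical stand-in for SpanQuery. */
  interface SpanLike {}
  static final class MyCustomSpan implements SpanLike {}
  static final class BuiltInSpan implements SpanLike {}

  /** Override-style hook: a definitive answer for the custom type, null otherwise. */
  static Boolean requiresRewrite(SpanLike query) {
    if (query instanceof MyCustomSpan) {
      return Boolean.FALSE; // assumed: this custom query never needs a rewrite
    }
    return null; // abstain so the default rules apply
  }

  /** Caller side: use the hook's answer if it gave one, else the default rule. */
  static boolean mustRewrite(SpanLike query, boolean defaultRule) {
    Boolean answer = requiresRewrite(query);
    return answer != null ? answer : defaultRule;
  }
}
```

Returning the boxed Boolean rather than a primitive is what makes the "abstain" case expressible.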
- preSpanQueryRewrite
  
  protected Collection<Query> preSpanQueryRewrite(Query query)
  
  When highlighting phrases accurately, we may need to handle custom queries that aren't supported in the WeightedSpanTermExtractor as called by the PhraseHelper. Should custom query types be needed, this method should be overridden to return a collection of queries if appropriate, or null if nothing to do. If the query is not custom, simply returning null will allow the default rules to apply.
  
  Parameters:
  
  query - Query to be highlighted
  
  Returns:
  
  A Collection of Query object(s) if needs to be rewritten, otherwise null.
- loadFieldValues
  
  protected List<CharSequence[]> loadFieldValues(String[] fields, DocIdSetIterator docIter, int cacheCharsThreshold) throws IOException
  
  Loads the String values for each docId by field to be highlighted. By default this loads from stored fields by the same name as given, but a subclass can change the source. The returned Strings must be identical to what was indexed (at least for postings or term-vectors offset sources). This method must load fields for at least one document from the given DocIdSetIterator but need not return all of them; by default the character lengths are summed and this method will return early when cacheCharsThreshold is exceeded. Specifically if that number is 0, then only one document is fetched no matter what. Values in the array of CharSequence will be null if no value was found.
  
  Throws:
  
  IOException
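
The early-return rule can be modeled with plain integers. This is a simplified sketch, not the actual loading code: `BatchSketch` is hypothetical, each document is reduced to the total character length of its field values, and the treatment of a sum exactly equal to the threshold is an assumption.

```java
final class BatchSketch {
  /**
   * Counts how many documents, starting at index from, would be loaded in one call.
   * At least one document is loaded if any remain; loading stops once the running
   * character total exceeds the threshold, or immediately when the threshold is 0.
   */
  static int docsInNextBatch(int[] valueLengths, int from, int cacheCharsThreshold) {
    int sumChars = 0;
    int count = 0;
    for (int i = from; i < valueLengths.length; i++) {
      sumChars += valueLengths[i];
      count++;
      if (cacheCharsThreshold == 0 || sumChars > cacheCharsThreshold) {
        break;
      }
    }
    return count;
  }
}
```

Calling this repeatedly, advancing `from` by the returned count each time, walks the whole document list in batches.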
- newLimitedStoredFieldsVisitor
  
  protected UnifiedHighlighter.LimitedStoredFieldVisitor newLimitedStoredFieldsVisitor(String[] fields)
  
  NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
