PostingsHighlighter (Lucene 4.10.0 API)

java.lang.Object
- org.apache.lucene.search.postingshighlight.PostingsHighlighter

```
public class PostingsHighlighter
extends Object
```
Simple highlighter that does not analyze fields nor use term vectors. Instead it requires FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS.
PostingsHighlighter treats the single original document as the whole corpus, and then scores individual passages as if they were documents in this corpus. It uses a BreakIterator to find passages in the text; by default it breaks using getSentenceInstance(Locale.ROOT). It then iterates in parallel (merge sorting by offset) through the positions of all terms from the query, coalescing those hits that occur in a single passage into a Passage, and then scores each Passage using a separate PassageScorer. Passages are finally formatted into highlighted snippets with a PassageFormatter.
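The passage-building step described above can be illustrated with the JDK alone. This is a minimal sketch, not the highlighter's actual code: the class name `PassageCoalescingSketch` and the method `coalesce` are hypothetical, hit offsets are passed in as a sorted `int[]` rather than read from postings, and scoring and formatting are left out.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class PassageCoalescingSketch {

    /**
     * Groups sorted hit offsets by the sentence that contains them.
     * Each result is {sentenceStart, sentenceEnd, hitCount}.
     * Assumes every offset lies inside the text and the array is sorted ascending.
     */
    static List<int[]> coalesce(String text, int[] sortedHitOffsets) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(text);
        List<int[]> passages = new ArrayList<>();
        int[] current = null;
        for (int offset : sortedHitOffsets) {
            if (current != null && offset < current[1]) {
                current[2]++; // hit falls inside the passage being built
                continue;
            }
            int end = bi.following(offset); // first boundary after the hit
            int start = bi.previous();      // boundary at or before the hit
            current = new int[] {start, end, 1};
            passages.add(current);
        }
        return passages;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library. It supports highlighting. "
                + "Highlighting needs offsets.";
        int[] hits = {
            text.indexOf("search"),
            text.indexOf("highlighting"),
            text.indexOf("Highlighting"),
            text.indexOf("offsets")
        };
        for (int[] p : coalesce(text, hits)) {
            System.out.println(p[2] + " hit(s): " + text.substring(p[0], p[1]).trim());
        }
    }
}
```

Because the offsets arrive in sorted order, a single forward pass is enough: a hit either extends the current passage or opens a new one.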
You can customize the behavior by subclassing this highlighter. Some important hooks:
- getBreakIterator(String): Customize how the text is divided into passages.
- getScorer(String): Customize how passages are ranked.
- getFormatter(String): Customize how snippets are formatted.
- getIndexAnalyzer(String): Enable highlighting of MultiTermQuery instances such as WildcardQuery.
WARNING: The code is very new and probably still has some exciting bugs!
Example usage:
```
   // configure field with offsets at index time
   FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
   offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
   Field body = new Field("body", "foobar", offsetsType);

   // retrieve highlights at query time
   // (searcher is an open IndexSearcher; n is the number of top hits to fetch)
   PostingsHighlighter highlighter = new PostingsHighlighter();
   Query query = new TermQuery(new Term("body", "highlighting"));
   TopDocs topDocs = searcher.search(query, n);
   String[] highlights = highlighter.highlight("body", query, searcher, topDocs);
```
This is thread-safe, and can be used across different readers.
WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`DEFAULT_MAX_LENGTH` Default maximum content size to process.

Constructor Summary

Constructors
Constructor and Description
`PostingsHighlighter()` Creates a new highlighter with `DEFAULT_MAX_LENGTH`.
`PostingsHighlighter(int maxLength)` Creates a new highlighter, specifying maximum content length.

Method Summary

Methods
Modifier and Type	Method and Description
`protected BreakIterator`	`getBreakIterator(String field)` Returns the `BreakIterator` to use for dividing text into passages.
`protected Passage[]`	`getEmptyHighlight(String fieldName, BreakIterator bi, int maxPassages)` Called to summarize a document when no hits were found.
`protected PassageFormatter`	`getFormatter(String field)` Returns the `PassageFormatter` to use for formatting passages into highlighted snippets.
`protected Analyzer`	`getIndexAnalyzer(String field)` Returns the analyzer originally used to index the content for `field`.
`protected char`	`getMultiValuedSeparator(String field)` Returns the logical separator between values for multi-valued fields.
`protected PassageScorer`	`getScorer(String field)` Returns the `PassageScorer` to use for ranking passages.
`String[]`	`highlight(String field, Query query, IndexSearcher searcher, TopDocs topDocs)` Highlights the top passages from a single field.
`String[]`	`highlight(String field, Query query, IndexSearcher searcher, TopDocs topDocs, int maxPassages)` Highlights the top-N passages from a single field.
`Map<String,String[]>`	`highlightFields(String[] fieldsIn, Query query, IndexSearcher searcher, int[] docidsIn, int[] maxPassagesIn)` Highlights the top-N passages from multiple fields, for the provided int[] docids.
`Map<String,String[]>`	`highlightFields(String[] fields, Query query, IndexSearcher searcher, TopDocs topDocs)` Highlights the top passages from multiple fields.
`Map<String,String[]>`	`highlightFields(String[] fields, Query query, IndexSearcher searcher, TopDocs topDocs, int[] maxPassages)` Highlights the top-N passages from multiple fields.
`protected Map<String,Object[]>`	`highlightFieldsAsObjects(String[] fieldsIn, Query query, IndexSearcher searcher, int[] docidsIn, int[] maxPassagesIn)` Expert: highlights the top-N passages from multiple fields, for the provided int[] docids, to custom Object as returned by the `PassageFormatter`.
`protected String[][]`	`loadFieldValues(IndexSearcher searcher, String[] fields, int[] docids, int maxLength)` Loads the String values for each field X docID to be highlighted.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - DEFAULT_MAX_LENGTH
```
public static final int DEFAULT_MAX_LENGTH
```
    Default maximum content size to process. Typically snippets closer to the beginning of the document better summarize its content.
    
    See Also:
    Constant Field Values
- Constructor Detail
  - PostingsHighlighter
```
public PostingsHighlighter()
```
    Creates a new highlighter with DEFAULT_MAX_LENGTH.
  - PostingsHighlighter
```
public PostingsHighlighter(int maxLength)
```
    Creates a new highlighter, specifying maximum content length.
    
    Parameters:
    maxLength - maximum content size to process.
    
    Throws:
    
    IllegalArgumentException - if maxLength is negative or Integer.MAX_VALUE
- Method Detail
  - getBreakIterator
```
protected BreakIterator getBreakIterator(String field)
```
    Returns the BreakIterator to use for dividing text into passages. This returns BreakIterator.getSentenceInstance(Locale) by default; subclasses can override to customize.
  - getFormatter
```
protected PassageFormatter getFormatter(String field)
```
    Returns the PassageFormatter to use for formatting passages into highlighted snippets. This returns a new PassageFormatter by default; subclasses can override to customize.
  - getScorer
```
protected PassageScorer getScorer(String field)
```
    Returns the PassageScorer to use for ranking passages. This returns a new PassageScorer by default; subclasses can override to customize.
  - highlight
```
public String[] highlight(String field,
                 Query query,
                 IndexSearcher searcher,
                 TopDocs topDocs)
                   throws IOException
```
    Highlights the top passages from a single field.
    
    Parameters:
    field - field name to highlight. Must have a stored string value and also be indexed with offsets.
    query - query to highlight.
    searcher - searcher that was previously used to execute the query.
    topDocs - TopDocs containing the summary result documents to highlight.
    
    Returns:
    Array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first sentence for the field will be returned.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
    
    IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
  - highlight
```
public String[] highlight(String field,
                 Query query,
                 IndexSearcher searcher,
                 TopDocs topDocs,
                 int maxPassages)
                   throws IOException
```
    Highlights the top-N passages from a single field.
    
    Parameters:
    field - field name to highlight. Must have a stored string value and also be indexed with offsets.
    query - query to highlight.
    searcher - searcher that was previously used to execute the query.
    topDocs - TopDocs containing the summary result documents to highlight.
    maxPassages - The maximum number of top-N ranked passages used to form the highlighted snippets.
    
    Returns:
    Array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first maxPassages sentences from the field will be returned.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
    
    IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
  - highlightFields
```
public Map<String,String[]> highlightFields(String[] fields,
                                   Query query,
                                   IndexSearcher searcher,
                                   TopDocs topDocs)
                                     throws IOException
```
    Highlights the top passages from multiple fields.
    Conceptually, this behaves as a more efficient form of:
```
 Map<String,String[]> m = new HashMap<>();
 for (String field : fields) {
   m.put(field, highlight(field, query, searcher, topDocs));
 }
 return m;
```
    Parameters:
    fields - field names to highlight. Must have a stored string value and also be indexed with offsets.
    query - query to highlight.
    searcher - searcher that was previously used to execute the query.
    topDocs - TopDocs containing the summary result documents to highlight.
    
    Returns:
    Map keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first sentence from the field will be returned.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
    
    IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
  - highlightFields
```
public Map<String,String[]> highlightFields(String[] fields,
                                   Query query,
                                   IndexSearcher searcher,
                                   TopDocs topDocs,
                                   int[] maxPassages)
                                     throws IOException
```
    Highlights the top-N passages from multiple fields.
    Conceptually, this behaves as a more efficient form of:
```
 Map<String,String[]> m = new HashMap<>();
 for (int i = 0; i < fields.length; i++) {
   // maxPassages holds one limit per field
   m.put(fields[i], highlight(fields[i], query, searcher, topDocs, maxPassages[i]));
 }
 return m;
```
    Parameters:
    fields - field names to highlight. Must have a stored string value and also be indexed with offsets.
    query - query to highlight.
    searcher - searcher that was previously used to execute the query.
    topDocs - TopDocs containing the summary result documents to highlight.
    maxPassages - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
    
    Returns:
    Map keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. If no highlights were found for a document, the first maxPassages sentences from the field will be returned.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
    
    IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
  - highlightFields
```
public Map<String,String[]> highlightFields(String[] fieldsIn,
                                   Query query,
                                   IndexSearcher searcher,
                                   int[] docidsIn,
                                   int[] maxPassagesIn)
                                     throws IOException
```
    Highlights the top-N passages from multiple fields, for the provided int[] docids.
    
    Parameters:
    fieldsIn - field names to highlight. Must have a stored string value and also be indexed with offsets.
    query - query to highlight.
    searcher - searcher that was previously used to execute the query.
    docidsIn - array containing the document IDs to highlight.
    maxPassagesIn - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
    
    Returns:
    Map keyed on field name, containing the array of formatted snippets corresponding to the documents in docidsIn. If no highlights were found for a document, the first maxPassagesIn sentences from the field will be returned.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
    
    IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
  - highlightFieldsAsObjects
```
protected Map<String,Object[]> highlightFieldsAsObjects(String[] fieldsIn,
                                            Query query,
                                            IndexSearcher searcher,
                                            int[] docidsIn,
                                            int[] maxPassagesIn)
                                                 throws IOException
```
    Expert: highlights the top-N passages from multiple fields, for the provided int[] docids, to custom Object as returned by the PassageFormatter. Use this API to render to something other than String.
    
    Parameters:
    fieldsIn - field names to highlight. Must have a stored string value and also be indexed with offsets.
    query - query to highlight.
    searcher - searcher that was previously used to execute the query.
    docidsIn - array containing the document IDs to highlight.
    maxPassagesIn - The maximum number of top-N ranked passages per-field used to form the highlighted snippets.
    
    Returns:
    Map keyed on field name, containing the array of formatted snippets (as Objects produced by the PassageFormatter) corresponding to the documents in docidsIn. If no highlights were found for a document, the first maxPassagesIn sentences from the field will be returned.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
    
    IllegalArgumentException - if field was indexed without FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
  - loadFieldValues
```
protected String[][] loadFieldValues(IndexSearcher searcher,
                         String[] fields,
                         int[] docids,
                         int maxLength)
                              throws IOException
```
    Loads the String values for each field X docID to be highlighted. By default this loads from stored fields, but a subclass can change the source. This method should allocate the String[fields.length][docids.length] and fill all values. The returned Strings must be identical to what was indexed.
    
    Throws:
    
    IOException - if an I/O error occurred during processing
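
    The array shape this method is expected to return can be sketched without Lucene. In the following illustration the class name, the in-memory `source` map, the empty string for missing values, and the truncation to `maxLength` are all assumptions made for the example; the real default loads from stored fields.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Lucene.
class FieldValueLoadingSketch {

    /**
     * Fills a String[fields.length][docids.length] array from a hypothetical
     * in-memory source keyed by field name, then by document ID.
     */
    static String[][] loadFieldValues(Map<String, Map<Integer, String>> source,
                                      String[] fields, int[] docids, int maxLength) {
        String[][] contents = new String[fields.length][docids.length];
        for (int f = 0; f < fields.length; f++) {
            Map<Integer, String> byDoc = source.get(fields[f]);
            for (int d = 0; d < docids.length; d++) {
                String value = (byDoc == null) ? null : byDoc.get(docids[d]);
                if (value == null) {
                    value = ""; // assumption: every cell is filled, never null
                }
                if (value.length() > maxLength) {
                    value = value.substring(0, maxLength); // assumption: cap at maxLength
                }
                contents[f][d] = value;
            }
        }
        return contents;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, String>> source = new HashMap<>();
        Map<Integer, String> body = new HashMap<>();
        body.put(7, "hello world");
        source.put("body", body);
        String[][] values = loadFieldValues(source, new String[] {"body"}, new int[] {7}, 100);
        System.out.println(values[0][0]);
    }
}
```

    The outer index follows `fields` and the inner index follows `docids`, matching the allocation described above.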
  - getMultiValuedSeparator
```
protected char getMultiValuedSeparator(String field)
```
    Returns the logical separator between values for multi-valued fields. The default value is a space character, which means passages can span across values, but a subclass can override, for example with U+2029 PARAGRAPH SEPARATOR (PS) if each value holds a discrete passage for highlighting.
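
    The effect of the separator on passage boundaries can be seen with the JDK's sentence BreakIterator alone. This is a rough sketch under stated assumptions: the class and method names are hypothetical, and the values are joined by hand here rather than by the highlighter.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class MultiValuedSeparatorSketch {

    /** Joins the values of a multi-valued field with the given separator. */
    static String join(String[] values, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    /** Counts the sentences the root-locale sentence iterator finds in the text. */
    static int countSentences(String text) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(text);
        int count = 0;
        bi.first();
        while (bi.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] values = {"apache lucene", "postings highlighter"};
        // A space lets one sentence run across both values.
        System.out.println(countSentences(join(values, ' ')));
        // U+2029 PARAGRAPH SEPARATOR forces a boundary between the values.
        System.out.println(countSentences(join(values, '\u2029')));
    }
}
```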
  - getIndexAnalyzer
```
protected Analyzer getIndexAnalyzer(String field)
```
    Returns the analyzer originally used to index the content for field.
    This is used to highlight some MultiTermQueries.
    
    Returns:
    Analyzer or null (the default, meaning no special multi-term processing)
  - getEmptyHighlight
```
protected Passage[] getEmptyHighlight(String fieldName,
                          BreakIterator bi,
                          int maxPassages)
```
    Called to summarize a document when no hits were found. By default this just returns the first maxPassages sentences; subclasses can override to customize.
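
    The default behavior described here, taking the first maxPassages sentences, might look roughly like the following JDK-only sketch. The class name and the `{start, end}` offset pairs are stand-ins for illustration; the real method returns Passage objects.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class EmptyHighlightSketch {

    /** Returns {start, end} offsets for at most maxPassages leading sentences. */
    static List<int[]> firstSentences(String text, int maxPassages) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(text);
        List<int[]> passages = new ArrayList<>();
        int pos = bi.first();
        while (passages.size() < maxPassages) {
            int next = bi.next();
            if (next == BreakIterator.DONE) {
                break; // fewer sentences than requested
            }
            passages.add(new int[] {pos, next});
            pos = next;
        }
        return passages;
    }

    public static void main(String[] args) {
        String text = "One. Two. Three.";
        for (int[] p : firstSentences(text, 2)) {
            System.out.println(text.substring(p[0], p[1]).trim());
        }
    }
}
```

    A subclass overriding getEmptyHighlight could swap this loop for another policy, for example returning no passages at all.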
