All Classes and Interfaces
Class
Description
Provides a base class for analysis based offset strategies to extend from.
An abstract implementation of FragListBuilder.
Base FragmentsBuilder implementation that supports colored pre/post tags and multivalued fields.
Finds fragment boundaries: pluggable into BaseFragmentsBuilder
A BoundaryScanner implementation that uses BreakIterator to find boundaries in the text.
A PassageAdjuster that adjusts the Passage range to word boundaries hinted by the given BreakIterator.
Matches a character array
A BreakIterator that breaks the text whenever a certain separator, provided as a constructor argument, is found.
Simple Encoder implementation that does not modify the output
Creates a formatted snippet from the top passages.
Encodes original text.
Another highlighter implementation.
FieldFragList has a list of "frag info" that is used by FragmentsBuilder class to create
fragments (snippets).
List of term offsets + weight for a frag info
Represents the list of term offsets for some text
Internal highlighter abstraction that operates on a per field basis.
Ultimately returns an OffsetsEnum yielding potentially highlightable words in the text.
FieldPhraseList has a list of WeightedPhraseInfo that is used by FragListBuilder to create a FieldFragList object.
Represents the list of term offsets and boost for some text
Term offsets (start + end)
FieldQuery breaks down query object into terms/phrases and keeps them in a QueryPhraseMap
structure.
Internal structure of a query for highlighting: represents a nested query structure
FieldTermStack is a stack that keeps query terms in the specified field of the document to be highlighted.
Single term with its position/offsets in the document and IDF weight.
A factory of MatchHighlighter.FieldValueHighlighter classes that cover typical use cases (verbatim values, highlights, abbreviations).
Processes terms found in the original text, typically by applying some form of mark-up to highlight terms in HTML search results pages.
FragListBuilder is an interface for FieldFragList builder classes.
Implements the policy for breaking text into multiple fragments for consideration by the Highlighter class.
FragmentsBuilder is an interface for fragments (snippets) builder classes.
Formats text with different color intensity depending on the score of the term.
Marks up highlighted terms found in the best sections of text, using configurable Fragmenter, Scorer, Formatter, Encoder and tokenizers.
Exception thrown if TokenStream Tokens are incompatible with provided text
Associates a label with a CharArrayMatcher to distinguish different sources for terms in
highlighting
Wraps another BreakIterator to skip past breaks that would result in passages that are too short.
An example highlighter that combines several lower-level utility classes in this package into a fully featured, ready-to-use component.
Single document's highlights.
Actual per-field highlighter.
An OffsetRange of a match, together with the source query that caused it.
Utility class to compute a list of "match regions" for a given query, searcher and document(s) using the Matches API.
Access to field values of the highlighted document.
A callback invoked for each document selected by the query.
Uses an Analyzer on content to get offsets and then populates a MemoryIndex.
FieldOffsetStrategy that combines offsets from multiple fields.
Never returns offsets.
Fragmenter implementation which does not fragment the text.
This TokenFilter limits the number of tokens while indexing by adding up the current offset.
A non-empty range of offset positions.
An enumeration/iterator of a term and its offsets for use by FieldHighlighter.
A view over several OffsetsEnum instances, merging them in-place
Based on a MatchesIterator; does not look at submatches.
Based on a MatchesIterator with submatches.
Based on a PostingsEnum -- the typical/standard OE impl.
This strategy retrieves offsets directly from MatchesIterator, if they are available, otherwise it falls back to using OffsetsFromPositions.
This strategy applies to fields with stored positions but no offsets.
This strategy works for fields where we know the match occurred but there are no known positions
or offsets.
This strategy works for fields where we know the match occurred but there are no known positions
or offsets.
Determines how match offset regions are computed from MatchesIterator.
A per-field supplier of OffsetsRetrievalStrategy.
Overlays a 2nd LeafReader for the terms of one field, otherwise the primary reader is consulted.
A passage is a fragment of source text, scored and possibly with a list of sub-offsets (markers)
to be highlighted.
Represents a passage (typically a sentence of the document).
Adjusts the range of one or more passages over a given value.
Formats a collection of passages over a given string, cleaning up and
resolving restrictions concerning overlaps, allowed sub-ranges over the input string and length
restrictions.
Creates a formatted snippet from the top passages.
Ranks passages found by UnifiedHighlighter.
Selects fragments of text that score best for the given set of highlight markers.
Helps the FieldOffsetStrategy with position sensitive queries (e.g.
Utility class to record Positions Spans
Uses offsets in postings -- IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS.
Like PostingsOffsetStrategy but also uses term vectors (only terms needed) for multi-term queries.
Scorer implementation which scores text fragments by the number of unique query terms found.
Utility class used to extract the terms used in a query, plus any weights.
Scorer implementation which scores text fragments by the number of unique query terms found.
An implementation of FragmentsBuilder that outputs score-order fragments.
Comparator for FieldFragList.WeightedFragInfo by boost, breaking ties by offset.
A Scorer is responsible for scoring a stream of tokens.
Simple boundary scanner implementation that divides fragments based on a set of separator
characters.
A simple implementation of FieldFragList.
A simple implementation of FragListBuilder.
Fragmenter implementation which breaks text up into same-size fragments with no concerns over spotting sentence boundaries.
A simple implementation of FragmentsBuilder.
Simple Encoder implementation to escape text for HTML output
Simple Formatter implementation to highlight terms with a pre and post tag.
Fragmenter implementation which breaks text up into same-size fragments but does not split up Spans.
An implementation class of FragListBuilder that generates one FieldFragList.WeightedFragInfo object.
Formats text with different color intensity depending on the score of the term using the span tag.
Virtually slices the text on both sides of every occurrence of the specified character.
Wraps a Terms with a LeafReader, typically from term vectors.
Uses term vectors that contain offsets.
Low-level class used to record information about a section of a document with a score.
One, or several overlapping tokens, along with the score(s) and the scope of the original text.
Convenience methods for obtaining a TokenStream for use with the Highlighter - can obtain from term vectors with offsets and positions or from an Analyzer re-parsing the stored content.
TokenStream created from a term vector field.
Analyzes the text, producing a single OffsetsEnum wrapping the TokenStream filtered to terms in the query, including wildcards.
A parameter object to hold the components a FieldOffsetStrategy needs.
A Highlighter that can get offsets from either postings (IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS), term vectors (FieldType.setStoreTermVectorOffsets(boolean)), or via re-analyzing text.
Builder for UnifiedHighlighter.
Flags for controlling highlighting behavior.
Fetches stored fields for highlighting.
Source of term offsets; essential for highlighting.
A weighted implementation of FieldFragList.
A weighted implementation of FragListBuilder.
Lightweight class to hold term, weight, and positions used for scoring this term.
Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
This class makes sure that if both position sensitive and insensitive versions of the same term are added, the position insensitive one wins.
Lightweight class to hold term and a weight value used for scoring this term
Just produces one single fragment for the entire text
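
Several entries above describe the same basic pipeline: offset ranges of matches, an encoder that escapes text for HTML output, and a formatter that wraps highlighted terms in a pre and post tag. The following is a minimal JDK-only sketch of that idea, not the library's implementation. The names HighlightSketch, Range, escapeHtml and highlight are hypothetical and do not correspond to classes in this package; the sketch also assumes the ranges are sorted and do not overlap.

```java
import java.util.List;

public class HighlightSketch {

    /** Hypothetical half-open [from, to) character range; not the package's OffsetRange. */
    public static final class Range {
        public final int from;
        public final int to;

        public Range(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Escapes the characters that are unsafe inside HTML text content. */
    public static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Wraps each range in pre/post tags and escapes everything else.
     * Assumption: ranges are sorted by start offset and do not overlap.
     */
    public static String highlight(String text, List<Range> ranges, String pre, String post) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Range r : ranges) {
            // Unhighlighted text between the previous range and this one.
            out.append(escapeHtml(text.substring(pos, r.from)));
            // The highlighted term itself, escaped and wrapped in the tags.
            out.append(pre).append(escapeHtml(text.substring(r.from, r.to))).append(post);
            pos = r.to;
        }
        // Trailing text after the last range.
        out.append(escapeHtml(text.substring(pos)));
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "a < b and b > c";
        List<Range> ranges = List.of(new Range(4, 5), new Range(10, 11));
        System.out.println(highlight(text, ranges, "<b>", "</b>"));
    }
}
```

Escaping is applied to each slice separately so that the pre and post tags themselves are never escaped; a real highlighter would additionally select and score passages before formatting.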