Package org.apache.lucene.queries.spans
The calculus of spans.
A span is a <doc,startPosition,endPosition> tuple that is enumerated by class Spans.
The following span query operators are implemented:
- A SpanTermQuery matches all spans containing a particular Term. This should not be used for terms that are indexed at position Integer.MAX_VALUE.
- A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQuery objects) and inter-phrase proximity (when constructed from other SpanNearQuery objects).
- A SpanWithinQuery matches spans which occur inside other spans.
- A SpanContainingQuery matches spans which contain other spans.
- A SpanOrQuery merges spans from a number of other SpanQuery objects.
- A SpanNotQuery removes spans matching one SpanQuery which overlap (or come near) those of another. This can be used, e.g., to implement within-paragraph search.
- A SpanFirstQuery matches spans matching q whose end position is less than n. This can be used to constrain matches to the first part of the document.
- A SpanPositionRangeQuery is a more general form of SpanFirstQuery that can constrain matches to arbitrary portions of the document.
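The containment and exclusion operators above all compare the start and end positions of spans within the same document. As a rough illustration only, the following plain-Java sketch models a span as a <doc,start,end> tuple and applies simplified within, containing and not filters to in-memory lists. The class SpanModel, its Span type and all helper names are hypothetical, the end position is assumed to be exclusive, and the "comes near" variant of SpanNotQuery is left out. It does not use or reproduce Lucene's lazy Spans enumeration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy in-memory model of span filtering; hypothetical, not Lucene code. */
public class SpanModel {

    /** A span tuple: document id, start position, end position (assumed exclusive). */
    public static final class Span {
        public final int doc, start, end;
        public Span(int doc, int start, int end) { this.doc = doc; this.start = start; this.end = end; }
        @Override public String toString() { return "<" + doc + "," + start + "," + end + ">"; }
    }

    /** True if big covers little in the same document. */
    public static boolean contains(Span big, Span little) {
        return big.doc == little.doc && big.start <= little.start && little.end <= big.end;
    }

    /** True if the two spans share at least one position in the same document. */
    public static boolean overlaps(Span a, Span b) {
        return a.doc == b.doc && a.start < b.end && b.start < a.end;
    }

    /** SpanWithinQuery-like: keep the little spans that lie inside some big span. */
    public static List<Span> within(List<Span> big, List<Span> little) {
        List<Span> out = new ArrayList<>();
        for (Span l : little) {
            for (Span b : big) {
                if (contains(b, l)) { out.add(l); break; }
            }
        }
        return out;
    }

    /** SpanContainingQuery-like: keep the big spans that cover some little span. */
    public static List<Span> containing(List<Span> big, List<Span> little) {
        List<Span> out = new ArrayList<>();
        for (Span b : big) {
            for (Span l : little) {
                if (contains(b, l)) { out.add(b); break; }
            }
        }
        return out;
    }

    /** SpanNotQuery-like: drop included spans that overlap any excluded span. */
    public static List<Span> not(List<Span> include, List<Span> exclude) {
        List<Span> out = new ArrayList<>();
        for (Span i : include) {
            boolean hit = false;
            for (Span e : exclude) {
                if (overlaps(i, e)) { hit = true; break; }
            }
            if (!hit) out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> sentences = Arrays.asList(new Span(0, 0, 5), new Span(0, 5, 12));
        List<Span> terms = Arrays.asList(new Span(0, 2, 3), new Span(0, 4, 6));
        System.out.println(within(sentences, terms));      // [<0,2,3>]
        System.out.println(containing(sentences, terms));  // [<0,0,5>]
        System.out.println(not(terms, Arrays.asList(new Span(0, 5, 6)))); // [<0,2,3>]
    }
}
```

The span <0,4,6> straddles the boundary between the two sentence spans, so it is neither within nor contained by either of them, and it is the one removed by the exclusion at position 5.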
For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:
    SpanQuery john = new SpanTermQuery(new Term("content", "john"));
    SpanQuery kerry = new SpanTermQuery(new Term("content", "kerry"));
    SpanQuery george = new SpanTermQuery(new Term("content", "george"));
    SpanQuery bush = new SpanTermQuery(new Term("content", "bush"));
    SpanQuery johnKerry = new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);
    SpanQuery georgeBush = new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);
    SpanQuery johnKerryNearGeorgeBush = new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);
    SpanQuery johnKerryNearGeorgeBushAtStart = new SpanFirstQuery(johnKerryNearGeorgeBush, 100);
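To see why a slop of 0 with inOrder set to true behaves like a phrase match, it can help to work through token positions by hand. The sketch below is a simplified plain-Java model under stated assumptions: one position per whitespace-separated token, an exclusive end position, and slop for two ordered, non-overlapping single-term spans measured as the gap between the end of the first and the start of the second. The class NearSlopSketch and its helpers are hypothetical. The model enumerates every qualifying pair, which is not necessarily what a real Spans enumeration reports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hand-worked model of ordered proximity between two terms; not Lucene code. */
public class NearSlopSketch {

    /** Positions at which term occurs in a lowercased, whitespace-tokenized text. */
    public static List<Integer> positions(String text, String term) {
        List<Integer> out = new ArrayList<>();
        String[] tokens = text.toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) out.add(i);
        }
        return out;
    }

    /** Returns {start, end} pairs where first precedes second with gap <= slop. */
    public static List<int[]> orderedNear(List<Integer> first, List<Integer> second, int slop) {
        List<int[]> out = new ArrayList<>();
        for (int a : first) {
            for (int b : second) {
                // A term at position a is modeled as the span [a, a + 1).
                int gap = b - (a + 1);
                if (gap >= 0 && gap <= slop) out.add(new int[] {a, b + 1});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Senator John Kerry met John F Kerry";
        List<Integer> john = positions(text, "john");   // positions 1 and 4
        List<Integer> kerry = positions(text, "kerry"); // positions 2 and 6
        for (int slop = 0; slop <= 1; slop++) {
            for (int[] m : orderedNear(john, kerry, slop)) {
                System.out.println("slop=" + slop + " match [" + m[0] + "," + m[1] + ")");
            }
        }
    }
}
```

With slop 0 only the adjacent pair at positions 1 and 2 matches. Raising the slop to 1 also admits "John F Kerry", where one token sits between the two terms.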
Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:
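    // Both clauses are required, so each is added with Occur.MUST.
    BooleanQuery.Builder builder = new BooleanQuery.Builder();
    builder.add(johnKerryNearGeorgeBushAtStart, BooleanClause.Occur.MUST);
    builder.add(new TermQuery(new Term("content", "iraq")), BooleanClause.Occur.MUST);
    Query query = builder.build();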
Interface Summary
- SpanCollector: An interface defining the collection of postings information from the leaves of a Spans
Class Summary
- FieldMaskingSpanQuery: Wrapper to allow SpanQuery objects participate in composite single-field SpanQueries by 'lying' about their search field.
- FilterSpans: A Spans implementation wrapping another spans instance, allowing to filter spans matches easily by implementing FilterSpans.accept(org.apache.lucene.queries.spans.Spans)
- NearSpansOrdered: A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
- NearSpansUnordered: Similar to NearSpansOrdered, but for the unordered case.
- SpanContainingQuery: Keep matches that contain another SpanScorer.
- SpanDisiWrapper: Wrapper used in SpanDisiPriorityQueue.
- SpanFirstQuery: Matches spans near the beginning of a field.
- SpanMultiTermQueryWrapper<Q extends MultiTermQuery>: Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes.
- SpanMultiTermQueryWrapper.SpanRewriteMethod: Abstract class that defines how the query is rewritten.
- SpanMultiTermQueryWrapper.TopTermsSpanBooleanQueryRewrite: A rewrite method that first translates each term into a SpanTermQuery in a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
- SpanNearQuery: Matches spans which are near one another.
- SpanNearQuery.Builder: A builder for SpanNearQueries
- SpanNotQuery: Removes matches which overlap with another SpanQuery or which are within x tokens before or y tokens after another SpanQuery.
- SpanOrQuery: Matches the union of its clauses.
- SpanPositionCheckQuery: Base class for filtering a SpanQuery based on the position of a match.
- SpanPositionRangeQuery: Checks to see if the SpanPositionCheckQuery.getMatch() lies between a start and end position
- SpanQuery: Base class for span-based queries.
- Spans: Iterates through combinations of start/end positions per-doc.
- SpanScorer
- SpanTermQuery: Matches spans containing a term.
- SpanWeight: Expert-only.
- SpanWithinQuery: Keep matches that are contained within another Spans.
- TermSpans: Expert: Public for extension only.
Enum Summary
- FilterSpans.AcceptStatus: Status returned from FilterSpans.accept(Spans) that indicates whether a candidate match should be accepted, rejected, or rejected and move on to the next document.
- SpanWeight.Postings: Enumeration defining what postings information should be retrieved from the index for a given Spans
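The three-way accept status described above can be pictured as a loop that keeps a candidate, skips it, or abandons the rest of the current document. The following plain-Java sketch is an illustration only: the class AcceptStatusSketch, its Status enum, and the position check are hypothetical stand-ins, and the candidates are assumed to arrive sorted by start position. It does not call FilterSpans or any other Lucene class.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a three-way accept decision over span candidates; not Lucene code. */
public class AcceptStatusSketch {

    /** Models the outcomes: accepted, rejected, rejected and move to the next document. */
    public enum Status { YES, NO, NO_MORE_IN_CURRENT_DOC }

    /** Hypothetical check: accept candidates that end at or before limit. */
    public static Status accept(int start, int end, int limit) {
        // Candidates are sorted by start, so nothing later can fit either.
        if (start >= limit) return Status.NO_MORE_IN_CURRENT_DOC;
        return end <= limit ? Status.YES : Status.NO;
    }

    /** Filters one document's candidates, given as {start, end} pairs sorted by start. */
    public static List<int[]> filter(int[][] candidates, int limit) {
        List<int[]> accepted = new ArrayList<>();
        for (int[] c : candidates) {
            Status s = accept(c[0], c[1], limit);
            if (s == Status.NO_MORE_IN_CURRENT_DOC) break; // give up on this document
            if (s == Status.YES) accepted.add(c);          // NO simply skips the candidate
        }
        return accepted;
    }

    public static void main(String[] args) {
        int[][] candidates = {{0, 2}, {1, 6}, {3, 4}, {7, 8}, {8, 9}};
        for (int[] c : filter(candidates, 5)) {
            System.out.println("accepted [" + c[0] + "," + c[1] + ")");
        }
    }
}
```

The third outcome exists so that a filter can stop early: once a candidate starts past the limit, later candidates in the same document are never examined.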