Package org.apache.lucene.queries.spans
The calculus of spans.
A span is a <doc,startPosition,endPosition> tuple that is enumerated by class Spans.
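The tuple described above can be pictured as a small value type. The following is only an illustrative sketch: `SpanTuple` and `ofTerm` are hypothetical names that are not part of Lucene, where Spans is an iterator rather than a value class. The sketch also assumes that the end position is exclusive, i.e. one past the last token covered.

```java
/** Toy value type for illustration only; not part of Lucene. */
final class SpanTuple {
    final int doc;   // document id
    final int start; // position of the first token covered by the span
    final int end;   // assumed exclusive: one past the last covered token

    SpanTuple(int doc, int start, int end) {
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("need 0 <= start < end");
        }
        this.doc = doc;
        this.start = start;
        this.end = end;
    }

    /** A single term occurrence at the given position covers exactly one token. */
    static SpanTuple ofTerm(int doc, int position) {
        return new SpanTuple(doc, position, position + 1);
    }

    /** Number of token positions the span covers. */
    int width() {
        return end - start;
    }

    @Override
    public String toString() {
        return "<" + doc + "," + start + "," + end + ">";
    }
}
```

Under these assumptions, `SpanTuple.ofTerm(3, 7)` models a term found at position 7 of document 3 and has a width of one token.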
The following span query operators are implemented:
- A SpanTermQuery matches all spans containing a particular Term. This should not be used for terms that are indexed at position Integer.MAX_VALUE.
- A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQuery instances) and inter-phrase proximity (when constructed from other SpanNearQuery instances).
- A SpanWithinQuery matches spans which occur inside another span.
- A SpanContainingQuery matches spans which contain another span.
- A SpanOrQuery merges spans from a number of other SpanQuery instances.
- A SpanNotQuery removes spans matching one SpanQuery which overlap (or come near) spans matching another. This can be used, e.g., to implement within-paragraph search.
- A SpanFirstQuery matches spans matching q whose end position is less than n. This can be used to constrain matches to the first part of the document.
- A SpanPositionRangeQuery is a more general form of SpanFirstQuery that can constrain matches to arbitrary portions of the document.
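To make the semantics of two of these operators concrete, here is a toy model of ordered proximity and of containment over plain in-memory tuples. It is a sketch under stated assumptions, not Lucene's implementation: `ToySpanOps`, `Span`, `nearOrdered` and `within` are hypothetical names, the end position is assumed to be exclusive, and slop is simplified to the number of positions between two spans.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy span algebra for illustration only; this is not Lucene's implementation. */
final class ToySpanOps {

    /** Value form of a span; end is assumed exclusive. */
    static final class Span {
        final int doc;
        final int start;
        final int end;

        Span(int doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "<" + doc + "," + start + "," + end + ">";
        }
    }

    /** Ordered near: b follows a in the same doc with at most slop positions in between. */
    static List<Span> nearOrdered(List<Span> first, List<Span> second, int slop) {
        List<Span> out = new ArrayList<>();
        for (Span a : first) {
            for (Span b : second) {
                int gap = b.start - a.end; // positions strictly between the two spans
                if (a.doc == b.doc && gap >= 0 && gap <= slop) {
                    out.add(new Span(a.doc, a.start, b.end)); // merged span covers both
                }
            }
        }
        return out;
    }

    /** Within: keep each little span that lies inside some big span of the same doc. */
    static List<Span> within(List<Span> big, List<Span> little) {
        List<Span> out = new ArrayList<>();
        for (Span l : little) {
            for (Span b : big) {
                if (l.doc == b.doc && b.start <= l.start && l.end <= b.end) {
                    out.add(l);
                    break; // report each little span at most once
                }
            }
        }
        return out;
    }
}
```

In this model, a slop of 0 between two single-term spans behaves like an exact phrase match, because the second term must start exactly where the first one ends.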
For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:
SpanQuery john = new SpanTermQuery(new Term("content", "john"));
SpanQuery kerry = new SpanTermQuery(new Term("content", "kerry"));
SpanQuery george = new SpanTermQuery(new Term("content", "george"));
SpanQuery bush = new SpanTermQuery(new Term("content", "bush"));
SpanQuery johnKerry = new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);
SpanQuery georgeBush = new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);
SpanQuery johnKerryNearGeorgeBush = new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);
SpanQuery johnKerryNearGeorgeBushAtStart = new SpanFirstQuery(johnKerryNearGeorgeBush, 100);
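The "within the first 100 words" constraint in this example can be pictured as a filter on start and end positions. The sketch below is a toy model rather than Lucene code: `ToyPositionFilter`, `positionRange` and `first` are hypothetical names, each span is a plain {start, end} pair for a single document, and the end position is assumed to be exclusive, so a span is kept when it lies fully inside [lo, hi).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy position filter for illustration only; not part of Lucene. */
final class ToyPositionFilter {

    /** Keeps {start, end} pairs (end assumed exclusive) that lie fully inside [lo, hi). */
    static List<int[]> positionRange(List<int[]> spans, int lo, int hi) {
        List<int[]> out = new ArrayList<>();
        for (int[] s : spans) {
            if (s[0] >= lo && s[1] <= hi) {
                out.add(s);
            }
        }
        return out;
    }

    /** "First n positions" is the special case of a range that starts at 0. */
    static List<int[]> first(List<int[]> spans, int n) {
        return positionRange(spans, 0, n);
    }
}
```

With spans {2, 4}, {98, 101} and {150, 152}, `first(spans, 100)` keeps only {2, 4} in this model, since the second span runs past position 100.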
Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:
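// Both clauses are required (MUST), matching the original required=true, prohibited=false flags.
Query query = new BooleanQuery.Builder()
    .add(johnKerryNearGeorgeBushAtStart, BooleanClause.Occur.MUST)
    .add(new TermQuery(new Term("content", "iraq")), BooleanClause.Occur.MUST)
    .build();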
Class Description
- Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field.
- A Spans implementation wrapping another spans instance, allowing span matches to be filtered easily by implementing FilterSpans.accept(org.apache.lucene.queries.spans.Spans).
- Status returned from FilterSpans.accept(Spans) that indicates whether a candidate match should be accepted, rejected, or rejected and move on to the next document.
- A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
- Similar to NearSpansOrdered, but for the unordered case.
- An interface defining the collection of postings information from the leaves of a Spans.
- Keep matches that contain another SpanScorer.
- Wrapper used in SpanDisiPriorityQueue.
- Matches spans near the beginning of a field.
- SpanMultiTermQueryWrapper<Q extends MultiTermQuery>: Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes.
- Abstract class that defines how the query is rewritten.
- A rewrite method that first translates each term into a SpanTermQuery in a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
- Matches spans which are near one another.
- A builder for SpanNearQueries.
- Removes matches which overlap with another SpanQuery or which are within x tokens before or y tokens after another SpanQuery.
- Matches the union of its clauses.
- Base class for filtering a SpanQuery based on the position of a match.
- Checks to see if the SpanPositionCheckQuery.getMatch() lies between a start and end position.
- Base class for span-based queries.
- Iterates through combinations of start/end positions per-doc.
- Matches spans containing a term.
- Expert-only.
- Enumeration defining what postings information should be retrieved from the index for a given Spans.
- Keep matches that are contained within another Spans.
- Expert: Public for extension only.