public final class Intervals extends Object

Factory methods for IntervalsSource types.

These sources implement minimum-interval algorithms taken from the paper
"Efficient Optimally Lazy Algorithms for Minimal-Interval Semantics".
By default, sources that are sensitive to internal gaps (e.g. PHRASE and MAXGAPS) will
rewrite their sub-sources so that disjunctions of different lengths are pulled up
to the top of the interval tree. For example, PHRASE(OR(PHRASE("a", "b", "c"), "b"), "c")
will automatically rewrite itself to OR(PHRASE("a", "b", "c", "c"), PHRASE("b", "c"))
to ensure that documents containing "b c" are matched. This can lead to less efficient
queries, as more terms need to be loaded (for example, the "c" iterator above is loaded
twice), so if you care more about speed than about accuracy you can use the
or(boolean, IntervalsSource...)
factory method to prevent rewriting.

Modifier and Type | Method and Description |
---|---|
static IntervalsSource | `after(IntervalsSource source, IntervalsSource reference)` Returns intervals from the source that appear after intervals from the reference. |
static IntervalsSource | `atLeast(int minShouldMatch, IntervalsSource... sources)` Return intervals that span combinations of intervals from minShouldMatch of the sources. |
static IntervalsSource | `before(IntervalsSource source, IntervalsSource reference)` Returns intervals from the source that appear before intervals from the reference. |
static IntervalsSource | `containedBy(IntervalsSource small, IntervalsSource big)` Create a contained-by IntervalsSource. Returns intervals from the small query that appear within intervals of the big query. |
static IntervalsSource | `containing(IntervalsSource big, IntervalsSource small)` Create a containing IntervalsSource. Returns intervals from the big source that contain one or more intervals from the small source. |
static IntervalsSource | `extend(IntervalsSource source, int before, int after)` Create an IntervalsSource that wraps another source, extending its intervals by a number of positions before and after. |
static IntervalsSource | `fixField(String field, IntervalsSource source)` Create an IntervalsSource that always returns intervals from a specific field. This is useful for comparing intervals across multiple fields, for example fields that have been analyzed differently, allowing you to search for stemmed terms near unstemmed terms, etc. |
static IntervalsSource | `maxgaps(int gaps, IntervalsSource subSource)` Create an IntervalsSource that filters a sub-source by its gaps. |
static IntervalsSource | `maxwidth(int width, IntervalsSource subSource)` Create an IntervalsSource that filters a sub-source by the width of its intervals. |
static IntervalsSource | `multiterm(Automaton automaton, int maxExpansions, String pattern)` Deprecated. |
static IntervalsSource | `multiterm(Automaton automaton, String pattern)` Deprecated. |
static IntervalsSource | `multiterm(CompiledAutomaton ca, int maxExpansions, String pattern)` Expert: Return an IntervalsSource over the disjunction of all terms that are accepted by the given automaton. WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive. |
static IntervalsSource | `multiterm(CompiledAutomaton ca, String pattern)` Expert: Return an IntervalsSource over the disjunction of all terms that are accepted by the given automaton. |
static IntervalsSource | `nonOverlapping(IntervalsSource minuend, IntervalsSource subtrahend)` Create a non-overlapping IntervalsSource. Returns intervals of the minuend that do not overlap with intervals from the subtrahend. |
static IntervalsSource | `notContainedBy(IntervalsSource small, IntervalsSource big)` Create a not-contained-by IntervalsSource. Returns intervals from the small IntervalsSource that do not appear within intervals from the big IntervalsSource. |
static IntervalsSource | `notContaining(IntervalsSource minuend, IntervalsSource subtrahend)` Create a not-containing IntervalsSource. Returns intervals from the minuend that do not contain intervals of the subtrahend. |
static IntervalsSource | `notWithin(IntervalsSource minuend, int positions, IntervalsSource subtrahend)` Create a not-within IntervalsSource. Returns intervals of the minuend that do not appear within a set number of positions of intervals from the subtrahend query. |
static IntervalsSource | `or(boolean rewrite, IntervalsSource... subSources)` Return an IntervalsSource over the disjunction of a set of sub-sources. |
static IntervalsSource | `or(boolean rewrite, List<IntervalsSource> subSources)` Return an IntervalsSource over the disjunction of a set of sub-sources. |
static IntervalsSource | `or(IntervalsSource... subSources)` Return an IntervalsSource over the disjunction of a set of sub-sources. Automatically rewrites if wrapped by an interval source that is sensitive to internal gaps. |
static IntervalsSource | `or(List<IntervalsSource> subSources)` Return an IntervalsSource over the disjunction of a set of sub-sources. |
static IntervalsSource | `ordered(IntervalsSource... subSources)` Create an ordered IntervalsSource. Returns intervals in which the subsources all appear in the given order. |
static IntervalsSource | `overlapping(IntervalsSource source, IntervalsSource reference)` Returns intervals from a source that overlap with intervals from another source. |
static IntervalsSource | `phrase(IntervalsSource... subSources)` Return an IntervalsSource exposing intervals for a phrase consisting of a list of IntervalsSources. |
static IntervalsSource | `phrase(String... terms)` Return an IntervalsSource exposing intervals for a phrase consisting of a list of terms. |
static IntervalsSource | `prefix(BytesRef prefix)` Return an IntervalsSource over the disjunction of all terms that begin with a prefix. |
static IntervalsSource | `prefix(BytesRef prefix, int maxExpansions)` Expert: Return an IntervalsSource over the disjunction of all terms that begin with a prefix. WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive. |
static IntervalsSource | `term(BytesRef term)` Return an IntervalsSource exposing intervals for a term. |
static IntervalsSource | `term(BytesRef term, Predicate<BytesRef> payloadFilter)` Return an IntervalsSource exposing intervals for a term, filtered by the value of the term's payload at each position. |
static IntervalsSource | `term(String term)` Return an IntervalsSource exposing intervals for a term. |
static IntervalsSource | `term(String term, Predicate<BytesRef> payloadFilter)` Return an IntervalsSource exposing intervals for a term, filtered by the value of the term's payload at each position. |
static IntervalsSource | `unordered(IntervalsSource... subSources)` Create an unordered IntervalsSource. Returns intervals in which all the subsources appear. |
static IntervalsSource | `unorderedNoOverlaps(IntervalsSource a, IntervalsSource b)` Create an unordered IntervalsSource allowing no overlaps between subsources. Returns intervals in which both the subsources appear and do not overlap. |
static IntervalsSource | `wildcard(BytesRef wildcard)` Return an IntervalsSource over the disjunction of all terms that match a wildcard glob. |
static IntervalsSource | `wildcard(BytesRef wildcard, int maxExpansions)` Expert: Return an IntervalsSource over the disjunction of all terms that match a wildcard glob. WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive. |
static IntervalsSource | `within(IntervalsSource source, int positions, IntervalsSource reference)` Returns intervals of the source that appear within a set number of positions of intervals from the reference. |
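
The maxwidth and maxgaps entries in the method summary filter on two different measurements of an interval, and a small stand-alone sketch may help tell them apart. It does not use Lucene: `WidthAndGapsSketch`, `width` and `gaps` are hypothetical names, and the arithmetic is an assumption about the semantics (width counts every position an interval covers, gaps counts the positions lying between its immediate sub-intervals), not the library's implementation.

```java
/** Toy model of interval width and gaps. Illustrative only, not Lucene code. */
final class WidthAndGapsSketch {

    /** Number of positions covered by a span given as {start, end}, both inclusive. */
    static int width(int[] span) {
        return span[1] - span[0] + 1;
    }

    /**
     * Number of positions lying strictly between consecutive sub-spans.
     * Assumes the sub-spans are sorted by start and do not overlap.
     */
    static int gaps(int[][] subSpans) {
        int gaps = 0;
        for (int i = 1; i < subSpans.length; i++) {
            gaps += subSpans[i][0] - subSpans[i - 1][1] - 1;
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Text "a x y b": term "a" sits at position 0, term "b" at position 3.
        int[][] subSpans = {{0, 0}, {3, 3}};
        int[] match = {0, 3};
        System.out.println("width = " + width(match));   // width = 4
        System.out.println("gaps  = " + gaps(subSpans)); // gaps  = 2
    }
}
```

Under this model, a filter allowing at most 3 positions of width would reject the match, while a filter allowing at most 2 gaps would accept it.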

public static IntervalsSource term(BytesRef term)
Return an IntervalsSource exposing intervals for a term.

public static IntervalsSource term(String term)
Return an IntervalsSource exposing intervals for a term.

public static IntervalsSource term(String term, Predicate<BytesRef> payloadFilter)
Return an IntervalsSource exposing intervals for a term, filtered by the value of the term's payload at each position.

public static IntervalsSource term(BytesRef term, Predicate<BytesRef> payloadFilter)
Return an IntervalsSource exposing intervals for a term, filtered by the value of the term's payload at each position.

public static IntervalsSource phrase(String... terms)
Return an IntervalsSource exposing intervals for a phrase consisting of a list of terms.

public static IntervalsSource phrase(IntervalsSource... subSources)
Return an IntervalsSource exposing intervals for a phrase consisting of a list of IntervalsSources.

public static IntervalsSource or(IntervalsSource... subSources)
Return an IntervalsSource over the disjunction of a set of sub-sources.
Automatically rewrites if wrapped by an interval source that is sensitive to internal gaps.

public static IntervalsSource or(boolean rewrite, IntervalsSource... subSources)
Return an IntervalsSource over the disjunction of a set of sub-sources.
Parameters:
rewrite - if false, do not rewrite intervals that are sensitive to internal gaps; this may run more efficiently, but can miss valid hits due to minimization
subSources - the sources to combine

public static IntervalsSource or(List<IntervalsSource> subSources)
Return an IntervalsSource over the disjunction of a set of sub-sources.

public static IntervalsSource or(boolean rewrite, List<IntervalsSource> subSources)
Return an IntervalsSource over the disjunction of a set of sub-sources.
Parameters:
rewrite - if false, do not rewrite intervals that are sensitive to internal gaps; this may run more efficiently, but can miss valid hits due to minimization
subSources - the sources to combine

public static IntervalsSource prefix(BytesRef prefix)
Return an IntervalsSource over the disjunction of all terms that begin with a prefix.
Throws:
IllegalStateException - if the prefix expands to more than 128 terms

public static IntervalsSource prefix(BytesRef prefix, int maxExpansions)
Expert: Return an IntervalsSource over the disjunction of all terms that begin with a prefix.
WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive.
Parameters:
prefix - the prefix to expand
maxExpansions - the maximum number of terms to expand to
Throws:
IllegalStateException - if the prefix expands to more than maxExpansions terms

public static IntervalsSource wildcard(BytesRef wildcard)
Return an IntervalsSource over the disjunction of all terms that match a wildcard glob.
Throws:
IllegalStateException - if the wildcard glob expands to more than 128 terms
See Also:
WildcardQuery for glob format

public static IntervalsSource wildcard(BytesRef wildcard, int maxExpansions)
Expert: Return an IntervalsSource over the disjunction of all terms that match a wildcard glob.
WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive.
Parameters:
wildcard - the glob to expand
maxExpansions - the maximum number of terms to expand to
Throws:
IllegalStateException - if the wildcard glob expands to more than maxExpansions terms
See Also:
WildcardQuery for glob format
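
The expansion limit documented for prefix and wildcard can be pictured with a small stand-alone sketch. This is not how Lucene expands terms internally: `ExpansionLimitSketch`, `globToRegex`, `expand` and the in-memory term list are hypothetical, and the glob syntax assumed here (`*` for any character sequence, `?` for exactly one character, no escape handling) is a simplification. The sketch only mirrors the documented contract of throwing IllegalStateException once more than maxExpansions terms match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Toy model of capped wildcard expansion. Illustrative only, not Lucene code. */
final class ExpansionLimitSketch {

    /** Translates a glob into a regex: '*' matches any sequence, '?' one character. */
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote everything else so regex metacharacters are taken literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Collects the terms matching the glob, failing once the cap would be exceeded. */
    static List<String> expand(String glob, List<String> terms, int maxExpansions) {
        Pattern pattern = globToRegex(glob);
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                if (matches.size() == maxExpansions) {
                    throw new IllegalStateException(
                        "Glob " + glob + " expands to more than " + maxExpansions + " terms");
                }
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "lucid", "solr");
        // Three terms match, which is well under a cap of 128.
        System.out.println(expand("luc*", terms, 128));
        try {
            // The same glob with a cap of 2 is rejected on the third match.
            expand("luc*", terms, 2);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast keeps an overly broad pattern from silently building a very large disjunction, which is the cost the WARNING above refers to.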

@Deprecated
public static IntervalsSource multiterm(Automaton automaton, String pattern)
Deprecated. Use multiterm(CompiledAutomaton, String) instead.
Expert: Return an IntervalsSource over the disjunction of all terms that are accepted by the given automaton.
Parameters:
automaton - accepts the terms to expand to
pattern - string representation of the given automaton, mostly used in exception messages
Throws:
IllegalStateException - if the automaton accepts more than 128 terms

@Deprecated
public static IntervalsSource multiterm(Automaton automaton, int maxExpansions, String pattern)
Deprecated. Use multiterm(CompiledAutomaton, int, String) instead.
Expert: Return an IntervalsSource over the disjunction of all terms that are accepted by the given automaton.
WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive.
Parameters:
automaton - accepts the terms to expand to
maxExpansions - the maximum number of terms to expand to
pattern - string representation of the given automaton, mostly used in exception messages
Throws:
IllegalStateException - if the automaton accepts more than maxExpansions terms

public static IntervalsSource multiterm(CompiledAutomaton ca, String pattern)
Expert: Return an IntervalsSource over the disjunction of all terms that are accepted by the given automaton.
Parameters:
ca - an automaton accepting matching terms
pattern - string representation of the given automaton, mostly used in exception messages
Throws:
IllegalStateException - if the automaton accepts more than 128 terms

public static IntervalsSource multiterm(CompiledAutomaton ca, int maxExpansions, String pattern)
Expert: Return an IntervalsSource over the disjunction of all terms that are accepted by the given automaton.
WARNING: Setting maxExpansions to higher than the default value of 128 can be both slow and memory-intensive.
Parameters:
ca - an automaton accepting matching terms
maxExpansions - the maximum number of terms to expand to
pattern - string representation of the given automaton, mostly used in exception messages
Throws:
IllegalStateException - if the automaton accepts more than maxExpansions terms

public static IntervalsSource maxwidth(int width, IntervalsSource subSource)
Create an IntervalsSource that filters a sub-source by the width of its intervals.
Parameters:
width - the maximum width of intervals in the sub-source to filter
subSource - the sub-source to filter

public static IntervalsSource maxgaps(int gaps, IntervalsSource subSource)
Create an IntervalsSource that filters a sub-source by its gaps.
Parameters:
gaps - the maximum number of gaps in the sub-source to filter
subSource - the sub-source to filter

public static IntervalsSource extend(IntervalsSource source, int before, int after)
Create an IntervalsSource that wraps another source, extending its intervals by a number of positions before and after.
This can be useful for adding defined gaps in a block query; for example, to find 'a b [2 arbitrary terms] c', you can call:
Intervals.phrase(Intervals.term("a"), Intervals.extend(Intervals.term("b"), 0, 2), Intervals.term("c"));
Note that calling IntervalIterator.gaps() on iterators returned by this source delegates directly to the wrapped iterator, and does not include the extensions.
Parameters:
source - the source to extend
before - how many positions to extend before the delegated interval
after - how many positions to extend after the delegated interval

public static IntervalsSource ordered(IntervalsSource... subSources)
Create an ordered IntervalsSource.
Returns intervals in which the subsources all appear in the given order.
Parameters:
subSources - an ordered set of IntervalsSource objects

public static IntervalsSource unordered(IntervalsSource... subSources)
Create an unordered IntervalsSource.
Returns intervals in which all the subsources appear. The subsources may overlap.
Parameters:
subSources - an unordered set of IntervalsSources

public static IntervalsSource unorderedNoOverlaps(IntervalsSource a, IntervalsSource b)
Create an unordered IntervalsSource allowing no overlaps between subsources.
Returns intervals in which both the subsources appear and do not overlap.

public static IntervalsSource fixField(String field, IntervalsSource source)
Create an IntervalsSource that always returns intervals from a specific field.
This is useful for comparing intervals across multiple fields, for example fields that have been analyzed differently, allowing you to search for stemmed terms near unstemmed terms, etc.

public static IntervalsSource nonOverlapping(IntervalsSource minuend, IntervalsSource subtrahend)
Create a non-overlapping IntervalsSource.
Returns intervals of the minuend that do not overlap with intervals from the subtrahend.
Parameters:
minuend - the IntervalsSource to filter
subtrahend - the IntervalsSource to filter by

public static IntervalsSource overlapping(IntervalsSource source, IntervalsSource reference)
Returns intervals from a source that overlap with intervals from another source.
Parameters:
source - the source to filter
reference - the source to filter by

public static IntervalsSource notWithin(IntervalsSource minuend, int positions, IntervalsSource subtrahend)
Create a not-within IntervalsSource.
Returns intervals of the minuend that do not appear within a set number of positions of intervals from the subtrahend query.
Parameters:
minuend - the IntervalsSource to filter
positions - the minimum distance that intervals from the minuend may occur from intervals of the subtrahend
subtrahend - the IntervalsSource to filter by

public static IntervalsSource within(IntervalsSource source, int positions, IntervalsSource reference)
Returns intervals of the source that appear within a set number of positions of intervals from the reference.
Parameters:
source - the IntervalsSource to filter
positions - the maximum distance that intervals of the source may occur from intervals of the reference
reference - the IntervalsSource to filter by

public static IntervalsSource notContaining(IntervalsSource minuend, IntervalsSource subtrahend)
Create a not-containing IntervalsSource.
Returns intervals from the minuend that do not contain intervals of the subtrahend.
Parameters:
minuend - the IntervalsSource to filter
subtrahend - the IntervalsSource to filter by

public static IntervalsSource containing(IntervalsSource big, IntervalsSource small)
Create a containing IntervalsSource.
Returns intervals from the big source that contain one or more intervals from the small source.
Parameters:
big - the IntervalsSource to filter
small - the IntervalsSource to filter by

public static IntervalsSource notContainedBy(IntervalsSource small, IntervalsSource big)
Create a not-contained-by IntervalsSource.
Returns intervals from the small IntervalsSource that do not appear within intervals from the big IntervalsSource.
Parameters:
small - the IntervalsSource to filter
big - the IntervalsSource to filter by

public static IntervalsSource containedBy(IntervalsSource small, IntervalsSource big)
Create a contained-by IntervalsSource.
Returns intervals from the small query that appear within intervals of the big query.
Parameters:
small - the IntervalsSource to filter
big - the IntervalsSource to filter by

public static IntervalsSource atLeast(int minShouldMatch, IntervalsSource... sources)
Return intervals that span combinations of intervals from minShouldMatch of the sources.

public static IntervalsSource before(IntervalsSource source, IntervalsSource reference)
Returns intervals from the source that appear before intervals from the reference.

public static IntervalsSource after(IntervalsSource source, IntervalsSource reference)
Returns intervals from the source that appear after intervals from the reference.
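
The positional filters before, after and within all compare a source interval against a reference interval, and the comparisons can be sketched without Lucene. The class and method names below (`PositionalFilterSketch`, `isBefore`, `isAfter`, `isWithin`) are hypothetical, intervals are plain `{start, end}` arrays with inclusive bounds, and the rule used for within (the source must fit inside the reference widened by `positions` on each side) is an assumption about the semantics rather than a statement of how the library implements it.

```java
/** Toy model of positional interval filters. Illustrative only, not Lucene code. */
final class PositionalFilterSketch {

    /** True if the source span ends before the reference span starts. */
    static boolean isBefore(int[] source, int[] reference) {
        return source[1] < reference[0];
    }

    /** True if the source span starts after the reference span ends. */
    static boolean isAfter(int[] source, int[] reference) {
        return source[0] > reference[1];
    }

    /** True if the source span fits inside the reference widened by `positions` on each side. */
    static boolean isWithin(int[] source, int positions, int[] reference) {
        int widenedStart = Math.max(0, reference[0] - positions);
        long widenedEnd = (long) reference[1] + positions; // long avoids int overflow
        return source[0] >= widenedStart && source[1] <= widenedEnd;
    }

    public static void main(String[] args) {
        // Text "quick brown fox jumps": quick = 0, brown = 1, fox = 2, jumps = 3.
        int[] quick = {0, 0};
        int[] fox = {2, 2};
        int[] jumps = {3, 3};
        System.out.println(isBefore(quick, fox));    // true
        System.out.println(isAfter(jumps, fox));     // true
        System.out.println(isWithin(quick, 1, fox)); // false: widened reference is [1, 3]
        System.out.println(isWithin(quick, 2, fox)); // true: widened reference is [0, 4]
    }
}
```

In this model the filters only decide which source intervals survive; the surviving intervals keep their original start and end positions.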
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.