Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.search | Code to search indices. |
org.apache.lucene.util.automaton | Finite-state automaton for regular expressions. |
Modifier and Type | Method | Description |
---|---|---|
Automaton | TokenStreamToAutomaton.toAutomaton(TokenStream in) | Pulls the graph (including PositionLengthAttribute) from the provided TokenStream, and creates the corresponding automaton where arcs are bytes (or Unicode code points if unicodeArcs = true) from each term. |
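To make the idea of a token graph concrete, the following is a minimal sketch, not Lucene code. It assumes each token carries a position increment and a position length, treats token positions as states, and lists one arc per token. The class TokenGraphSketch and its Token type are hypothetical names introduced only for illustration. The sketch keeps whole terms as arc labels and does not expand them into bytes or code points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; this is not TokenStreamToAutomaton.
final class TokenGraphSketch {
    /** One token: its text, position increment, and position length. */
    static final class Token {
        final String term;
        final int posInc;
        final int posLen;

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    /** Lists arcs "from -term-> to", where states are token positions. */
    static List<String> arcs(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        int pos = -1; // position before the first token
        for (Token t : tokens) {
            pos += t.posInc;              // a zero increment stacks the token on the previous position
            int end = pos + t.posLen;     // a position length above 1 spans several positions
            out.add(pos + " -" + t.term + "-> " + end);
        }
        return out;
    }
}
```

With the tokens wifi (increment 1, length 2), wi (increment 0, length 1), fi (increment 1, length 1) and network (increment 1, length 1), the sketch yields two parallel paths from position 0 to position 2, followed by a single arc for network.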
Modifier and Type | Field | Description |
---|---|---|
protected Automaton | AutomatonQuery.automaton | the automaton to match index terms against |
Modifier and Type | Method | Description |
---|---|---|
Automaton | AutomatonQuery.getAutomaton() | Returns the automaton used to create this query. |
static Automaton | PrefixQuery.toAutomaton(BytesRef prefix) | Build an automaton accepting all terms with the specified prefix. |
static Automaton | TermRangeQuery.toAutomaton(BytesRef lowerTerm, BytesRef upperTerm, boolean includeLower, boolean includeUpper) | |
static Automaton | WildcardQuery.toAutomaton(Term wildcardquery) | Convert Lucene wildcard syntax into an automaton. |
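As a rough illustration of the language that a wildcard automaton accepts, here is a minimal, self-contained matcher. It assumes the usual wildcard syntax in which * matches any character sequence (including the empty one) and ? matches exactly one character. Escape handling is left out, and the sketch works on Java chars rather than code points. The class WildcardSketch is a hypothetical name; it uses dynamic programming instead of building an automaton.

```java
// Hypothetical illustration; this is not how WildcardQuery is implemented.
final class WildcardSketch {
    /** Returns true if text matches pattern, where '*' is any sequence and '?' is any single char. */
    static boolean matches(String pattern, String text) {
        int p = pattern.length();
        int t = text.length();
        // ok[i][j]: the first i pattern chars match the first j text chars
        boolean[][] ok = new boolean[p + 1][t + 1];
        ok[0][0] = true;
        for (int i = 1; i <= p; i++) {
            char c = pattern.charAt(i - 1);
            for (int j = 0; j <= t; j++) {
                if (c == '*') {
                    // '*' either matches nothing, or absorbs one more text char
                    ok[i][j] = ok[i - 1][j] || (j > 0 && ok[i][j - 1]);
                } else if (j > 0) {
                    // '?' matches any single char; anything else must match literally
                    ok[i][j] = ok[i - 1][j - 1] && (c == '?' || c == text.charAt(j - 1));
                }
            }
        }
        return ok[p][t];
    }
}
```

For example, the pattern te?t* matches testing, while te?t does not, because no trailing wildcard absorbs the remaining characters.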
Constructor | Description |
---|---|
AutomatonQuery(Term term, Automaton automaton) | Create a new AutomatonQuery from an Automaton. |
AutomatonQuery(Term term, Automaton automaton, int maxDeterminizedStates) | Create a new AutomatonQuery from an Automaton. |
AutomatonQuery(Term term, Automaton automaton, int maxDeterminizedStates, boolean isBinary) | Create a new AutomatonQuery from an Automaton. |
Modifier and Type | Field | Description |
---|---|---|
Automaton | CompiledAutomaton.automaton | Two dimensional array of transitions, indexed by state number for traversal. |
Modifier and Type | Method | Description |
---|---|---|
static Automaton | DaciukMihovAutomatonBuilder.build(Collection<BytesRef> input) | Build a minimal, deterministic automaton from a sorted list of BytesRef representing strings in UTF-8. |
static Automaton | Operations.complement(Automaton a, int maxDeterminizedStates) | Returns a (deterministic) automaton that accepts the complement of the language of the given automaton. |
static Automaton | Operations.concatenate(Automaton a1, Automaton a2) | Returns an automaton that accepts the concatenation of the languages of the given automata. |
static Automaton | Operations.concatenate(List<Automaton> l) | Returns an automaton that accepts the concatenation of the languages of the given automata. |
Automaton | UTF32ToUTF8.convert(Automaton utf32) | Converts an incoming utf32 automaton to an equivalent utf8 one. |
static Automaton | Operations.determinize(Automaton a, int maxDeterminizedStates) | Determinizes the given automaton. |
Automaton | Automaton.Builder.finish() | Compiles all added states and transitions into a new Automaton and returns it. |
Automaton | TooComplexToDeterminizeException.getAutomaton() | Returns the automaton that caused this exception, if any. |
Automaton | AutomatonProvider.getAutomaton(String name) | Returns automaton of the given name. |
static Automaton | Operations.intersection(Automaton a1, Automaton a2) | Returns an automaton that accepts the intersection of the languages of the given automata. |
static Automaton | Automata.makeAnyBinary() | Returns a new (deterministic) automaton that accepts all binary terms. |
static Automaton | Automata.makeAnyChar() | Returns a new (deterministic) automaton that accepts any single codepoint. |
static Automaton | Automata.makeAnyString() | Returns a new (deterministic) automaton that accepts all strings. |
static Automaton | Automata.makeBinary(BytesRef term) | Returns a new (deterministic) automaton that accepts the single given binary term. |
static Automaton | Automata.makeBinaryInterval(BytesRef min, boolean minInclusive, BytesRef max, boolean maxInclusive) | Creates a new deterministic, minimal automaton accepting all binary terms in the specified interval. |
static Automaton | Automata.makeChar(int c) | Returns a new (deterministic) automaton that accepts a single codepoint of the given value. |
static Automaton | Automata.makeCharRange(int min, int max) | Returns a new (deterministic) automaton that accepts a single codepoint whose value is in the given interval (including both end points). |
static Automaton | Automata.makeDecimalInterval(int min, int max, int digits) | Returns a new automaton that accepts strings representing decimal (base 10) non-negative integers in the given interval. |
static Automaton | Automata.makeEmpty() | Returns a new (deterministic) automaton with the empty language. |
static Automaton | Automata.makeEmptyString() | Returns a new (deterministic) automaton that accepts only the empty string. |
static Automaton | Automata.makeString(int[] word, int offset, int length) | Returns a new (deterministic) automaton that accepts the single given string from the specified unicode code points. |
static Automaton | Automata.makeString(String s) | Returns a new (deterministic) automaton that accepts the single given string. |
static Automaton | Automata.makeStringUnion(Collection<BytesRef> utf8Strings) | Returns a new (deterministic and minimal) automaton that accepts the union of the given collection of BytesRefs representing UTF-8 encoded strings. |
static Automaton | MinimizationOperations.minimize(Automaton a, int maxDeterminizedStates) | Minimizes (and determinizes if not already deterministic) the given automaton using Hopcroft's algorithm. |
static Automaton | Operations.minus(Automaton a1, Automaton a2, int maxDeterminizedStates) | Returns a (deterministic) automaton that accepts the intersection of the language of a1 and the complement of the language of a2. |
static Automaton | Operations.optional(Automaton a) | Returns an automaton that accepts the union of the empty string and the language of the given automaton. |
static Automaton | Operations.removeDeadStates(Automaton a) | Removes transitions to dead states (a state is "dead" if it is not reachable from the initial state or no accept state is reachable from it). |
static Automaton | Operations.repeat(Automaton a) | Returns an automaton that accepts the Kleene star (zero or more concatenated repetitions) of the language of the given automaton. |
static Automaton | Operations.repeat(Automaton a, int count) | Returns an automaton that accepts count or more concatenated repetitions of the language of the given automaton. |
static Automaton | Operations.repeat(Automaton a, int min, int max) | Returns an automaton that accepts between min and max (including both) concatenated repetitions of the language of the given automaton. |
static Automaton | Operations.reverse(Automaton a) | Returns an automaton accepting the reverse language. |
Automaton | RegExp.toAutomaton() | Constructs new Automaton from this RegExp. |
Automaton | RegExp.toAutomaton(AutomatonProvider automaton_provider, int maxDeterminizedStates) | Constructs new Automaton from this RegExp. |
Automaton | LevenshteinAutomata.toAutomaton(int n) | Compute a DFA that accepts all strings within an edit distance of n. |
Automaton | RegExp.toAutomaton(int maxDeterminizedStates) | Constructs new Automaton from this RegExp. |
Automaton | LevenshteinAutomata.toAutomaton(int n, String prefix) | Compute a DFA that accepts all strings within an edit distance of n, matching the specified exact prefix. |
Automaton | RegExp.toAutomaton(Map<String,Automaton> automata, int maxDeterminizedStates) | Constructs new Automaton from this RegExp. |
static Automaton | Operations.union(Automaton a1, Automaton a2) | Returns an automaton that accepts the union of the languages of the given automata. |
static Automaton | Operations.union(Collection<Automaton> l) | Returns an automaton that accepts the union of the languages of the given automata. |
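Several operations in this table are defined in terms of the languages of their arguments. To show what "accepts the intersection of the languages" means, the sketch below uses the classic product construction on two small deterministic automata: the combined automaton tracks one state of each input and accepts only when both do. ToyDfa is a hypothetical toy type over the symbols a, b, and so on. It is unrelated to the Automaton class, and it builds the full product rather than only the reachable part.

```java
// Hypothetical toy DFA; unrelated to Lucene's Automaton class.
final class ToyDfa {
    final int[][] next;     // next[state][symbol], -1 means "no transition"
    final boolean[] accept; // accept[state]

    ToyDfa(int[][] next, boolean[] accept) {
        this.next = next;
        this.accept = accept;
    }

    /** Runs the DFA from state 0 on a word over the symbols 'a', 'b', ... */
    boolean run(String word) {
        int state = 0;
        for (int i = 0; i < word.length(); i++) {
            int symbol = word.charAt(i) - 'a';
            if (symbol < 0 || symbol >= next[state].length) {
                return false; // symbol outside the alphabet
            }
            state = next[state][symbol];
            if (state < 0) {
                return false; // no transition: reject
            }
        }
        return accept[state];
    }

    /** Product construction; the pair (p, q) is numbered p * nb + q. Assumes equal alphabet sizes. */
    static ToyDfa intersection(ToyDfa a, ToyDfa b) {
        int na = a.accept.length;
        int nb = b.accept.length;
        int symbols = a.next[0].length;
        int[][] next = new int[na * nb][symbols];
        boolean[] accept = new boolean[na * nb];
        for (int p = 0; p < na; p++) {
            for (int q = 0; q < nb; q++) {
                int s = p * nb + q;
                accept[s] = a.accept[p] && b.accept[q]; // both must accept
                for (int c = 0; c < symbols; c++) {
                    int tp = a.next[p][c];
                    int tq = b.next[q][c];
                    next[s][c] = (tp < 0 || tq < 0) ? -1 : tp * nb + tq;
                }
            }
        }
        return new ToyDfa(next, accept);
    }
}
```

Intersecting a DFA for "starts with a" with one for "ends with b" gives a DFA that accepts ab and abab but rejects a and ba.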
Modifier and Type | Method | Description |
---|---|---|
static int | Automata.appendAnyChar(Automaton a, int state) | Accept any single character starting from the specified state, returning the new state. |
static int | Automata.appendChar(Automaton a, int state, int c) | Appends the specified character to the specified state, returning a new state. |
static Automaton | Operations.complement(Automaton a, int maxDeterminizedStates) | Returns a (deterministic) automaton that accepts the complement of the language of the given automaton. |
static Automaton | Operations.concatenate(Automaton a1, Automaton a2) | Returns an automaton that accepts the concatenation of the languages of the given automata. |
Automaton | UTF32ToUTF8.convert(Automaton utf32) | Converts an incoming utf32 automaton to an equivalent utf8 one. |
void | Automaton.copy(Automaton other) | Copies over all states/transitions from other. |
void | Automaton.Builder.copy(Automaton other) | Copies over all states/transitions from other. |
void | Automaton.Builder.copyStates(Automaton other) | Copies over all states from other. |
static Automaton | Operations.determinize(Automaton a, int maxDeterminizedStates) | Determinizes the given automaton. |
static String | Operations.getCommonPrefix(Automaton a) | Returns the longest string that is a prefix of all accepted strings and visits each state at most once. |
static BytesRef | Operations.getCommonPrefixBytesRef(Automaton a) | Returns the longest BytesRef that is a prefix of all accepted strings and visits each state at most once. |
static BytesRef | Operations.getCommonSuffixBytesRef(Automaton a, int maxDeterminizedStates) | Returns the longest BytesRef that is a suffix of all accepted strings. |
static IntsRef | Operations.getSingleton(Automaton a) | If this automaton accepts a single input, return it. |
static boolean | Operations.hasDeadStates(Automaton a) | Returns true if this automaton has any states that cannot be reached from the initial state or cannot reach an accept state. |
static boolean | Operations.hasDeadStatesFromInitial(Automaton a) | Returns true if there are dead states reachable from an initial state. |
static boolean | Operations.hasDeadStatesToAccept(Automaton a) | Returns true if there are dead states that reach an accept state. |
static Automaton | Operations.intersection(Automaton a1, Automaton a2) | Returns an automaton that accepts the intersection of the languages of the given automata. |
static boolean | Operations.isEmpty(Automaton a) | Returns true if the given automaton accepts no strings. |
static boolean | Operations.isFinite(Automaton a) | Returns true if the language of this automaton is finite. |
static boolean | Operations.isTotal(Automaton a) | Returns true if the given automaton accepts all strings. |
static boolean | Operations.isTotal(Automaton a, int minAlphabet, int maxAlphabet) | Returns true if the given automaton accepts all strings for the specified min/max range of the alphabet. |
static Automaton | MinimizationOperations.minimize(Automaton a, int maxDeterminizedStates) | Minimizes (and determinizes if not already deterministic) the given automaton using Hopcroft's algorithm. |
static Automaton | Operations.minus(Automaton a1, Automaton a2, int maxDeterminizedStates) | Returns a (deterministic) automaton that accepts the intersection of the language of a1 and the complement of the language of a2. |
static Automaton | Operations.optional(Automaton a) | Returns an automaton that accepts the union of the empty string and the language of the given automaton. |
static Automaton | Operations.removeDeadStates(Automaton a) | Removes transitions to dead states (a state is "dead" if it is not reachable from the initial state or no accept state is reachable from it). |
static Automaton | Operations.repeat(Automaton a) | Returns an automaton that accepts the Kleene star (zero or more concatenated repetitions) of the language of the given automaton. |
static Automaton | Operations.repeat(Automaton a, int count) | Returns an automaton that accepts count or more concatenated repetitions of the language of the given automaton. |
static Automaton | Operations.repeat(Automaton a, int min, int max) | Returns an automaton that accepts between min and max (including both) concatenated repetitions of the language of the given automaton. |
static Automaton | Operations.reverse(Automaton a) | Returns an automaton accepting the reverse language. |
static boolean | Operations.run(Automaton a, IntsRef s) | Returns true if the given string (expressed as unicode codepoints) is accepted by the automaton. |
static boolean | Operations.run(Automaton a, String s) | Returns true if the given string is accepted by the automaton. |
static boolean | Operations.sameLanguage(Automaton a1, Automaton a2) | Returns true if these two automata accept exactly the same language. |
static boolean | Operations.subsetOf(Automaton a1, Automaton a2) | Returns true if the language of a1 is a subset of the language of a2. |
static int[] | Operations.topoSortStates(Automaton a) | Returns the topological sort of all states reachable from the initial state. |
static Automaton | Operations.union(Automaton a1, Automaton a2) | Returns an automaton that accepts the union of the languages of the given automata. |
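The dead-state methods above rely on one definition: a state is dead if it is not reachable from the initial state, or if no accept state is reachable from it. The sketch below applies that definition directly with two breadth-first searches, one forward from state 0 and one backward from the accept states. DeadStatesSketch and its adjacency-array input are hypothetical; transition labels are ignored because only reachability matters here, and at least one state is assumed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of the "dead state" definition; not Lucene code.
final class DeadStatesSketch {
    /** Returns all states reachable from the start set by following the given edges. */
    static BitSet reachable(List<List<Integer>> edges, BitSet starts) {
        BitSet seen = (BitSet) starts.clone();
        Deque<Integer> queue = new ArrayDeque<>();
        for (int s = starts.nextSetBit(0); s >= 0; s = starts.nextSetBit(s + 1)) {
            queue.add(s);
        }
        while (!queue.isEmpty()) {
            int s = queue.remove();
            for (int t : edges.get(s)) {
                if (!seen.get(t)) {
                    seen.set(t);
                    queue.add(t);
                }
            }
        }
        return seen;
    }

    /** edges[s] lists the targets of state s; state 0 is initial. Assumes at least one state. */
    static BitSet deadStates(int[][] edges, BitSet accept) {
        int n = edges.length;
        List<List<Integer>> forward = new ArrayList<>();
        List<List<Integer>> backward = new ArrayList<>();
        for (int s = 0; s < n; s++) {
            forward.add(new ArrayList<>());
            backward.add(new ArrayList<>());
        }
        for (int s = 0; s < n; s++) {
            for (int t : edges[s]) {
                forward.get(s).add(t);
                backward.get(t).add(s); // reversed edge for the backward search
            }
        }
        BitSet initial = new BitSet(n);
        initial.set(0);
        // Live states are reachable from the initial state AND can reach an accept state.
        BitSet live = reachable(forward, initial);
        live.and(reachable(backward, accept));
        BitSet dead = new BitSet(n);
        dead.set(0, n);
        dead.andNot(live);
        return dead;
    }
}
```

In a four-state graph with edges 0→1, 0→2, 1→1 and 3→1, where only state 1 accepts, state 2 is dead because it cannot reach an accept state, and state 3 is dead because it is unreachable from state 0.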
Modifier and Type | Method | Description |
---|---|---|
static Automaton | Operations.concatenate(List<Automaton> l) | Returns an automaton that accepts the concatenation of the languages of the given automata. |
Automaton | RegExp.toAutomaton(Map<String,Automaton> automata, int maxDeterminizedStates) | Constructs new Automaton from this RegExp. |
static Automaton | Operations.union(Collection<Automaton> l) | Returns an automaton that accepts the union of the languages of the given automata. |
Constructor | Description |
---|---|
ByteRunAutomaton(Automaton a) | Converts incoming automaton to byte-based (UTF32ToUTF8) first. |
ByteRunAutomaton(Automaton a, boolean isBinary, int maxDeterminizedStates) | expert: if isBinary is true, the input is already byte-based. |
CharacterRunAutomaton(Automaton a) | Construct with a default number of maxDeterminizedStates. |
CharacterRunAutomaton(Automaton a, int maxDeterminizedStates) | Construct specifying maxDeterminizedStates. |
CompiledAutomaton(Automaton automaton) | Create this, passing simplify=true and finite=null, so that we try to simplify the automaton and determine if it is finite. |
CompiledAutomaton(Automaton automaton, Boolean finite, boolean simplify) | Create this. |
CompiledAutomaton(Automaton automaton, Boolean finite, boolean simplify, int maxDeterminizedStates, boolean isBinary) | Create this. |
FiniteStringsIterator(Automaton a) | Constructor. |
FiniteStringsIterator(Automaton a, int startState, int endState) | Constructor. |
LimitedFiniteStringsIterator(Automaton a, int limit) | Constructor. |
RunAutomaton(Automaton a, int alphabetSize) | Constructs a new RunAutomaton from a deterministic Automaton. |
RunAutomaton(Automaton a, int alphabetSize, int maxDeterminizedStates) | Constructs a new RunAutomaton from a deterministic Automaton. |
TooComplexToDeterminizeException(Automaton automaton, int maxDeterminizedStates) | Use this constructor when the automaton failed to determinize. |
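The iterator classes listed above enumerate the strings an automaton accepts. As a conceptual sketch only, the code below walks a small deterministic automaton depth-first and collects every accepted string. It assumes the automaton has no cycles, so its language is finite; with a cycle, the recursion would not terminate. FiniteStringsSketch and its transition-table input are hypothetical, and symbol i stands for the character 'a' + i.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; this is not FiniteStringsIterator.
final class FiniteStringsSketch {
    /** next[state][symbol] is the target state or -1; state 0 is the initial state. */
    static List<String> enumerate(int[][] next, boolean[] accept) {
        List<String> out = new ArrayList<>();
        walk(next, accept, 0, new StringBuilder(), out);
        return out;
    }

    private static void walk(int[][] next, boolean[] accept, int state,
                             StringBuilder path, List<String> out) {
        if (accept[state]) {
            out.add(path.toString()); // the path so far spells an accepted string
        }
        for (int c = 0; c < next[state].length; c++) {
            int target = next[state][c];
            if (target < 0) {
                continue; // no transition on this symbol
            }
            path.append((char) ('a' + c));
            walk(next, accept, target, path, out);
            path.setLength(path.length() - 1); // backtrack
        }
    }
}
```

For a three-state automaton with transitions 0 -a-> 1, 0 -b-> 2 and 1 -b-> 2, where states 1 and 2 accept, the sketch returns a, ab and b in that order.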
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.