public final class Operations extends Object
Modifier and Type | Field and Description |
---|---|
static int |
DEFAULT_DETERMINIZE_WORK_LIMIT
Default maximum effort that
determinize(org.apache.lucene.util.automaton.Automaton, int) should spend before giving up and
throwing TooComplexToDeterminizeException . |
static int |
MAX_RECURSION_LEVEL
Maximum level of recursion allowed in recursive operations.
|
Modifier and Type | Method and Description |
---|---|
static Automaton |
complement(Automaton a,
int determinizeWorkLimit)
Returns a (deterministic) automaton that accepts the complement of the
language of the given automaton.
|
static Automaton |
concatenate(Automaton a1,
Automaton a2)
Returns an automaton that accepts the concatenation of the languages of the
given automata.
|
static Automaton |
concatenate(List<Automaton> l)
Returns an automaton that accepts the concatenation of the languages of the
given automata.
|
static Automaton |
determinize(Automaton a,
int workLimit)
Determinizes the given automaton.
|
static String |
getCommonPrefix(Automaton a)
Returns the longest string that is a prefix of all accepted strings and
visits each state at most once.
|
static BytesRef |
getCommonPrefixBytesRef(Automaton a)
Returns the longest BytesRef that is a prefix of all accepted strings and
visits each state at most once.
|
static BytesRef |
getCommonSuffixBytesRef(Automaton a)
Returns the longest BytesRef that is a suffix of all accepted strings.
|
static IntsRef |
getSingleton(Automaton a)
If this automaton accepts a single input, return it.
|
static boolean |
hasDeadStates(Automaton a)
Returns true if this automaton has any states that cannot
be reached from the initial state or cannot reach an accept state.
|
static boolean |
hasDeadStatesFromInitial(Automaton a)
Returns true if there are dead states reachable from an initial state.
|
static boolean |
hasDeadStatesToAccept(Automaton a)
Returns true if there are dead states that reach an accept state.
|
static Automaton |
intersection(Automaton a1,
Automaton a2)
Returns an automaton that accepts the intersection of the languages of the
given automata.
|
static boolean |
isEmpty(Automaton a)
Returns true if the given automaton accepts no strings.
|
static boolean |
isFinite(Automaton a)
Returns true if the language of this automaton is finite.
|
static boolean |
isTotal(Automaton a)
Returns true if the given automaton accepts all strings.
|
static boolean |
isTotal(Automaton a,
int minAlphabet,
int maxAlphabet)
Returns true if the given automaton accepts all strings for the specified min/max
range of the alphabet.
|
static Automaton |
minus(Automaton a1,
Automaton a2,
int determinizeWorkLimit)
Returns a (deterministic) automaton that accepts the intersection of the
language of
a1 and the complement of the language of
a2 . |
static Automaton |
optional(Automaton a)
Returns an automaton that accepts the union of the empty string and the
language of the given automaton.
|
static Automaton |
removeDeadStates(Automaton a)
Removes transitions to dead states (a state is "dead" if it is not
reachable from the initial state or no accept state is reachable from it.)
|
static Automaton |
repeat(Automaton a)
Returns an automaton that accepts the Kleene star (zero or more
concatenated repetitions) of the language of the given automaton.
|
static Automaton |
repeat(Automaton a,
int count)
Returns an automaton that accepts
min or more concatenated
repetitions of the language of the given automaton. |
static Automaton |
repeat(Automaton a,
int min,
int max)
Returns an automaton that accepts between
min and
max (including both) concatenated repetitions of the language
of the given automaton. |
static Automaton |
reverse(Automaton a)
Returns an automaton accepting the reverse language.
|
static boolean |
run(Automaton a,
IntsRef s)
Returns true if the given string (expressed as unicode codepoints) is accepted by the automaton.
|
static boolean |
run(Automaton a,
String s)
Returns true if the given string is accepted by the automaton.
|
static boolean |
sameLanguage(Automaton a1,
Automaton a2)
Returns true if these two automata accept exactly the
same language.
|
static boolean |
subsetOf(Automaton a1,
Automaton a2)
Returns true if the language of
a1 is a subset of the language
of a2 . |
static int[] |
topoSortStates(Automaton a)
Returns the topological sort of all states reachable from
the initial state.
|
static Automaton |
union(Automaton a1,
Automaton a2)
Returns an automaton that accepts the union of the languages of the given
automata.
|
static Automaton |
union(Collection<Automaton> l)
Returns an automaton that accepts the union of the languages of the given
automata.
|
public static final int DEFAULT_DETERMINIZE_WORK_LIMIT
determinize(org.apache.lucene.util.automaton.Automaton, int)
should spend before giving up and
throwing TooComplexToDeterminizeException
.public static final int MAX_RECURSION_LEVEL
public static Automaton concatenate(Automaton a1, Automaton a2)
Complexity: linear in total number of states.
public static Automaton concatenate(List<Automaton> l)
Complexity: linear in total number of states.
public static Automaton optional(Automaton a)
Complexity: linear in number of states.
public static Automaton repeat(Automaton a)
Complexity: linear in number of states.
public static Automaton repeat(Automaton a, int count)
min
or more concatenated
repetitions of the language of the given automaton.
Complexity: linear in number of states and in min
.
public static Automaton repeat(Automaton a, int min, int max)
min
and
max
(including both) concatenated repetitions of the language
of the given automaton.
Complexity: linear in number of states and in min
and
max
.
public static Automaton complement(Automaton a, int determinizeWorkLimit)
Complexity: linear in number of states if already deterministic and exponential otherwise.
determinizeWorkLimit
- maximum effort to spend determinizing the automaton. Set higher to
allow more complex queries and lower to prevent memory exhaustion. DEFAULT_DETERMINIZE_WORK_LIMIT
is a good starting default.public static Automaton minus(Automaton a1, Automaton a2, int determinizeWorkLimit)
a1
and the complement of the language of
a2
. As a side-effect, the automata may be determinized, if not
already deterministic.
Complexity: quadratic in number of states if a2 already deterministic and exponential in number of a2's states otherwise.
a1
- the initial automatona2
- the automaton to subtractdeterminizeWorkLimit
- maximum effort to spend determinizing the automaton. Set higher to
allow more complex queries and lower to prevent memory exhaustion. DEFAULT_DETERMINIZE_WORK_LIMIT
is a good starting default.public static Automaton intersection(Automaton a1, Automaton a2)
Complexity: quadratic in number of states.
public static boolean sameLanguage(Automaton a1, Automaton a2)
public static boolean hasDeadStates(Automaton a)
public static boolean hasDeadStatesFromInitial(Automaton a)
public static boolean hasDeadStatesToAccept(Automaton a)
public static boolean subsetOf(Automaton a1, Automaton a2)
a1
is a subset of the language
of a2
. Both automata must be determinized and must have no dead
states.
Complexity: quadratic in number of states.
public static Automaton union(Automaton a1, Automaton a2)
Complexity: linear in number of states.
public static Automaton union(Collection<Automaton> l)
Complexity: linear in number of states.
public static Automaton determinize(Automaton a, int workLimit)
Worst case complexity: exponential in number of states.
workLimit
- Maximum amount of "work" that the powerset construction will spend before
throwing TooComplexToDeterminizeException
. Higher numbers allow this operation to
consume more memory and CPU but allow more complex automatons. Use DEFAULT_DETERMINIZE_WORK_LIMIT
as a decent default if you don't otherwise know what to
specify.TooComplexToDeterminizeException
- if determinizing requires more than workLimit
"effort"public static boolean isEmpty(Automaton a)
public static boolean isTotal(Automaton a)
public static boolean isTotal(Automaton a, int minAlphabet, int maxAlphabet)
public static boolean run(Automaton a, String s)
Complexity: linear in the length of the string.
Note: for full performance, use the RunAutomaton
class.
public static boolean run(Automaton a, IntsRef s)
Complexity: linear in the length of the string.
Note: for full performance, use the RunAutomaton
class.
public static Automaton removeDeadStates(Automaton a)
public static boolean isFinite(Automaton a)
public static String getCommonPrefix(Automaton a)
UTF32ToUTF8
) then you should use getCommonPrefixBytesRef(org.apache.lucene.util.automaton.Automaton)
instead.IllegalArgumentException
- if the automaton has dead states reachable from the initial
state.public static BytesRef getCommonPrefixBytesRef(Automaton a)
public static IntsRef getSingleton(Automaton a)
public static BytesRef getCommonSuffixBytesRef(Automaton a)
public static Automaton reverse(Automaton a)
public static int[] topoSortStates(Automaton a)
Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.