public class RegExp extends Object
Regular expression extension to Automaton.
 Regular expressions are built from the following abstract syntax:
| Nonterminal | | Production | Meaning | Flag |
|---|---|---|---|---|
| regexp | ::= | unionexp | | |
| | \| | | | |
| unionexp | ::= | interexp \| unionexp | (union) | |
| | \| | interexp | | |
| interexp | ::= | concatexp & interexp | (intersection) | [OPTIONAL] |
| | \| | concatexp | | |
| concatexp | ::= | repeatexp concatexp | (concatenation) | |
| | \| | repeatexp | | |
| repeatexp | ::= | repeatexp ? | (zero or one occurrence) | |
| | \| | repeatexp * | (zero or more occurrences) | |
| | \| | repeatexp + | (one or more occurrences) | |
| | \| | repeatexp {n} | (n occurrences) | |
| | \| | repeatexp {n,} | (n or more occurrences) | |
| | \| | repeatexp {n,m} | (n to m occurrences, including both) | |
| | \| | complexp | | |
| complexp | ::= | ~ complexp | (complement) | [OPTIONAL] |
| | \| | charclassexp | | |
| charclassexp | ::= | [ charclasses ] | (character class) | |
| | \| | [^ charclasses ] | (negated character class) | |
| | \| | simpleexp | | |
| charclasses | ::= | charclass charclasses | | |
| | \| | charclass | | |
| charclass | ::= | charexp - charexp | (character range, including end-points) | |
| | \| | charexp | | |
| simpleexp | ::= | charexp | | |
| | \| | . | (any single character) | |
| | \| | # | (the empty language) | [OPTIONAL] |
| | \| | @ | (any string) | [OPTIONAL] |
| | \| | " <Unicode string without double-quotes> " | (a string) | |
| | \| | ( ) | (the empty string) | |
| | \| | ( unionexp ) | (precedence override) | |
| | \| | < <identifier> > | (named automaton) | [OPTIONAL] |
| | \| | <n-m> | (numerical interval) | [OPTIONAL] |
| charexp | ::= | <Unicode character> | (a single non-reserved character) | |
| | \| | \ <Unicode character> | (a single character) | |
 The productions marked [OPTIONAL] are only allowed if
 specified by the syntax flags passed to the RegExp constructor.
 The reserved characters used in the (enabled) syntax must be escaped with
 backslash (\) or double-quotes ("..."). (In
 contrast to other regexp syntaxes, this is required also in character
 classes.) Be aware that dash (-) has a special meaning in
 charclass expressions. An identifier is a string not containing right
 angle bracket (>) or dash (-). Numerical
 intervals are specified by non-negative decimal integers and include both end
 points, and if n and m have the same number
 of digits, then the conforming strings must have that length (i.e. prefixed
 by 0's).
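The numerical interval rule can be checked against a small stand-alone model. The sketch below is not Lucene's implementation: the class `IntervalSketch` and the method `conforms` are hypothetical names, and the handling of the case where n and m differ in length (leading zeros tolerated) is an assumption, since the text above only fixes the equal-length case.

```java
import java.math.BigInteger;

/** Illustrative model of the <n-m> rule; not Lucene's implementation. */
class IntervalSketch {

    /**
     * Returns true if candidate is a decimal string whose value lies in [n, m].
     * If n and m have the same number of digits, candidate must have that length too.
     * Assumption: when the lengths differ, leading zeros in candidate are tolerated.
     */
    static boolean conforms(String n, String m, String candidate) {
        // Only non-empty strings of ASCII digits can conform.
        if (candidate.isEmpty() || !candidate.chars().allMatch(c -> c >= '0' && c <= '9')) {
            return false;
        }
        BigInteger lo = new BigInteger(n);
        BigInteger hi = new BigInteger(m);
        BigInteger value = new BigInteger(candidate);
        boolean inRange = value.compareTo(lo) >= 0 && value.compareTo(hi) <= 0;
        if (n.length() == m.length()) {
            // Fixed width: shorter values must be prefixed by 0's.
            return inRange && candidate.length() == n.length();
        }
        return inRange;
    }

    public static void main(String[] args) {
        System.out.println(conforms("01", "20", "07")); // true
        System.out.println(conforms("01", "20", "7"));  // false
        System.out.println(conforms("1", "20", "7"));   // true
    }
}
```

Under this model, `<01-20>` accepts `07` but not `7`, while `<1-20>` accepts `7`.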
| Modifier and Type | Class and Description | 
|---|---|
| static class | RegExp.Kind: The type of expression represented by a RegExp node. |
| Modifier and Type | Field and Description | 
|---|---|
| static int | ALL: Syntax flag, enables all optional regexp syntax. |
| static int | ANYSTRING: Syntax flag, enables anystring (@). |
| static int | ASCII_CASE_INSENSITIVE: Allows case insensitive matching of ASCII characters. |
| static int | AUTOMATON: Syntax flag, enables named automata (<identifier>). |
| int | c: Character expression |
| static int | COMPLEMENT: Syntax flag, enables complement (~). |
| int | digits: Limits for repeatable type expressions |
| static int | EMPTY: Syntax flag, enables empty language (#). |
| RegExp | exp1: Child expressions held by a container type expression |
| RegExp | exp2: Child expressions held by a container type expression |
| int | from: Extents for range type expressions |
| static int | INTERSECTION: Syntax flag, enables intersection (&). |
| static int | INTERVAL: Syntax flag, enables numerical intervals (<n-m>). |
| RegExp.Kind | kind: The type of expression |
| int | max: Limits for repeatable type expressions |
| int | min: Limits for repeatable type expressions |
| static int | NONE: Syntax flag, enables no optional regexp syntax. |
| String | s: String expression |
| int | to: Extents for range type expressions |
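The syntax flags are plain int constants that are OR-ed together to enable several optional constructs at once. The sketch below models that pattern in isolation. The bit values are placeholders chosen for illustration (assuming one distinct bit per construct), not values read from the RegExp class, and `SyntaxFlagsSketch` and `enabled` are hypothetical names.

```java
/** Illustrative model of combining syntax flags; the bit values are placeholders. */
class SyntaxFlagsSketch {
    // Assumption: each optional construct owns one distinct bit. These values are
    // stand-ins for illustration, not the values defined by RegExp.
    static final int NONE         = 0;
    static final int INTERSECTION = 1 << 0;
    static final int COMPLEMENT   = 1 << 1;
    static final int EMPTY        = 1 << 2;
    static final int ANYSTRING    = 1 << 3;
    static final int AUTOMATON    = 1 << 4;
    static final int INTERVAL     = 1 << 5;

    /** Returns true if the given construct is switched on in syntaxFlags. */
    static boolean enabled(int syntaxFlags, int construct) {
        return (syntaxFlags & construct) != 0;
    }

    public static void main(String[] args) {
        // Enable intersection (&) and numerical intervals (<n-m>) only.
        int flags = INTERSECTION | INTERVAL;
        System.out.println(enabled(flags, INTERVAL));    // true
        System.out.println(enabled(flags, COMPLEMENT));  // false
        System.out.println(enabled(NONE, INTERSECTION)); // false
    }
}
```

In real code the combined value would be passed as the syntax_flags argument of the RegExp constructor, using the constants listed in the table above.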
| Constructor and Description | 
|---|
| RegExp(String s): Constructs new RegExp from a string. |
| RegExp(String s, int syntax_flags): Constructs new RegExp from a string. |
| RegExp(String s, int syntax_flags, int match_flags): Constructs new RegExp from a string. |
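Because reserved characters must be escaped before literal text is placed in a regexp string, callers often need a small escaping helper. The sketch below is one conservative way to do that, relying only on the grammar rule that a backslash followed by a character denotes that single character. `EscapeSketch` and `escapeLiteral` are hypothetical names and not part of RegExp. Escaping every character that is not a letter or digit is a simplifying assumption that may escape more than strictly necessary.

```java
/** Illustrative helper; escapeLiteral is a hypothetical name, not part of RegExp. */
class EscapeSketch {

    /**
     * Prefixes every code point that is not a letter or digit with a backslash,
     * so that no reserved character (including dash inside character classes)
     * is left unescaped.
     */
    static String escapeLiteral(String literal) {
        StringBuilder out = new StringBuilder();
        literal.codePoints().forEach(cp -> {
            if (!Character.isLetterOrDigit(cp)) {
                out.append('\\');
            }
            out.appendCodePoint(cp);
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeLiteral("a.b-c")); // a\.b\-c
    }
}
```

The escaped result can then be concatenated with operator syntax and passed to one of the constructors above.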
| Modifier and Type | Method and Description | 
|---|---|
| Set<String> | getIdentifiers(): Returns set of automaton identifiers that occur in this regular expression. |
| String | getOriginalString(): The string that was used to construct the regex. |
| Automaton | toAutomaton(): Constructs new Automaton from this RegExp. |
| Automaton | toAutomaton(AutomatonProvider automaton_provider, int maxDeterminizedStates): Constructs new Automaton from this RegExp. |
| Automaton | toAutomaton(int maxDeterminizedStates): Constructs new Automaton from this RegExp. |
| Automaton | toAutomaton(Map<String,Automaton> automata, int maxDeterminizedStates): Constructs new Automaton from this RegExp. |
| String | toString(): Constructs string from parsed regular expression. |
| String | toStringTree(): Like toString(), but more verbose (shows the hierarchy more clearly). |
public static final int INTERSECTION
public static final int COMPLEMENT
public static final int EMPTY
public static final int ANYSTRING
public static final int AUTOMATON
public static final int INTERVAL
public static final int ALL
public static final int NONE
public static final int ASCII_CASE_INSENSITIVE
public final RegExp.Kind kind
public final RegExp exp1
public final RegExp exp2
public final String s
public final int c
public final int min
public final int max
public final int digits
public final int from
public final int to
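As a rough illustration of the limits that a repeat expression carries, the sketch below maps each repetition suffix from the grammar to a (min, max) pair. It is a stand-alone model and not how RegExp parses or stores these values: `RepeatBoundsSketch` and `parseBounds` are hypothetical names, and using -1 for "no upper bound" is an assumption of this sketch.

```java
/** Illustrative mapping from repetition suffixes to bounds; not RegExp's parser. */
class RepeatBoundsSketch {

    /**
     * Returns {min, max} for one of ?, *, +, {n}, {n,} or {n,m}.
     * Assumption: max is -1 when the repetition has no upper bound.
     */
    static int[] parseBounds(String quantifier) {
        switch (quantifier) {
            case "?": return new int[] {0, 1};   // zero or one occurrence
            case "*": return new int[] {0, -1};  // zero or more occurrences
            case "+": return new int[] {1, -1};  // one or more occurrences
            default: break;
        }
        if (!quantifier.startsWith("{") || !quantifier.endsWith("}")) {
            throw new IllegalArgumentException("not a repetition suffix: " + quantifier);
        }
        String body = quantifier.substring(1, quantifier.length() - 1);
        int comma = body.indexOf(',');
        if (comma < 0) {
            int n = Integer.parseInt(body);      // {n}: exactly n occurrences
            return new int[] {n, n};
        }
        int min = Integer.parseInt(body.substring(0, comma));
        String rest = body.substring(comma + 1);
        int max = rest.isEmpty() ? -1 : Integer.parseInt(rest); // {n,} or {n,m}
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        int[] b = parseBounds("{2,5}");
        System.out.println(b[0] + ".." + b[1]); // 2..5
    }
}
```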
public RegExp(String s) throws IllegalArgumentException
Constructs new RegExp from a string. Same as RegExp(s, ALL).
Parameters:
s - regexp string
Throws:
IllegalArgumentException - if an error occurred while parsing the regular expression
public RegExp(String s, int syntax_flags) throws IllegalArgumentException
Constructs new RegExp from a string.
Parameters:
s - regexp string
syntax_flags - bitwise 'or' of optional syntax constructs to be enabled
Throws:
IllegalArgumentException - if an error occurred while parsing the regular expression
public RegExp(String s, int syntax_flags, int match_flags) throws IllegalArgumentException
Constructs new RegExp from a string.
Parameters:
s - regexp string
syntax_flags - bitwise 'or' of optional syntax constructs to be enabled
match_flags - bitwise 'or' of match behavior options such as case insensitivity
Throws:
IllegalArgumentException - if an error occurred while parsing the regular expression
public Automaton toAutomaton()
Constructs new Automaton from this RegExp. Same as toAutomaton(null) (empty automaton map).
public Automaton toAutomaton(int maxDeterminizedStates) throws IllegalArgumentException, TooComplexToDeterminizeException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
maxDeterminizedStates - maximum number of states in the resulting automaton. If the automaton would need more than this many states, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex regexes.
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that is not available from the automaton provider
TooComplexToDeterminizeException - if determinizing this regexp requires more than maxDeterminizedStates states
public Automaton toAutomaton(AutomatonProvider automaton_provider, int maxDeterminizedStates) throws IllegalArgumentException, TooComplexToDeterminizeException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
automaton_provider - provider of automata for named identifiers
maxDeterminizedStates - maximum number of states in the resulting automaton. If the automaton would need more than this many states, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex regexes.
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that is not available from the automaton provider
TooComplexToDeterminizeException - if determinizing this regexp requires more than maxDeterminizedStates states
public Automaton toAutomaton(Map<String,Automaton> automata, int maxDeterminizedStates) throws IllegalArgumentException, TooComplexToDeterminizeException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
automata - a map from automaton identifiers to automata (of type Automaton)
maxDeterminizedStates - maximum number of states in the resulting automaton. If the automaton would need more than this many states, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex regexes.
Throws:
IllegalArgumentException - if this regular expression uses a named identifier that does not occur in the automaton map
TooComplexToDeterminizeException - if determinizing this regexp requires more than maxDeterminizedStates states
public String getOriginalString()
public String toString()
public String toStringTree()
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.