Class RegexpQuery

All Implemented Interfaces:
Accountable

public class RegexpQuery extends AutomatonQuery
A fast regular expression query based on the org.apache.lucene.util.automaton package.
  • Comparisons are fast
  • The term dictionary is enumerated in an intelligent way, to avoid comparisons. See AutomatonQuery for more details.

The supported syntax is documented in the RegExp class. Note that this syntax may differ from that of other regular expression implementations. For alternatives with different syntax, look under the sandbox.

Note that this query can be slow, as it may need to iterate over many terms. To prevent extremely slow RegexpQueries, a regexp term should not start with the expression .*
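As an illustration of the behavior described above, the following minimal sketch builds a small in-memory index and counts the documents matched by a RegexpQuery created with the single-argument constructor. The class name RegexpQueryBasics, the helper countMatches, and the field name "name" are hypothetical and chosen only for this example; it also assumes a Lucene version that ships ByteBuffersDirectory and the no-argument IndexWriterConfig constructor.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

class RegexpQueryBasics {
  // Indexes each value as one untokenized term in the hypothetical "name" field,
  // then counts the documents whose term matches the regular expression.
  static int countMatches(String regexp, String... values) {
    try (Directory dir = new ByteBuffersDirectory()) {
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        for (String value : values) {
          Document doc = new Document();
          doc.add(new StringField("name", value, Field.Store.NO));
          writer.addDocument(doc);
        }
      }
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        IndexSearcher searcher = new IndexSearcher(reader);
        // The regular expression must match the whole term, not a substring of it.
        return searcher.count(new RegexpQuery(new Term("name", regexp)));
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // A pattern with a literal prefix, as recommended above.
    System.out.println(countMatches("luc.*", "lucene", "lucid", "solr"));
    // A pattern starting with .* still works but is the slow case to avoid.
    System.out.println(countMatches(".*ne", "lucene", "solr"));
  }
}
```

On an index this small the difference is not measurable; the sketch only shows that both forms are accepted and that matching is against the entire term.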

See Also:
RegExp
WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Field Details

    • DEFAULT_PROVIDER

      public static final AutomatonProvider DEFAULT_PROVIDER
      A provider that provides no named automata
  • Constructor Details

    • RegexpQuery

      public RegexpQuery(Term term)
      Constructs a query for terms matching term.

      By default, all regular expression features are enabled.

      Parameters:
      term - regular expression.
    • RegexpQuery

      public RegexpQuery(Term term, int flags)
      Constructs a query for terms matching term.
      Parameters:
      term - regular expression.
      flags - optional RegExp syntax features from RegExp
    • RegexpQuery

      public RegexpQuery(Term term, int flags, int determinizeWorkLimit)
      Constructs a query for terms matching term.
      Parameters:
      term - regular expression.
      flags - optional RegExp syntax features from RegExp
      determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
    • RegexpQuery

      public RegexpQuery(Term term, int syntax_flags, int match_flags, int determinizeWorkLimit)
      Constructs a query for terms matching term.
      Parameters:
      term - regular expression.
      syntax_flags - optional RegExp syntax features from RegExp
      match_flags - boolean 'or' of match behavior options such as case insensitivity
      determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
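      A minimal sketch of this constructor, assuming that RegExp.ALL and RegExp.ASCII_CASE_INSENSITIVE are available in the Lucene version in use. The class name RegexpQueryMatchFlags, the helper countMatches, and the field name "name" are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.automaton.Operations;
import org.apache.lucene.util.automaton.RegExp;

class RegexpQueryMatchFlags {
  // Counts documents in a throwaway in-memory index whose "name" term matches
  // the regexp under the given match flags (0 means no special match behavior).
  static int countMatches(String regexp, int matchFlags, String... values) {
    try (Directory dir = new ByteBuffersDirectory()) {
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        for (String value : values) {
          Document doc = new Document();
          doc.add(new StringField("name", value, Field.Store.NO));
          writer.addDocument(doc);
        }
      }
      RegexpQuery query =
          new RegexpQuery(
              new Term("name", regexp),
              RegExp.ALL, // syntax_flags: enable all optional syntax features
              matchFlags, // match_flags: for example case insensitivity
              Operations.DEFAULT_DETERMINIZE_WORK_LIMIT); // default compile budget
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        return new IndexSearcher(reader).count(query);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Terms are indexed with their original case, so only the
    // case-insensitive variant is expected to match "Lucene" and "LUCID".
    System.out.println(
        countMatches("luc.*", RegExp.ASCII_CASE_INSENSITIVE, "Lucene", "LUCID", "solr"));
    System.out.println(countMatches("luc.*", 0, "Lucene", "LUCID", "solr"));
  }
}
```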
    • RegexpQuery

      public RegexpQuery(Term term, int syntax_flags, AutomatonProvider provider, int determinizeWorkLimit)
      Constructs a query for terms matching term.
      Parameters:
      term - regular expression.
      syntax_flags - optional RegExp syntax features from RegExp
      provider - custom AutomatonProvider for named automata
      determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
    • RegexpQuery

      public RegexpQuery(Term term, int syntax_flags, int match_flags, AutomatonProvider provider, int determinizeWorkLimit, MultiTermQuery.RewriteMethod rewriteMethod)
      Constructs a query for terms matching term.
      Parameters:
      term - regular expression.
      syntax_flags - optional RegExp syntax features from RegExp
      match_flags - boolean 'or' of match behavior options such as case insensitivity
      provider - custom AutomatonProvider for named automata
      determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
      rewriteMethod - the rewrite method to use to build the final query
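      The following sketch shows one way the full constructor might be used with a custom AutomatonProvider and an explicit rewrite method. It assumes that named automata are referenced in the pattern as <name> (as described in the RegExp syntax), that AutomatonProvider can be written as a lambda, and that MultiTermQuery.CONSTANT_SCORE_REWRITE is available. The class name RegexpQueryWithProvider, the automaton name "greeting", and the field name "name" are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.automaton.Automata;
import org.apache.lucene.util.automaton.AutomatonProvider;
import org.apache.lucene.util.automaton.Operations;
import org.apache.lucene.util.automaton.RegExp;

class RegexpQueryWithProvider {
  // Builds a query whose pattern refers to a named automaton supplied by the provider.
  static RegexpQuery buildQuery() {
    // Hypothetical provider: resolves only the name "greeting" to the literal "hello".
    AutomatonProvider provider =
        name -> "greeting".equals(name) ? Automata.makeString("hello") : null;
    return new RegexpQuery(
        new Term("name", "<greeting>[0-9]+"), // named automaton followed by digits
        RegExp.ALL, // syntax_flags: includes the named-automaton syntax
        0, // match_flags: none
        provider,
        Operations.DEFAULT_DETERMINIZE_WORK_LIMIT,
        MultiTermQuery.CONSTANT_SCORE_REWRITE);
  }

  // Counts matching documents in a throwaway in-memory index.
  static int countMatches(String... values) {
    try (Directory dir = new ByteBuffersDirectory()) {
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        for (String value : values) {
          Document doc = new Document();
          doc.add(new StringField("name", value, Field.Store.NO));
          writer.addDocument(doc);
        }
      }
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        return new IndexSearcher(reader).count(buildQuery());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countMatches("hello1", "hello42", "goodbye1"));
  }
}
```

      A name the provider cannot resolve is expected to make query construction fail, so the provider should cover every named automaton the pattern uses.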
  • Method Details

    • getRegexp

      public Term getRegexp()
      Returns the regexp of this query wrapped in a Term.
    • toString

      public String toString(String field)
      Prints a user-readable version of this query.
      Overrides:
      toString in class AutomatonQuery
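
      A short sketch of the two methods above. The class name RegexpQueryAccessors and the helper describe are hypothetical. Following the usual Query.toString(String) convention, the field name is expected to be omitted from the printed form when it equals the field passed in.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.RegexpQuery;

class RegexpQueryAccessors {
  // Returns the wrapped term's field and text, followed by the printed query.
  static String describe(String field, String regexp, String defaultField) {
    RegexpQuery query = new RegexpQuery(new Term(field, regexp));
    Term wrapped = query.getRegexp(); // the same field and pattern passed to the constructor
    return wrapped.field() + " | " + wrapped.text() + " | " + query.toString(defaultField);
  }

  public static void main(String[] args) {
    // Printed relative to the query's own field, then relative to a different field.
    System.out.println(describe("title", "luc.*", "title"));
    System.out.println(describe("title", "luc.*", "body"));
  }
}
```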