Class RegexpQuery

  • All Implemented Interfaces:
    Accountable

    public class RegexpQuery
    extends AutomatonQuery
    A fast regular expression query based on the org.apache.lucene.util.automaton package.
    • Comparisons are fast
    • The term dictionary is enumerated in an intelligent way that skips terms which cannot match, avoiding unnecessary comparisons. See AutomatonQuery for more details.

    The supported syntax is documented in the RegExp class. Note that this syntax may differ from that of other regular expression implementations. For alternatives with different syntax, look under the sandbox.

    Note that this query can be slow, because it may need to iterate over many terms. To avoid extremely slow RegexpQueries, a regexp term should not start with the expression .* because a leading .* leaves no fixed prefix with which to narrow the term enumeration.
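    As a rough illustration of the description above, the following sketch indexes a few untokenized values in memory and counts the documents whose term matches a regexp with a literal prefix. The class name RegexpQueryBasics, the helper countMatches, and the field name "name" are hypothetical, chosen only for this example; it assumes a Lucene version that provides ByteBuffersDirectory.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical demo class; not part of Lucene.
class RegexpQueryBasics {

    // Indexes each value as a single untokenized term in the (hypothetical)
    // field "name", then counts the documents matching the regexp.
    static int countMatches(String regexp, String... values) throws Exception {
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
                for (String value : values) {
                    Document doc = new Document();
                    doc.add(new StringField("name", value, Field.Store.NO));
                    writer.addDocument(doc);
                }
            } // closing the writer commits the documents

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                // The regexp must match the whole term, not a substring of it.
                RegexpQuery query = new RegexpQuery(new Term("name", regexp));
                return searcher.count(query);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // "lucene" and "lucerne" match; "solr" does not.
        System.out.println(countMatches("luc[a-z]+e", "lucene", "lucerne", "solr"));
    }
}
```

    The pattern begins with the literal prefix "luc" rather than .*, which is the shape the note above recommends.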

    See Also:
    RegExp
    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • DEFAULT_PROVIDER

        public static final AutomatonProvider DEFAULT_PROVIDER
        A provider that provides no named automata
    • Constructor Detail

      • RegexpQuery

        public RegexpQuery​(Term term)
        Constructs a query for terms matching term.

        By default, all regular expression features are enabled.

        Parameters:
        term - regular expression.
      • RegexpQuery

        public RegexpQuery​(Term term,
                           int flags)
        Constructs a query for terms matching term.
        Parameters:
        term - regular expression.
        flags - optional RegExp features from RegExp
      • RegexpQuery

        public RegexpQuery​(Term term,
                           int flags,
                           int determinizeWorkLimit)
        Constructs a query for terms matching term.
        Parameters:
        term - regular expression.
        flags - optional RegExp syntax features from RegExp
        determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
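        A minimal sketch of this three-argument constructor, assuming the caller wants only the core syntax (RegExp.NONE disables the optional operators) and has no specific work limit in mind. The class name RegexpQueryLimits and the helper build are hypothetical.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.util.automaton.Operations;
import org.apache.lucene.util.automaton.RegExp;

// Hypothetical demo class; not part of Lucene.
class RegexpQueryLimits {

    // Builds a query with the optional RegExp syntax features disabled
    // and the library's default determinize work limit.
    static RegexpQuery build(String field, String regexp) {
        return new RegexpQuery(
                new Term(field, regexp),
                RegExp.NONE,
                Operations.DEFAULT_DETERMINIZE_WORK_LIMIT);
    }

    public static void main(String[] args) {
        RegexpQuery query = build("id", "ab[0-9]{2,4}");
        // getRegexp() returns the original expression wrapped in a Term.
        System.out.println(query.getRegexp().text());
    }
}
```

        Lowering the limit below the default is one way to bound the cost of regular expressions supplied by untrusted users.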
      • RegexpQuery

        public RegexpQuery​(Term term,
                           int syntax_flags,
                           int match_flags,
                           int determinizeWorkLimit)
        Constructs a query for terms matching term.
        Parameters:
        term - regular expression.
        syntax_flags - optional RegExp syntax features from RegExp
        match_flags - bitwise 'or' of match behavior options such as case insensitivity
        determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
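        To illustrate match_flags, the sketch below compares the same regexp with and without RegExp.ASCII_CASE_INSENSITIVE against a tiny in-memory index. It assumes a Lucene version in which that constant is available; the class name CaseInsensitiveRegexp, the helper count, and the field name "name" are hypothetical.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.automaton.Operations;
import org.apache.lucene.util.automaton.RegExp;

// Hypothetical demo class; not part of Lucene.
class CaseInsensitiveRegexp {

    // Counts documents whose "name" term matches the regexp under the
    // given match flags (0 means no special match behavior).
    static int count(String regexp, int matchFlags, String... values) throws Exception {
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
                for (String value : values) {
                    Document doc = new Document();
                    // StringField is not analyzed, so the original case is kept.
                    doc.add(new StringField("name", value, Field.Store.NO));
                    writer.addDocument(doc);
                }
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                RegexpQuery query = new RegexpQuery(
                        new Term("name", regexp),
                        RegExp.ALL,
                        matchFlags,
                        Operations.DEFAULT_DETERMINIZE_WORK_LIMIT);
                return new IndexSearcher(reader).count(query);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String[] values = {"lucene", "Lucene", "LUCENE"};
        System.out.println(count("lucene", 0, values));
        System.out.println(count("lucene", RegExp.ASCII_CASE_INSENSITIVE, values));
    }
}
```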
      • RegexpQuery

        public RegexpQuery​(Term term,
                           int syntax_flags,
                           AutomatonProvider provider,
                           int determinizeWorkLimit)
        Constructs a query for terms matching term.
        Parameters:
        term - regular expression.
        syntax_flags - optional RegExp features from RegExp
        provider - custom AutomatonProvider for named automata
        determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
      • RegexpQuery

        public RegexpQuery​(Term term,
                           int syntax_flags,
                           int match_flags,
                           AutomatonProvider provider,
                           int determinizeWorkLimit,
                           MultiTermQuery.RewriteMethod rewriteMethod)
        Constructs a query for terms matching term.
        Parameters:
        term - regular expression.
        syntax_flags - optional RegExp features from RegExp
        match_flags - bitwise 'or' of match behavior options such as case insensitivity
        provider - custom AutomatonProvider for named automata
        determinizeWorkLimit - maximum effort to spend while compiling the automaton from this regexp. Set higher to allow more complex queries and lower to prevent memory exhaustion. Use Operations.DEFAULT_DETERMINIZE_WORK_LIMIT as a decent default if you don't otherwise know what to specify.
        rewriteMethod - the rewrite method to use to build the final query
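        A sketch of the fully specified constructor, assuming no named automata are needed (so DEFAULT_PROVIDER is passed) and that a constant-score rewrite is acceptable. The class name FullRegexpQuery and the helper build are hypothetical; MultiTermQuery.CONSTANT_SCORE_REWRITE is just one of the available rewrite methods.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.RegexpQuery;
import org.apache.lucene.util.automaton.Operations;
import org.apache.lucene.util.automaton.RegExp;

// Hypothetical demo class; not part of Lucene.
class FullRegexpQuery {

    // Spells out every constructor argument explicitly.
    static RegexpQuery build(String field, String regexp) {
        return new RegexpQuery(
                new Term(field, regexp),
                RegExp.ALL,                                 // syntax_flags: all optional syntax
                0,                                          // match_flags: none
                RegexpQuery.DEFAULT_PROVIDER,               // no named automata
                Operations.DEFAULT_DETERMINIZE_WORK_LIMIT,  // default compile effort
                MultiTermQuery.CONSTANT_SCORE_REWRITE);     // rewrite method
    }

    public static void main(String[] args) {
        RegexpQuery query = build("title", "luc[a-z]+e");
        System.out.println(query.getRegexp().text());
        System.out.println(query.getRewriteMethod() == MultiTermQuery.CONSTANT_SCORE_REWRITE);
    }
}
```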
    • Method Detail

      • getRegexp

        public Term getRegexp()
        Returns the regexp of this query wrapped in a Term.