Class AutomatonQuery

All Implemented Interfaces:
Accountable
Direct Known Subclasses:
PrefixQuery, RegexpQuery, TermRangeQuery, WildcardQuery

public class AutomatonQuery extends MultiTermQuery implements Accountable
A Query that will match terms against a finite-state machine.

This query will match documents that contain terms accepted by a given finite-state machine. The automaton can be constructed with the org.apache.lucene.util.automaton API. Alternatively, it can be created from a regular expression with RegexpQuery or from the standard Lucene wildcard syntax with WildcardQuery.

When the query is executed, it creates an equivalent DFA of the finite-state machine and enumerates the term dictionary selectively, skipping ranges of terms the DFA cannot accept, which reduces the number of comparisons. For example, the regular expression [dl]og? makes approximately four comparisons: do, dog, lo, and log.
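
The four-comparison example can be modeled for the finite case in a few lines. The sketch below is a toy and not Lucene's implementation: it hard-codes a DFA for [dl]og?, walks it in sorted label order, and probes a sorted dictionary once per accepted string. The class name DfaProbeSketch, its fields, and the TreeSet standing in for the term dictionary are all hypothetical, and the approach only works when the automaton accepts finitely many strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model: an acyclic DFA for [dl]og? probed against a sorted term dictionary. */
class DfaProbeSketch {
  // TRANSITIONS.get(state) maps an input character to the next state (state 0 is the start).
  static final List<TreeMap<Character, Integer>> TRANSITIONS = new ArrayList<>();
  // States 2 ("do"/"lo") and 3 ("dog"/"log") are accepting.
  static final boolean[] ACCEPT = {false, false, true, true};
  // Number of dictionary lookups made by the most recent call to matchingTerms.
  static int probes = 0;

  static {
    for (int i = 0; i < 4; i++) TRANSITIONS.add(new TreeMap<>());
    TRANSITIONS.get(0).put('d', 1);
    TRANSITIONS.get(0).put('l', 1);
    TRANSITIONS.get(1).put('o', 2);
    TRANSITIONS.get(2).put('g', 3);
  }

  /** Returns the dictionary terms accepted by the DFA, in sorted order. */
  static List<String> matchingTerms(TreeSet<String> dictionary) {
    probes = 0;
    List<String> hits = new ArrayList<>();
    walk(0, new StringBuilder(), dictionary, hits);
    return hits;
  }

  // Depth-first walk in sorted label order; probes the dictionary at each accepting state.
  private static void walk(
      int state, StringBuilder prefix, TreeSet<String> dict, List<String> hits) {
    if (ACCEPT[state]) {
      probes++;
      if (dict.contains(prefix.toString())) hits.add(prefix.toString());
    }
    for (Map.Entry<Character, Integer> e : TRANSITIONS.get(state).entrySet()) {
      prefix.append(e.getKey());
      walk(e.getValue(), prefix, dict, hits);
      prefix.setLength(prefix.length() - 1);
    }
  }
}
```

With a dictionary containing cat, dog, dot, log, and logs, matchingTerms returns dog and log after four probes (do, dog, lo, log), however many other terms the dictionary holds.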

WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Field Details

    • automaton

      protected final Automaton automaton
      the automaton to match index terms against
    • compiled

      protected final CompiledAutomaton compiled
    • term

      protected final Term term
      term containing the field, and possibly some pattern structure
    • automatonIsBinary

      protected final boolean automatonIsBinary
  • Constructor Details

    • AutomatonQuery

      public AutomatonQuery(Term term, Automaton automaton)
      Create a new AutomatonQuery from an Automaton.
      Parameters:
      term - Term containing field and possibly some pattern structure. The term text is ignored.
      automaton - Automaton to run; terms that it accepts are considered a match.
    • AutomatonQuery

      public AutomatonQuery(Term term, Automaton automaton, int determinizeWorkLimit)
      Create a new AutomatonQuery from an Automaton.
      Parameters:
      term - Term containing field and possibly some pattern structure. The term text is ignored.
      automaton - Automaton to run; terms that it accepts are considered a match.
      determinizeWorkLimit - maximum effort to spend determinizing the automaton. If the automaton would need more than this much effort, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex automata.
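
To illustrate what an effort limit on determinization means, here is a minimal sketch of subset construction with a budget. It is not Lucene's implementation: the NFA representation, the class BoundedDeterminizeSketch, the exception TooComplexException (a hypothetical stand-in for TooComplexToDeterminizeException), and the choice to count one unit of effort per NFA state visited are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy subset construction with an effort budget. Not Lucene's implementation. */
class BoundedDeterminizeSketch {
  /** Hypothetical stand-in for TooComplexToDeterminizeException. */
  static class TooComplexException extends RuntimeException {
    TooComplexException(String message) {
      super(message);
    }
  }

  /**
   * nfa.get(s) maps a character to the set of NFA states reachable from s; state 0 is the start.
   * Returns the number of DFA states built, or throws once the budget is spent.
   */
  static int determinize(List<Map<Character, Set<Integer>>> nfa, int workLimit) {
    Map<Set<Integer>, Integer> dfaStates = new HashMap<>();
    Deque<Set<Integer>> pending = new ArrayDeque<>();
    Set<Integer> start = new TreeSet<>(Set.of(0));
    dfaStates.put(start, 0);
    pending.add(start);
    int effort = 0; // assumed unit: one per NFA state visited
    while (!pending.isEmpty()) {
      Set<Integer> current = pending.remove();
      // Group the targets of every member NFA state by input character.
      Map<Character, Set<Integer>> moves = new TreeMap<>();
      for (int s : current) {
        if (++effort > workLimit) {
          throw new TooComplexException("effort exceeded " + workLimit);
        }
        for (Map.Entry<Character, Set<Integer>> e : nfa.get(s).entrySet()) {
          moves.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
        }
      }
      // Each distinct target set becomes a DFA state the first time it is seen.
      for (Set<Integer> target : moves.values()) {
        if (dfaStates.putIfAbsent(target, dfaStates.size()) == null) {
          pending.add(target);
        }
      }
    }
    return dfaStates.size();
  }
}
```

A four-state NFA for (a|b)*a(a|b)(a|b) expands to eight DFA states under this construction, so a generous budget succeeds while a budget of 5 throws. The same trade-off is what the parameter describes: a higher limit admits more complex automata at the cost of more space.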
    • AutomatonQuery

      public AutomatonQuery(Term term, Automaton automaton, int determinizeWorkLimit, boolean isBinary)
      Create a new AutomatonQuery from an Automaton.
      Parameters:
      term - Term containing field and possibly some pattern structure. The term text is ignored.
      automaton - Automaton to run; terms that it accepts are considered a match.
      determinizeWorkLimit - maximum effort to spend determinizing the automaton. If the automaton would need more than this much effort, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex automata.
      isBinary - if true, this automaton is already binary and will not go through the UTF32ToUTF8 conversion
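
The distinction behind isBinary is the unit a transition consumes: a Unicode code point or a single byte of the UTF-8 encoded term. The sketch below uses only the JDK to show the two lengths diverging for non-ASCII text; the class TermUnitsSketch and its method names are hypothetical and are not part of Lucene.

```java
import java.nio.charset.StandardCharsets;

/** Shows why a code-point automaton and a byte automaton see the same term differently. */
class TermUnitsSketch {
  /** Number of transitions a code-point (UTF-32) automaton would take over the term. */
  static int codePointLength(String term) {
    return term.codePointCount(0, term.length());
  }

  /** Number of transitions a byte (UTF-8) automaton would take over the same term. */
  static int utf8Length(String term) {
    return term.getBytes(StandardCharsets.UTF_8).length;
  }
}
```

For an ASCII term such as log both methods return 3. For a term ending in U+00E9 (for example caf followed by that character), the code-point length is 4 and the UTF-8 length is 5, which is why an automaton built over code points is converted before it is matched against byte-encoded terms unless it is already binary.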
    • AutomatonQuery

      public AutomatonQuery(Term term, Automaton automaton, int determinizeWorkLimit, boolean isBinary, MultiTermQuery.RewriteMethod rewriteMethod)
      Create a new AutomatonQuery from an Automaton.
      Parameters:
      term - Term containing field and possibly some pattern structure. The term text is ignored.
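      automaton - Automaton to run; terms that it accepts are considered a match.
      determinizeWorkLimit - maximum effort to spend determinizing the automaton. If the automaton would need more than this much effort, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex automata.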
      isBinary - if true, this automaton is already binary and will not go through the UTF32ToUTF8 conversion
      rewriteMethod - the rewriteMethod to use to build the final query from the automaton
  • Method Details

    • getTermsEnum

      protected TermsEnum getTermsEnum(Terms terms, AttributeSource atts) throws IOException
      Description copied from class: MultiTermQuery
      Construct the enumeration to be used, expanding the pattern term. This method should only be called if the field exists (i.e., implementations can assume the field does exist). This method should not return null; it should return TermsEnum.EMPTY if no terms match. The TermsEnum must already be positioned to the first matching term. The given AttributeSource is passed by the MultiTermQuery.RewriteMethod to share information between segments; for example, TopTermsRewrite uses it to share maximum competitive boosts.
      Specified by:
      getTermsEnum in class MultiTermQuery
      Throws:
      IOException
    • hashCode

      public int hashCode()
      Description copied from class: Query
      Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
      Overrides:
      hashCode in class MultiTermQuery
    • equals

      public boolean equals(Object obj)
      Description copied from class: Query
      Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.

      Typically a query will be equal to another only if it is an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.

      Overrides:
      equals in class MultiTermQuery
    • toString

      public String toString(String field)
      Description copied from class: Query
      Prints a query to a string, with field assumed to be the default field and omitted.
      Specified by:
      toString in class Query
    • visit

      public void visit(QueryVisitor visitor)
      Description copied from class: Query
      Recurse through the query tree, visiting any child queries.
      Specified by:
      visit in class Query
      Parameters:
      visitor - a QueryVisitor to be called by each query in the tree
    • getAutomaton

      public Automaton getAutomaton()
      Returns the automaton used to create this query.
    • isAutomatonBinary

      public boolean isAutomatonBinary()
      Returns true if this is a binary (byte-oriented) automaton. See the isBinary constructor parameter.
    • ramBytesUsed

      public long ramBytesUsed()
      Description copied from interface: Accountable
      Return the memory usage of this object in bytes. Negative values are illegal.
      Specified by:
      ramBytesUsed in interface Accountable