Class RegexpQueryHandler

  • All Implemented Interfaces:
    CustomQueryHandler

    public class RegexpQueryHandler
    extends Object
    implements CustomQueryHandler
    A query handler implementation that matches regexp queries by indexing each regex under its longest static substring, and by generating ngrams from Document tokens to match against those substrings.

    This implementation filters out more wildcard queries than TermFilteredPresearcher, at the expense of longer document build times. Which one performs better depends on the type and number of queries registered in the Monitor and on the size of the documents being monitored, so profiling is recommended.
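
The matching idea described above can be sketched in plain Java, without any Lucene dependency. This is a minimal illustration, not the handler's actual code: the class and method names are hypothetical, the "XX" suffix is an arbitrary choice for the example, and the regex parsing only understands the `.` and `.*` wildcards.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: none of these names come from the Lucene Monitor API.
final class RegexpNgramSketch {

    // Assumed suffix used to mark ngram terms; chosen for this example only.
    static final String SUFFIX = "XX";

    // Query side: split the regex on "." or ".*" and keep the longest literal piece.
    // Any other regex syntax is outside the scope of this sketch.
    static String longestStaticSubstring(String regexp) {
        String longest = "";
        for (String part : regexp.split("\\.\\*?")) {
            if (part.length() > longest.length()) {
                longest = part;
            }
        }
        return longest;
    }

    // Document side: emit every substring of the token, each marked with the suffix.
    static Set<String> ngrams(String token) {
        Set<String> out = new LinkedHashSet<>();
        for (int start = 0; start < token.length(); start++) {
            for (int end = start + 1; end <= token.length(); end++) {
                out.add(token.substring(start, end) + SUFFIX);
            }
        }
        return out;
    }

    // A query is a candidate for a token if its static substring is one of the token's ngrams.
    static boolean mightMatch(String regexp, String docToken) {
        String literal = longestStaticSubstring(regexp);
        if (literal.isEmpty()) {
            return true; // nothing static to filter on, so the query must stay a candidate
        }
        return ngrams(docToken).contains(literal + SUFFIX);
    }

    public static void main(String[] args) {
        System.out.println(longestStaticSubstring("foo.*barbaz"));   // barbaz
        System.out.println(ngrams("abc"));                           // [aXX, abXX, abcXX, bXX, bcXX, cXX]
        System.out.println(mightMatch("foo.*barbaz", "fooXbarbaz")); // true
        System.out.println(mightMatch("foo.*barbaz", "foobar"));     // false
        System.out.println(mightMatch("foo.*barbaz", "barbazfoo"));  // true (a false positive)
    }
}
```

The check looks for a single substring only, so a positive result means the query still has to be run against the document, while a negative result lets it be skipped.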

    • Field Detail

      • DEFAULT_NGRAM_SUFFIX

        public static final String DEFAULT_NGRAM_SUFFIX
        The default suffix with which to mark ngrams
        See Also:
        Constant Field Values
      • DEFAULT_MAX_TOKEN_SIZE

        public static final int DEFAULT_MAX_TOKEN_SIZE
        The default maximum length of an input token before WILDCARD tokens are generated
        See Also:
        Constant Field Values
      • DEFAULT_WILDCARD_TOKEN

        public static final String DEFAULT_WILDCARD_TOKEN
        The default token to emit if a token is longer than the maximum token size
        See Also:
        Constant Field Values
    • Constructor Detail

      • RegexpQueryHandler

        public RegexpQueryHandler(String ngramSuffix,
                                  int maxTokenSize,
                                  String wildcardToken,
                                  Set<String> excludedFields)
        Creates a new RegexpQueryHandler
        Parameters:
        ngramSuffix - the suffix with which to mark ngrams
        maxTokenSize - the maximum length of an input token before WILDCARD tokens are generated
        wildcardToken - the token to emit if a token is longer than maxTokenSize in length
        excludedFields - a Set of fields to ignore when generating ngrams
      • RegexpQueryHandler

        public RegexpQueryHandler()
        Creates a new RegexpQueryHandler using default settings
      • RegexpQueryHandler

        public RegexpQueryHandler(int maxTokenSize)
        Creates a new RegexpQueryHandler with a maximum token size
        Parameters:
        maxTokenSize - the maximum length of an input token before WILDCARD tokens are generated
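
To illustrate why maxTokenSize and wildcardToken exist, the following plain-Java sketch counts the ngram terms produced for a token and shows the cut-off. It is an assumption-laden illustration rather than the handler's implementation: the class name, method name, and constant values are hypothetical, and it assumes an over-long token is replaced by the wildcard token alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and values are for illustration, not the library defaults.
final class TokenCapSketch {

    static final String NGRAM_SUFFIX = "XX";             // assumed ngram marker
    static final String WILDCARD_TOKEN = "__WILDCARD__"; // assumed wildcard token

    // Returns the terms emitted for one input token under the given size cap.
    static List<String> termsFor(String token, int maxTokenSize) {
        List<String> terms = new ArrayList<>();
        if (token.length() > maxTokenSize) {
            // Assumption: an over-long token yields only the wildcard token.
            terms.add(WILDCARD_TOKEN);
            return terms;
        }
        // Otherwise emit every substring; a token of length n yields n * (n + 1) / 2 terms.
        for (int start = 0; start < token.length(); start++) {
            for (int end = start + 1; end <= token.length(); end++) {
                terms.add(token.substring(start, end) + NGRAM_SUFFIX);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(termsFor("abcd", 4).size());                            // 10
        System.out.println(termsFor("abcde", 4));                                  // [__WILDCARD__]
        System.out.println(termsFor("abcdefghijklmnopqrstuvwxyz0123", 30).size()); // 465
    }
}
```

The number of terms grows quadratically with token length, which is one way to read the longer document build times mentioned in the class description: a lower maxTokenSize bounds that cost, while tokens above the cap can no longer be used to rule out regexp queries.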