Class PatternTypingFilterFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenFilterFactory
org.apache.lucene.analysis.pattern.PatternTypingFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
Provides a filter that sets a type, and optionally flag bits, on tokens that match regular
expression patterns read from a patterns file. By itself this filter is not very useful. Normally
it is combined with a filter that reacts to types or flags.
<fieldType name="text_taf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="com.example.PatternTypingFilter" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilter" asType="text_en" preserveType="true"/>
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__"
            ignore="word,<ALPHANUM>,<NUM>,<SOUTHEAST_ASIAN>,<IDEOGRAPHIC>,<HIRAGANA>,<KATAKANA>,<HANGUL>,<EMOJI>"/>
  </analyzer>
</fieldType>
Note that a configuration such as above may interfere with multi-word synonyms. The patterns file has the format:
(flags) (pattern) ::: (replacement)
Therefore, to set the first 2 flag bits on the original token matching 401k or 401(k) and add a type of 'legal2_401_k' whenever either one is encountered, one would use:
3 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
Note that the number indicating the flag bits to set must not have leading spaces, must be followed by a single space, and must be 0 if no flags should be set. The flags number should not contain commas or a decimal point. Lines for which the first character is # will be ignored as comments. Does not support producing a synonym textually identical to the original term.
- Since:
- 8.8
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "patternTyping"
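The patterns file format described above can be illustrated with a small standalone sketch that uses only java.util.regex. It is not the Lucene implementation: the class PatternRuleSketch, its nested Rule type, and the methods parse and typeFor are hypothetical names introduced here, and the sketch assumes that a rule applies only when the pattern matches the whole token, which the description above does not state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Lucene: parses and applies one patterns-file line. */
public class PatternRuleSketch {

  /** One parsed rule: flag bits to set, the regex, and the type template. */
  static final class Rule {
    final int flags;
    final Pattern pattern;
    final String typeTemplate;

    Rule(int flags, Pattern pattern, String typeTemplate) {
      this.flags = flags;
      this.pattern = pattern;
      this.typeTemplate = typeTemplate;
    }

    /** Returns the expanded type if the whole token matches, otherwise null (an assumption). */
    String typeFor(String token) {
      Matcher m = pattern.matcher(token);
      if (!m.matches()) {
        return null;
      }
      String result = typeTemplate;
      // Substitute $n with group n, highest index first so $10 is not clobbered by $1.
      for (int i = m.groupCount(); i >= 1; i--) {
        String group = m.group(i) == null ? "" : m.group(i);
        result = result.replace("$" + i, group);
      }
      return result;
    }
  }

  /** Parses "(flags) (pattern) ::: (replacement)"; returns null for '#' comment lines. */
  static Rule parse(String line) {
    if (!line.isEmpty() && line.charAt(0) == '#') {
      return null;
    }
    int firstSpace = line.indexOf(' ');
    int separator = line.indexOf(" ::: ");
    if (firstSpace <= 0 || separator <= firstSpace) {
      throw new IllegalArgumentException("Malformed rule line: " + line);
    }
    // The flags number has no leading spaces and is followed by a single space.
    int flags = Integer.parseInt(line.substring(0, firstSpace));
    String regex = line.substring(firstSpace + 1, separator);
    String template = line.substring(separator + " ::: ".length());
    return new Rule(flags, Pattern.compile(regex), template);
  }

  public static void main(String[] args) {
    Rule rule = parse("3 (\\d+)\\(?([a-z])\\)? ::: legal2_$1_$2");
    System.out.println(rule.flags);             // prints 3 (binary 11: the first 2 flag bits)
    System.out.println(rule.typeFor("401k"));   // prints legal2_401_k
    System.out.println(rule.typeFor("401(k)")); // prints legal2_401_k
    System.out.println(rule.typeFor("hello"));  // prints null
  }
}
```

Both 401k and 401(k) expand to the same type because the parentheses sit outside the capture groups, so only the digits and the letter reach the template.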
-
Field Summary
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
Constructor Summary
Constructor / Description
Default ctor for compatibility with SPI
Creates a new PatternTypingFilterFactory
-
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
Field Details
-
NAME
SPI name
- See Also:
-
-
Constructor Details
-
PatternTypingFilterFactory
Creates a new PatternTypingFilterFactory
-
PatternTypingFilterFactory
public PatternTypingFilterFactory()
Default ctor for compatibility with SPI
-
-
Method Details
-
inform
- Specified by:
inform
in interface ResourceLoaderAware
- Throws:
IOException
-
create
- Specified by:
create
in class TokenFilterFactory
-