Class PatternTypingFilterFactory

All Implemented Interfaces:
ResourceLoaderAware

public class PatternTypingFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Provides a filter that sets the type attribute and flag bits of tokens matching configured regular-expression patterns. By itself this filter is not very useful; it is normally combined with a filter that reacts to types or flags.
 <fieldType name="text_taf" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.PatternTypingFilterFactory" patternFile="patterns.txt"/>
     <filter class="solr.TokenAnalyzerFilter" asType="text_en" preserveType="true"/>
     <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__"
               ignore="word,&lt;ALPHANUM&gt;,&lt;NUM&gt;,&lt;SOUTHEAST_ASIAN&gt;,&lt;IDEOGRAPHIC&gt;,&lt;HIRAGANA&gt;,&lt;KATAKANA&gt;,&lt;HANGUL&gt;,&lt;EMOJI&gt;"/>
   </analyzer>
 </fieldType>

Note that a configuration such as the one above may interfere with multi-word synonyms. Each rule in the patterns file has the format:

 (flags) (pattern) ::: (replacement)
 
For example, to set the first 2 flag bits (the number 3, binary 11) on any original token matching 401k or 401(k), and to give that token a type of 'legal2_401_k', one would use:
 3 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
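
The effect of the rule above can be illustrated with plain java.util.regex. This is a minimal sketch, not the Lucene implementation: the class name PatternTypingRuleSketch and the method typeFor are hypothetical, and the sketch assumes the pattern must match the whole token, which this page does not state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; names are hypothetical and not part of Lucene.
class PatternTypingRuleSketch {
    // The three parts of the example rule: flags, pattern, replacement.
    static final int FLAGS = 3; // binary 11, i.e. the two lowest flag bits
    static final Pattern PATTERN = Pattern.compile("(\\d+)\\(?([a-z])\\)?");
    static final String TYPE_TEMPLATE = "legal2_$1_$2";

    /** Returns the type the rule would produce, or null if the token does not match. */
    static String typeFor(String token) {
        Matcher m = PATTERN.matcher(token);
        if (!m.matches()) { // assumption: the whole token has to match
            return null;
        }
        // replaceFirst expands $1 and $2 from the capture groups.
        return m.replaceFirst(TYPE_TEMPLATE);
    }

    public static void main(String[] args) {
        System.out.println(typeFor("401k"));       // legal2_401_k
        System.out.println(typeFor("401(k)"));     // legal2_401_k
        System.out.println(typeFor("retirement")); // null
        System.out.println(Integer.toBinaryString(FLAGS)); // 11
    }
}
```

Both spellings yield the same type because the parentheses in the token are matched by the optional \(? and \)? and are not captured.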
 
Note that the number indicating the flag bits to set must not have leading spaces, must be followed by a single space, and must be 0 if no flags should be set. The flags number should not contain commas or a decimal point. Lines whose first character is # are ignored as comments. The filter does not support producing a synonym textually identical to the original term.
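
The line format described above can be sketched as a small parser. This is only an illustration under stated assumptions, not how the factory reads its file: PatternFileSketch, Rule, and parse are hypothetical names, and skipping empty lines is an assumption that this page does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only; names are hypothetical and not part of Lucene.
class PatternFileSketch {
    static final class Rule {
        final int flags;
        final Pattern pattern;
        final String typeTemplate;

        Rule(int flags, Pattern pattern, String typeTemplate) {
            this.flags = flags;
            this.pattern = pattern;
            this.typeTemplate = typeTemplate;
        }
    }

    static final String SEPARATOR = " ::: ";

    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // assumption: empty lines are skipped
            }
            if (line.charAt(0) == '#') {
                continue; // comment line
            }
            int firstSpace = line.indexOf(' ');
            int sep = line.indexOf(SEPARATOR);
            // firstSpace <= 0 rejects leading spaces and a missing flags number;
            // sep <= firstSpace rejects a missing separator or an empty pattern.
            if (firstSpace <= 0 || sep <= firstSpace) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            // parseInt throws NumberFormatException on commas or a decimal point.
            int flags = Integer.parseInt(line.substring(0, firstSpace));
            String regex = line.substring(firstSpace + 1, sep);
            String template = line.substring(sep + SEPARATOR.length());
            rules.add(new Rule(flags, Pattern.compile(regex), template));
        }
        return rules;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse(Arrays.asList(
            "# retirement account spellings",
            "3 (\\d+)\\(?([a-z])\\)? ::: legal2_$1_$2"));
        Rule r = rules.get(0);
        System.out.println(r.flags + " " + r.pattern + " " + r.typeTemplate);
    }
}
```

NumberFormatException is a subclass of IllegalArgumentException, so a caller of this sketch can handle both kinds of malformed line with a single catch.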
Since:
8.8
SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
"patternTyping"