Class PatternTypingFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenFilterFactory
      - org.apache.lucene.analysis.pattern.PatternTypingFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
public class PatternTypingFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Provides a filter that assigns a type, and optionally sets flag bits, on tokens that match regular-expression patterns read from a patterns file. By itself this filter is not very useful. Normally it is combined with a filter that reacts to types or flags.
<fieldType name="text_taf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="com.example.PatternTypingFilter" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilter" asType="text_en" preserveType="true"/>
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__"
            ignore="word,&lt;ALPHANUM&gt;,&lt;NUM&gt;,&lt;SOUTHEAST_ASIAN&gt;,&lt;IDEOGRAPHIC&gt;,&lt;HIRAGANA&gt;,&lt;KATAKANA&gt;,&lt;HANGUL&gt;,&lt;EMOJI&gt;"/>
  </analyzer>
</fieldType>
Note that a configuration such as above may interfere with multi-word synonyms. The patterns file has the format:
(flags) (pattern) ::: (replacement)
Therefore, to set the first 2 flag bits on an original token matching 401k or 401(k), and to add a type of 'legal2_401_k' whenever either one is encountered, one would use:
3 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
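To make the rule format concrete, here is a minimal sketch of how one line of the patterns file could be parsed and applied using only java.util.regex. It is an illustration under stated assumptions, not the Lucene implementation: the class PatternRuleSketch, the holder Rule, and the methods parseLine and typeFor are hypothetical names, and requiring the pattern to match the whole token is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternRuleSketch {

  /** One parsed rule line (hypothetical holder, not a Lucene class). */
  static final class Rule {
    final int flags;            // bit mask to set on a matching token
    final Pattern pattern;      // regular expression tested against the token
    final String typeTemplate;  // replacement text, may reference $1, $2, ...

    Rule(int flags, Pattern pattern, String typeTemplate) {
      this.flags = flags;
      this.pattern = pattern;
      this.typeTemplate = typeTemplate;
    }
  }

  /** Parses "(flags) (pattern) ::: (replacement)"; returns null for '#' comment lines. */
  static Rule parseLine(String line) {
    if (line.startsWith("#")) {
      return null;
    }
    // The flags number has no leading spaces and is followed by a single space.
    int firstSpace = line.indexOf(' ');
    if (firstSpace < 1) {
      throw new IllegalArgumentException("Missing flags number: " + line);
    }
    int flags = Integer.parseInt(line.substring(0, firstSpace));
    String rest = line.substring(firstSpace + 1);
    String separator = " ::: ";
    int sep = rest.indexOf(separator);
    if (sep < 0) {
      throw new IllegalArgumentException("Missing ' ::: ' separator: " + line);
    }
    Pattern pattern = Pattern.compile(rest.substring(0, sep));
    String template = rest.substring(sep + separator.length());
    return new Rule(flags, pattern, template);
  }

  /** Returns the type for a token matched in full by the rule, or null if it does not match. */
  static String typeFor(Rule rule, String token) {
    Matcher m = rule.pattern.matcher(token);
    if (!m.matches()) {
      return null;
    }
    // Expand $1, $2, ... in the template using the groups of the full match.
    StringBuffer sb = new StringBuffer();
    m.appendReplacement(sb, rule.typeTemplate);
    return sb.toString();
  }

  public static void main(String[] args) {
    Rule rule = parseLine("3 (\\d+)\\(?([a-z])\\)? ::: legal2_$1_$2");
    System.out.println(typeFor(rule, "401k"));         // legal2_401_k
    System.out.println(typeFor(rule, "401(k)"));       // legal2_401_k
    System.out.println(typeFor(rule, "pension"));      // null
    System.out.println((rule.flags & 0b11) == 0b11);   // true: the first 2 flag bits are set
  }
}
```

In this sketch the flags value 3 is binary 11, which is why it corresponds to "the first 2 flag bits" in the rule above.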
Note that the number indicating the flag bits to set must not have leading spaces, must be followed by a single space, and must be 0 if no flags should be set. The flags number should not contain commas or a decimal point. Lines whose first character is # are ignored as comments. Does not support producing a synonym textually identical to the original term.
- Since:
- 8.8
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "patternTyping"
Field Summary
Fields
Modifier and Type    Field    Description
static String        NAME     SPI name
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                          Description
PatternTypingFilterFactory()                         Default ctor for compatibility with SPI
PatternTypingFilterFactory(Map<String,String> args)  Creates a new PatternTypingFilterFactory
Method Summary
Modifier and Type    Method
TokenStream          create(TokenStream input)
void                 inform(ResourceLoader loader)
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
Method Detail
inform
public void inform(ResourceLoader loader) throws IOException
- Specified by:
inform in interface ResourceLoaderAware
- Throws:
IOException
create
public TokenStream create(TokenStream input)
- Specified by:
create in class TokenFilterFactory