public class ProtectedTermFilterFactory extends ConditionalTokenFilterFactory implements ResourceLoaderAware
Factory for a ProtectedTermFilter.
CustomAnalyzer example:
    Analyzer ana = CustomAnalyzer.builder()
        .withTokenizer("standard")
        .when("protectedterm", "ignoreCase", "true", "protected", "protectedTerms.txt")
          .addTokenFilter("truncate", "prefixLength", "4")
          .addTokenFilter("lowercase")
        .endwhen()
        .build();
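To make the effect of the example above concrete, here is a minimal plain-Java model of it. It is a sketch, not Lucene code: the class ProtectedTermSketch and its apply method are hypothetical names, whitespace splitting stands in for the standard tokenizer, and case-insensitive matching is approximated by lowercasing. Tokens found in the protected set pass through unchanged; all others are truncated to the prefix length and then lowercased.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Simplified model of a protected-term conditional chain; hypothetical, not part of Lucene. */
final class ProtectedTermSketch {

    /**
     * Applies truncate(prefixLength) and then lowercase to every token that is
     * not in the protected set. Protected tokens are emitted untouched.
     */
    static List<String> apply(String text, Set<String> protectedTerms,
                              boolean ignoreCase, int prefixLength) {
        // Assumption: ignoreCase is modeled by comparing lowercased forms.
        Set<String> lookup = ignoreCase
            ? protectedTerms.stream()
                  .map(t -> t.toLowerCase(Locale.ROOT))
                  .collect(Collectors.toSet())
            : protectedTerms;

        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields no tokens
            }
            String key = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
            if (lookup.contains(key)) {
                // Protected: bypass the wrapped filters entirely.
                out.add(token);
            } else {
                // Not protected: truncate first, then lowercase, in declaration order.
                String truncated = token.length() > prefixLength
                    ? token.substring(0, prefixLength)
                    : token;
                out.add(truncated.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [luce, FooBar, solr]
        System.out.println(apply("Lucene FooBar Solr", Set.of("foobar"), true, 4));
    }
}
```

In this model, "FooBar" keeps its original casing and length because it matches the protected entry "foobar" when case is ignored.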
Solr example, in which conditional filters are specified via the wrappedFilters parameter (a comma-separated list of case-insensitive TokenFilter SPI names) and conditional filter args are specified via filterName.argName parameters:
    <fieldType name="reverse_lower_with_exceptions" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
                wrappedFilters="truncate,lowercase" truncate.prefixLength="4" />
      </analyzer>
    </fieldType>
Each filter name in the wrappedFilters parameter must be unique. To specify the same filter more than once, add a unique, case-insensitive '-id' suffix to each occurrence (the '-id' suffix is stripped prior to SPI lookup), e.g.:
    <fieldType name="double_synonym_with_exceptions" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
                wrappedFilters="synonymgraph-A,synonymgraph-B"
                synonymgraph-A.synonyms="synonyms-1.txt"
                synonymgraph-B.synonyms="synonyms-2.txt"/>
      </analyzer>
    </fieldType>
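The naming rules above can be modeled in a few lines of plain Java. This is a sketch under assumptions, not the factory's actual implementation: the class WrappedFilterArgsSketch and its methods are hypothetical, the separators '-' and '.' are taken from the examples on this page, and names are compared after lowercasing.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical model of wrappedFilters parsing; not part of Lucene or Solr. */
final class WrappedFilterArgsSketch {

    /** Strips an optional "-id" suffix to obtain the name used for SPI lookup. */
    static String spiName(String wrappedName) {
        int idx = wrappedName.indexOf('-');
        return idx < 0 ? wrappedName : wrappedName.substring(0, idx);
    }

    /**
     * Groups "filterName.argName" entries under each name listed in wrappedFilters.
     * Throws IllegalArgumentException if a name is listed twice, ignoring case.
     */
    static Map<String, Map<String, String>> group(String wrappedFilters,
                                                  Map<String, String> args) {
        Map<String, Map<String, String>> byFilter = new LinkedHashMap<>();
        for (String raw : wrappedFilters.split(",")) {
            String name = raw.trim().toLowerCase(Locale.ROOT);
            if (byFilter.putIfAbsent(name, new LinkedHashMap<>()) != null) {
                throw new IllegalArgumentException("Duplicate wrapped filter name: " + name);
            }
        }
        for (Map.Entry<String, String> e : args.entrySet()) {
            int dot = e.getKey().indexOf('.');
            if (dot < 0) {
                continue; // not a per-filter argument, e.g. ignoreCase or protected
            }
            String filter = e.getKey().substring(0, dot).toLowerCase(Locale.ROOT);
            Map<String, String> filterArgs = byFilter.get(filter);
            if (filterArgs != null) {
                filterArgs.put(e.getKey().substring(dot + 1), e.getValue());
            }
        }
        return byFilter;
    }

    public static void main(String[] unused) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("ignoreCase", "true");
        args.put("synonymgraph-A.synonyms", "synonyms-1.txt");
        args.put("synonymgraph-B.synonyms", "synonyms-2.txt");
        group("synonymgraph-A,synonymgraph-B", args).forEach((name, filterArgs) ->
            System.out.println(name + " -> SPI name " + spiName(name) + ", args " + filterArgs));
    }
}
```

Both entries resolve to the same SPI name, synonymgraph, while keeping separate argument maps, which is why the suffixes must be unique.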
See also the related CustomAnalyzer.Builder.whenTerm(Predicate).
Modifier and Type | Field and Description |
---|---|
static char | FILTER_ARG_SEPARATOR |
static char | FILTER_NAME_ID_SEPARATOR |
static String | NAME |
static String | PROTECTED_TERMS |
Fields inherited from class AbstractAnalysisFactory: LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor and Description |
---|
ProtectedTermFilterFactory(Map<String,String> args) |
Modifier and Type | Method and Description |
---|---|
protected ConditionalTokenFilter | create(TokenStream input, Function<TokenStream,TokenStream> inner): Modify the incoming TokenStream with a ConditionalTokenFilter |
void | doInform(ResourceLoader loader): Initialises this component with the corresponding ResourceLoader |
CharArraySet | getProtectedTerms() |
boolean | isIgnoreCase() |
Methods inherited from class ConditionalTokenFilterFactory: create, inform, setInnerFilters
Methods inherited from class TokenFilterFactory: availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class AbstractAnalysisFactory: get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ResourceLoaderAware: inform
public static final String NAME
public static final String PROTECTED_TERMS
public static final char FILTER_ARG_SEPARATOR
public static final char FILTER_NAME_ID_SEPARATOR
public boolean isIgnoreCase()
public CharArraySet getProtectedTerms()
protected ConditionalTokenFilter create(TokenStream input, Function<TokenStream,TokenStream> inner)
Description copied from class: ConditionalTokenFilterFactory
Modify the incoming TokenStream with a ConditionalTokenFilter
Specified by: create in class ConditionalTokenFilterFactory
public void doInform(ResourceLoader loader) throws IOException
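Description copied from class: ConditionalTokenFilterFactory
Initialises this component with the corresponding ResourceLoader
Overrides: doInform in class ConditionalTokenFilterFactory
Throws: IOException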
Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.