Class ProtectedTermFilterFactory
- java.lang.Object
- org.apache.lucene.analysis.util.AbstractAnalysisFactory
- org.apache.lucene.analysis.util.TokenFilterFactory
- org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
- org.apache.lucene.analysis.miscellaneous.ProtectedTermFilterFactory
- All Implemented Interfaces: ResourceLoaderAware
public class ProtectedTermFilterFactory extends ConditionalTokenFilterFactory implements ResourceLoaderAware
Factory for a ProtectedTermFilter.
CustomAnalyzer example:
  Analyzer ana = CustomAnalyzer.builder()
      .withTokenizer("standard")
      .when("protectedterm", "ignoreCase", "true", "protected", "protectedTerms.txt")
        .addTokenFilter("truncate", "prefixLength", "4")
        .addTokenFilter("lowercase")
      .endwhen()
      .build();
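To make the effect of the analyzer above concrete, here is a minimal sketch in plain Java that imitates it on strings: tokens found in the protected set pass through untouched, and every other token is truncated to the prefix length and then lowercased. ProtectedTermSketch and its analyze method are hypothetical names invented for this illustration, not part of Lucene, and the case-insensitive TreeSet is only a stand-in for the CharArraySet the real factory returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in (not a Lucene class) that imitates the analyzer above on plain strings. */
public class ProtectedTermSketch {

    /** Applies truncate(prefixLength) and then lowercase to every token that is not protected. */
    public static List<String> analyze(List<String> tokens, Set<String> protectedTerms,
                                       boolean ignoreCase, int prefixLength) {
        // Assumption: a case-insensitive set approximates ignoreCase="true".
        Set<String> lookup = ignoreCase
                ? new TreeSet<>(String.CASE_INSENSITIVE_ORDER)
                : new TreeSet<>();
        lookup.addAll(protectedTerms);

        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (lookup.contains(token)) {
                // Protected term: the wrapped filters are skipped entirely.
                out.add(token);
            } else {
                // Unprotected term: wrapped filters run in their declared order.
                String truncated = token.length() > prefixLength
                        ? token.substring(0, prefixLength)
                        : token;
                out.add(truncated.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> result = analyze(
                List.of("Lucene", "SEARCHING", "index"), Set.of("lucene"), true, 4);
        System.out.println(result); // prints [Lucene, sear, inde]
    }
}
```

With ignoreCase set to false in this sketch, "Lucene" would no longer match the protected entry "lucene" and would come out as "luce".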
Solr example, in which conditional filters are specified via the wrappedFilters parameter (a comma-separated list of case-insensitive TokenFilter SPI names) and conditional filter args are specified via filterName.argName parameters:
  <fieldType name="reverse_lower_with_exceptions" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
              wrappedFilters="truncate,lowercase" truncate.prefixLength="4" />
    </analyzer>
  </fieldType>
When using the wrappedFilters parameter, each filter name must be unique. If you need to specify the same filter more than once, you must add case-insensitive unique '-id' suffixes (note that the '-id' suffix is stripped prior to SPI lookup), e.g.:
  <fieldType name="double_synonym_with_exceptions" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
              wrappedFilters="synonymgraph-A,synonymgraph-B"
              synonymgraph-A.synonyms="synonyms-1.txt"
              synonymgraph-B.synonyms="synonyms-2.txt"/>
    </analyzer>
  </fieldType>
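The naming rules above can be sketched in plain Java: split wrappedFilters on commas, reject duplicate names, group filterName.argName attributes under their filter, and strip the '-id' suffix before the SPI lookup. This is an illustration under assumptions, not the factory's actual implementation. WrappedFiltersSketch, spiName and groupArgs are hypothetical names, and the two separator values are inferred from the syntax shown in the examples rather than read from the class constants.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch (not Lucene code) of the wrappedFilters naming rules. */
public class WrappedFiltersSketch {
    // Assumed values, inferred from "filterName.argName" and the '-id' suffix.
    static final char FILTER_ARG_SEPARATOR = '.';
    static final char FILTER_NAME_ID_SEPARATOR = '-';

    /** Returns the name used for SPI lookup: the part before the '-id' suffix, if any. */
    public static String spiName(String wrappedName) {
        int idx = wrappedName.indexOf(FILTER_NAME_ID_SEPARATOR);
        return idx < 0 ? wrappedName : wrappedName.substring(0, idx);
    }

    /** Groups "filterName.argName" attributes under each name listed in wrappedFilters. */
    public static Map<String, Map<String, String>> groupArgs(Map<String, String> attrs) {
        Map<String, Map<String, String>> byFilter = new LinkedHashMap<>();
        for (String name : attrs.getOrDefault("wrappedFilters", "").split(",")) {
            // Names are case-insensitive, so normalise before checking uniqueness.
            String key = name.trim().toLowerCase(Locale.ROOT);
            if (key.isEmpty()) {
                continue;
            }
            if (byFilter.putIfAbsent(key, new LinkedHashMap<>()) != null) {
                throw new IllegalArgumentException(
                        "wrappedFilters contains duplicate '" + name.trim() + "'");
            }
        }
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            int dot = e.getKey().indexOf(FILTER_ARG_SEPARATOR);
            if (dot < 0) {
                continue; // the factory's own args, e.g. ignoreCase or protected
            }
            String filter = e.getKey().substring(0, dot).toLowerCase(Locale.ROOT);
            Map<String, String> args = byFilter.get(filter);
            if (args != null) {
                // This sketch silently ignores args for filters that are not listed.
                args.put(e.getKey().substring(dot + 1), e.getValue());
            }
        }
        return byFilter;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("ignoreCase", "true");
        attrs.put("protected", "protectedTerms.txt");
        attrs.put("wrappedFilters", "synonymgraph-A,synonymgraph-B");
        attrs.put("synonymgraph-A.synonyms", "synonyms-1.txt");
        attrs.put("synonymgraph-B.synonyms", "synonyms-2.txt");

        groupArgs(attrs).forEach((name, filterArgs) ->
                System.out.println(name + " -> SPI '" + spiName(name) + "' with " + filterArgs));
    }
}
```

Both entries resolve to the same SPI name, synonymgraph, while keeping separate argument maps, which is the reason the suffix exists.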
See related CustomAnalyzer.Builder.whenTerm(Predicate)
- Since: 7.4.0
- SPI Name: "protectedTerm" (Note: This is case-insensitive. E.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service.)
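Case-insensitive name lookup of this kind is commonly done by normalising names to one case on both registration and lookup. The sketch below shows that idea only; SpiNameLookupSketch is a hypothetical class and does not reflect how Lucene's SPI loader is written.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry (not Lucene code) showing case-insensitive SPI-style name lookup. */
public class SpiNameLookupSketch {
    private final Map<String, String> registry = new HashMap<>();

    /** Stores the class name under a lowercased key. */
    public void register(String spiName, String className) {
        registry.put(spiName.toLowerCase(Locale.ROOT), className);
    }

    /** Looks up a class name, ignoring the case of the requested name. */
    public String lookup(String name) {
        String className = registry.get(name.toLowerCase(Locale.ROOT));
        if (className == null) {
            throw new IllegalArgumentException("No factory registered under '" + name + "'");
        }
        return className;
    }

    public static void main(String[] args) {
        SpiNameLookupSketch spi = new SpiNameLookupSketch();
        spi.register("protectedTerm",
                "org.apache.lucene.analysis.miscellaneous.ProtectedTermFilterFactory");
        // Both spellings resolve to the same entry.
        System.out.println(spi.lookup("protectedterm"));
        System.out.println(spi.lookup("PROTECTEDTERM"));
    }
}
```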
Field Summary
Fields (Modifier and Type, Field):
- static char FILTER_ARG_SEPARATOR
- static char FILTER_NAME_ID_SEPARATOR
- static String NAME
- static String PROTECTED_TERMS

Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion

Constructor Summary
Constructors:
- ProtectedTermFilterFactory(Map<String,String> args)
Method Summary
Methods (Modifier and Type, Method, Description):
- protected ConditionalTokenFilter create(TokenStream input, Function<TokenStream,TokenStream> inner): Modify the incoming TokenStream with a ConditionalTokenFilter
- void doInform(ResourceLoader loader): Initialises this component with the corresponding ResourceLoader
- CharArraySet getProtectedTerms()
- boolean isIgnoreCase()
Methods inherited from class org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
create, inform, setInnerFilters

Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters

Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.lucene.analysis.util.ResourceLoaderAware
inform
Field Detail

NAME
public static final String NAME
- See Also: Constant Field Values

PROTECTED_TERMS
public static final String PROTECTED_TERMS
- See Also: Constant Field Values

FILTER_ARG_SEPARATOR
public static final char FILTER_ARG_SEPARATOR
- See Also: Constant Field Values

FILTER_NAME_ID_SEPARATOR
public static final char FILTER_NAME_ID_SEPARATOR
- See Also: Constant Field Values
Method Detail

isIgnoreCase
public boolean isIgnoreCase()

getProtectedTerms
public CharArraySet getProtectedTerms()

create
protected ConditionalTokenFilter create(TokenStream input, Function<TokenStream,TokenStream> inner)
Description copied from class: ConditionalTokenFilterFactory
Modify the incoming TokenStream with a ConditionalTokenFilter
- Specified by: create in class ConditionalTokenFilterFactory

doInform
public void doInform(ResourceLoader loader) throws IOException
Description copied from class: ConditionalTokenFilterFactory
Initialises this component with the corresponding ResourceLoader
- Overrides: doInform in class ConditionalTokenFilterFactory
- Throws: IOException