ProtectedTermFilterFactory (Lucene 8.5.0 API)

java.lang.Object
- org.apache.lucene.analysis.util.AbstractAnalysisFactory
  - org.apache.lucene.analysis.util.TokenFilterFactory
    - org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
      - org.apache.lucene.analysis.miscellaneous.ProtectedTermFilterFactory

All Implemented Interfaces: ResourceLoaderAware

public class ProtectedTermFilterFactory
extends ConditionalTokenFilterFactory
implements ResourceLoaderAware

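Factory for a ProtectedTermFilter, a ConditionalTokenFilter that applies its wrapped filters only to terms that are not in the protected set.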

CustomAnalyzer example:

 Analyzer ana = CustomAnalyzer.builder()
   .withTokenizer("standard")
   .when("protectedterm", "ignoreCase", "true", "protected", "protectedTerms.txt")
     .addTokenFilter("truncate", "prefixLength", "4")
     .addTokenFilter("lowercase")
   .endwhen()
   .build();
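
The conditional behaviour configured above can be illustrated without Lucene on the classpath. The following is a minimal sketch, not Lucene's implementation: the class `ProtectedTermSketch`, its methods, and the sample tokens are hypothetical, and it assumes that protected terms bypass the wrapped filters while every other term passes through them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the protected-term condition; not part of Lucene.
public class ProtectedTermSketch {

    // Applies `wrapped` to every token that is NOT in the protected set;
    // protected tokens are emitted unchanged.
    public static List<String> apply(List<String> tokens, Set<String> protectedTerms,
                                     boolean ignoreCase, UnaryOperator<String> wrapped) {
        // Normalise the protected set once when matching is case-insensitive.
        Set<String> lookup = new HashSet<>();
        for (String term : protectedTerms) {
            lookup.add(ignoreCase ? term.toLowerCase(Locale.ROOT) : term);
        }
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String key = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
            out.add(lookup.contains(key) ? token : wrapped.apply(token));
        }
        return out;
    }

    // Stand-in for the wrapped chain: truncate to 4 chars, then lowercase.
    public static String truncateThenLowercase(String token) {
        String truncated = token.length() > 4 ? token.substring(0, 4) : token;
        return truncated.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Set<String> protectedTerms = new HashSet<>(Arrays.asList("Lucene"));
        List<String> tokens = Arrays.asList("LUCENE", "Searching", "Filter");
        // With ignoreCase=true, "LUCENE" matches the protected term "Lucene".
        System.out.println(apply(tokens, protectedTerms, true,
                ProtectedTermSketch::truncateThenLowercase));
    }
}
```

In this sketch the protected token keeps its original casing and length, while the other tokens are truncated and lowercased, mirroring the order of the `truncate` and `lowercase` filters in the builder call.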

Solr example, in which conditional filters are specified via the wrappedFilters parameter (a comma-separated list of case-insensitive TokenFilter SPI names) and conditional filter args are specified via filterName.argName parameters:

 <fieldType name="reverse_lower_with_exceptions" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
             wrappedFilters="truncate,lowercase" truncate.prefixLength="4" />
   </analyzer>
 </fieldType>

When using the wrappedFilters parameter, each filter name must be unique. To specify the same filter more than once, add a unique, case-insensitive '-id' suffix to each occurrence (the '-id' suffix is stripped prior to SPI lookup), e.g.:

 <fieldType name="double_synonym_with_exceptions" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
             wrappedFilters="synonymgraph-A,synonymgraph-B"
             synonymgraph-A.synonyms="synonyms-1.txt"
             synonymgraph-B.synonyms="synonyms-2.txt"/>
   </analyzer>
 </fieldType>
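
The naming rules above can be sketched in plain Java. This is an illustration under stated assumptions, not the factory's actual parsing code: the class `WrappedFiltersSketch` and its methods are hypothetical, and the separator values `'.'` and `'-'` are inferred from the `filterName.argName` and `'-id'` conventions described above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper showing how wrappedFilters names and args could be resolved.
public class WrappedFiltersSketch {
    // Assumed separator values, inferred from the conventions in the text.
    static final char FILTER_ARG_SEPARATOR = '.';
    static final char FILTER_NAME_ID_SEPARATOR = '-';

    // Strips an optional '-id' suffix to obtain the name used for SPI lookup.
    public static String spiName(String wrappedName) {
        int idx = wrappedName.indexOf(FILTER_NAME_ID_SEPARATOR);
        return idx < 0 ? wrappedName : wrappedName.substring(0, idx);
    }

    // Groups "filterName.argName" parameters under each name listed in wrappedFilters.
    // Names are compared case-insensitively and must be unique.
    public static Map<String, Map<String, String>> groupArgs(
            String wrappedFilters, Map<String, String> params) {
        Map<String, Map<String, String>> byFilter = new LinkedHashMap<>();
        for (String raw : wrappedFilters.split(",")) {
            String name = raw.trim().toLowerCase(Locale.ROOT);
            if (byFilter.putIfAbsent(name, new LinkedHashMap<>()) != null) {
                throw new IllegalArgumentException(
                        "wrappedFilters contains duplicate '" + name + "'");
            }
        }
        for (Map.Entry<String, String> e : params.entrySet()) {
            int dot = e.getKey().indexOf(FILTER_ARG_SEPARATOR);
            if (dot < 0) {
                continue; // not a per-filter argument (e.g. ignoreCase, protected)
            }
            String filter = e.getKey().substring(0, dot).toLowerCase(Locale.ROOT);
            Map<String, String> filterArgs = byFilter.get(filter);
            if (filterArgs != null) {
                filterArgs.put(e.getKey().substring(dot + 1), e.getValue());
            }
        }
        return byFilter;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ignoreCase", "true");
        params.put("protected", "protectedTerms.txt");
        params.put("synonymgraph-A.synonyms", "synonyms-1.txt");
        params.put("synonymgraph-B.synonyms", "synonyms-2.txt");

        Map<String, Map<String, String>> grouped =
                groupArgs("synonymgraph-A,synonymgraph-B", params);
        for (Map.Entry<String, Map<String, String>> e : grouped.entrySet()) {
            // Each uniquely named entry resolves to the same SPI name.
            System.out.println(spiName(e.getKey()) + " <- " + e.getValue());
        }
    }
}
```

Keying the map on the full suffixed name keeps the two synonym filters' arguments apart, while `spiName` maps both entries back to the same SPI name for lookup.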

Since: 7.4.0
SPI Name: "protectedTerm" (Note: This is case-insensitive. E.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service.)

Field Summary

Fields
Modifier and Type	Field and Description
`static char`	`FILTER_ARG_SEPARATOR`
`static char`	`FILTER_NAME_ID_SEPARATOR`
`static String`	`NAME`
`static String`	`PROTECTED_TERMS`

Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion

Constructor Summary

Constructors
Constructor and Description
`ProtectedTermFilterFactory(Map<String,String> args)`

Method Summary

Modifier and Type	Method and Description
`protected ConditionalTokenFilter`	`create(TokenStream input, Function<TokenStream,TokenStream> inner)` Modify the incoming `TokenStream` with a `ConditionalTokenFilter`
`void`	`doInform(ResourceLoader loader)` Initialises this component with the corresponding `ResourceLoader`
`CharArraySet`	`getProtectedTerms()`
`boolean`	`isIgnoreCase()`

Methods inherited from class org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
create, inform, setInnerFilters

Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters

Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.lucene.analysis.util.ResourceLoaderAware
inform

- Field Detail
  - NAME
```
public static final String NAME
```
    See Also: Constant Field Values
  - PROTECTED_TERMS
```
public static final String PROTECTED_TERMS
```
    See Also: Constant Field Values
  - FILTER_ARG_SEPARATOR
```
public static final char FILTER_ARG_SEPARATOR
```
    See Also: Constant Field Values
  - FILTER_NAME_ID_SEPARATOR
```
public static final char FILTER_NAME_ID_SEPARATOR
```
    See Also: Constant Field Values
- Constructor Detail
  - ProtectedTermFilterFactory
```
public ProtectedTermFilterFactory(Map<String,String> args)
```
- Method Detail
  - isIgnoreCase
```
public boolean isIgnoreCase()
```
  - getProtectedTerms
```
public CharArraySet getProtectedTerms()
```
  - create
```
protected ConditionalTokenFilter create(TokenStream input,
                                        Function<TokenStream,TokenStream> inner)
```
    Description copied from class: ConditionalTokenFilterFactory

    Modify the incoming TokenStream with a ConditionalTokenFilter.

    Specified by: create in class ConditionalTokenFilterFactory
  - doInform
```
public void doInform(ResourceLoader loader)
              throws IOException
```
    Description copied from class: ConditionalTokenFilterFactory

    Initialises this component with the corresponding ResourceLoader.

    Overrides: doInform in class ConditionalTokenFilterFactory

    Throws: IOException
