Class CapitalizationFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.util.AbstractAnalysisFactory
    - org.apache.lucene.analysis.util.TokenFilterFactory
      - org.apache.lucene.analysis.miscellaneous.CapitalizationFilterFactory
public class CapitalizationFilterFactory extends TokenFilterFactory
Factory for CapitalizationFilter.
The factory takes parameters:
- "onlyFirstWord" - should only the first word be capitalized, or all of the words?
- "keep" - a keep word list. The words that should be kept, separated by whitespace.
- "keepIgnoreCase" - true or false. If true, the keep list will be considered case-insensitive.
- "forceFirstLetter" - force the first letter to be capitalized even if it is in the keep list.
- "okPrefix" - do not change word capitalization if a word begins with something in this list. For example, if "McK" is on the okPrefix list, the word "McKinley" should not be changed to "Mckinley".
- "minWordLength" - how long the word needs to be to get capitalization applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
- "maxWordCount" - if the token contains more than maxWordCount words, the capitalization is assumed to be correct.
<fieldType name="text_cptlztn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="true" keep="java solr lucene" keepIgnoreCase="false" okPrefix="McK McD McA"/>
  </analyzer>
</fieldType>
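The parameter descriptions above can be read as a per-word decision. The following is a minimal sketch of that reading in plain Java, not the Lucene implementation: the class `CapitalizationSketch`, the helper `capitalizeWord`, and the order in which the rules are checked are all assumptions made for illustration, and only single-word tokens are modeled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative model of the documented parameters; NOT the Lucene implementation. */
final class CapitalizationSketch {

    /**
     * Hypothetical helper: applies the "keep", "forceFirstLetter", "minWordLength"
     * and "okPrefix" rules to a single word, in an assumed order.
     */
    static String capitalizeWord(String word, Set<String> keep, boolean forceFirstLetter,
                                 List<String> okPrefix, int minWordLength) {
        if (word.isEmpty()) {
            return word;
        }
        // Words on the keep list are left alone, unless forceFirstLetter is set.
        if (keep.contains(word)) {
            return forceFirstLetter ? upperFirst(word) : word;
        }
        // Words shorter than minWordLength are not capitalized.
        if (word.length() < minWordLength) {
            return word;
        }
        // Words starting with an okPrefix entry keep their existing capitalization.
        for (String prefix : okPrefix) {
            if (word.startsWith(prefix)) {
                return word;
            }
        }
        // Default: upper-case the first letter, lower-case the rest.
        return Character.toUpperCase(word.charAt(0))
                + word.substring(1).toLowerCase(Locale.ROOT);
    }

    private static String upperFirst(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        Set<String> keep = new HashSet<>(Arrays.asList("java", "solr", "lucene"));
        List<String> okPrefix = Arrays.asList("McK", "McD", "McA");
        System.out.println(capitalizeWord("and", keep, false, okPrefix, 3));      // And
        System.out.println(capitalizeWord("or", keep, false, okPrefix, 3));       // or
        System.out.println(capitalizeWord("McKinley", keep, false, okPrefix, 3)); // McKinley
        System.out.println(capitalizeWord("java", keep, true, okPrefix, 3));      // Java
    }
}
```

The keep list and okPrefix values mirror the configuration example; the minWordLength of 3 is taken from the "and"/"or" example in the parameter list.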
- Since:
  solr 1.3
- SPI Name (Note: This is case-insensitive. e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):
  "capitalization"
Field Summary
Fields
Modifier and Type   Field                Description
static String       FORCE_FIRST_LETTER
static String       KEEP
static String       KEEP_IGNORE_CASE
static String       MAX_TOKEN_LENGTH
static String       MAX_WORD_COUNT
static String       MIN_WORD_LENGTH
static String       NAME                 SPI name
static String       OK_PREFIX
static String       ONLY_FIRST_WORD
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                            Description
CapitalizationFilterFactory(Map<String,String> args)   Creates a new CapitalizationFilterFactory
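As a sketch of how the constructor argument might be assembled, the block below builds a `Map<String,String>` with the same parameter names and values as the configuration example. The class `CapitalizationArgsSketch` and the method `exampleArgs` are hypothetical names. The Lucene calls appear only in comments, because they need the Lucene analysis jars on the classpath, and the `tokenizer` variable there is assumed. A mutable `HashMap` is used on the assumption that the factory may consume entries as it reads them.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper class; builds the argument map only, without touching Lucene. */
final class CapitalizationArgsSketch {

    /** Returns a mutable map mirroring the attributes of the configuration example. */
    static Map<String, String> exampleArgs() {
        Map<String, String> args = new HashMap<>();
        args.put("onlyFirstWord", "true");
        args.put("keep", "java solr lucene");
        args.put("keepIgnoreCase", "false");
        args.put("okPrefix", "McK McD McA");
        return args;
    }

    public static void main(String[] unused) {
        Map<String, String> args = exampleArgs();
        // With Lucene on the classpath, the map would be handed to the factory:
        //   CapitalizationFilterFactory factory = new CapitalizationFilterFactory(args);
        //   CapitalizationFilter filter = factory.create(tokenizer); // tokenizer: an assumed TokenStream
        System.out.println(args.size() + " parameters prepared");
    }
}
```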
Method Summary
Modifier and Type      Method                      Description
CapitalizationFilter   create(TokenStream input)   Transform the specified input TokenStream
Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
-
KEEP
public static final String KEEP
- See Also:
- Constant Field Values
-
KEEP_IGNORE_CASE
public static final String KEEP_IGNORE_CASE
- See Also:
- Constant Field Values
-
OK_PREFIX
public static final String OK_PREFIX
- See Also:
- Constant Field Values
-
MIN_WORD_LENGTH
public static final String MIN_WORD_LENGTH
- See Also:
- Constant Field Values
-
MAX_WORD_COUNT
public static final String MAX_WORD_COUNT
- See Also:
- Constant Field Values
-
MAX_TOKEN_LENGTH
public static final String MAX_TOKEN_LENGTH
- See Also:
- Constant Field Values
-
ONLY_FIRST_WORD
public static final String ONLY_FIRST_WORD
- See Also:
- Constant Field Values
-
FORCE_FIRST_LETTER
public static final String FORCE_FIRST_LETTER
- See Also:
- Constant Field Values
-
-
Method Detail
-
create
public CapitalizationFilter create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
- create in class TokenFilterFactory