Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.util.AbstractAnalysisFactory
    - org.apache.lucene.analysis.util.TokenizerFactory
      - org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
public class WhitespaceTokenizerFactory extends TokenizerFactory
Factory for WhitespaceTokenizer.
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
  </analyzer>
</fieldType>
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer.
- maxTokenLen: maximum token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It is rarely necessary to change this; if it is not set, CharTokenizer::DEFAULT_MAX_TOKEN_LEN is used.
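This page does not spell out what happens to input longer than maxTokenLen. The following JDK-only sketch emulates one plausible reading, under the assumption that an over-long run of non-whitespace characters is cut into consecutive chunks of at most maxTokenLen characters rather than discarded. The class MaxTokenLenSketch and its tokenize method are hypothetical names and not part of Lucene; the range check simply mirrors the bounds stated in the option description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only emulation; this is not Lucene code.
final class MaxTokenLenSketch {

    // Upper bound quoted in the option description (1024*1024).
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;

    public static List<String> tokenize(String text, int maxTokenLen) {
        // Mirror the documented range: greater than 0 and less than the limit.
        if (maxTokenLen <= 0 || maxTokenLen >= MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException("maxTokenLen out of range: " + maxTokenLen);
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                flush(current, tokens);          // whitespace ends the current token
            } else {
                current.appendCodePoint(cp);
                // Assumption: a token that reaches the limit is emitted and
                // the remaining characters start a new token.
                if (current.length() >= maxTokenLen) {
                    flush(current, tokens);
                }
            }
        }
        flush(current, tokens);                  // emit any trailing token
        return tokens;
    }

    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Tokenizer test", 5));
    }
}
```

Under that assumption, "Tokenizer test" with a limit of 5 comes out as Token, izer, test. Whether a given Lucene release behaves the same way is best checked against the CharTokenizer documentation for that release.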
- Since:
- 3.1
- SPI Name (Note: This is case-insensitive. e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):
- "whitespace"
Field Summary
Fields
Modifier and Type    Field           Description
static String        NAME            SPI name
static String        RULE_JAVA
static String        RULE_UNICODE
-
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor                                            Description
WhitespaceTokenizerFactory(Map<String,String> args)    Creates a new WhitespaceTokenizerFactory
-
Method Summary
Modifier and Type    Method                              Description
Tokenizer            create(AttributeFactory factory)    Creates a TokenStream of the specified input using the given AttributeFactory
Methods inherited from class org.apache.lucene.analysis.util.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
-
-
-
Field Detail
-
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
-
RULE_JAVA
public static final String RULE_JAVA
- See Also:
- Constant Field Values
-
RULE_UNICODE
public static final String RULE_UNICODE
- See Also:
- Constant Field Values
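The practical difference between the two rules is which characters count as whitespace. As a rough JDK-only illustration (not Lucene code), the sketch below assumes the "java" rule follows Character.isWhitespace, which does not treat non-breaking spaces such as U+00A0 as whitespace, and approximates the "unicode" rule with the JDK regex property \p{IsWhite_Space}. WhitespaceRuleSketch, separatorFor, and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.regex.Pattern;

// Hypothetical JDK-only comparison of the two whitespace definitions.
final class WhitespaceRuleSketch {

    // Stand-in for the Unicode White_Space property (an approximation
    // of whatever table the real tokenizer uses).
    private static final Pattern UNICODE_WHITE_SPACE = Pattern.compile("\\p{IsWhite_Space}");

    static IntPredicate separatorFor(String rule) {
        switch (rule) {
            case "java":
                return cp -> Character.isWhitespace(cp);
            case "unicode":
                return cp -> UNICODE_WHITE_SPACE
                        .matcher(new String(Character.toChars(cp)))
                        .matches();
            default:
                throw new IllegalArgumentException("rule must be \"java\" or \"unicode\": " + rule);
        }
    }

    public static List<String> split(String text, String rule) {
        IntPredicate isSeparator = separatorFor(rule);
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isSeparator.test(cp)) {
                // A separator closes the token collected so far, if any.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "foo\u00A0bar baz";   // U+00A0 is a non-breaking space
        System.out.println(split(text, "java").size());
        System.out.println(split(text, "unicode").size());
    }
}
```

With this input the "java" branch yields two tokens, because the non-breaking space stays inside the first one, while the "unicode" branch yields three.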
-
-
Method Detail
-
create
public Tokenizer create(AttributeFactory factory)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input using the given AttributeFactory
- Specified by:
- create in class TokenizerFactory
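A minimal usage sketch, assuming a Lucene release matching this page (one where TokenizerFactory lives in org.apache.lucene.analysis.util) is on the classpath. It builds the factory from an argument map, calls create with the default token attribute factory, and collects the emitted terms. The class name WhitespaceFactoryUsage and the helper tokenize are hypothetical. The argument map should be mutable, since analysis factories remove entries as they parse them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizerFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class; requires the Lucene analysis jars.
final class WhitespaceFactoryUsage {

    public static List<String> tokenize(String text, String rule, int maxTokenLen) {
        // Same options as the XML example, passed as strings.
        Map<String, String> args = new HashMap<>();
        args.put("rule", rule);
        args.put("maxTokenLen", Integer.toString(maxTokenLen));
        WhitespaceTokenizerFactory factory = new WhitespaceTokenizerFactory(args);

        List<String> terms = new ArrayList<>();
        try (Tokenizer tokenizer = factory.create(TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY)) {
            CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
            tokenizer.setReader(new StringReader(text));
            tokenizer.reset();                    // required before the first incrementToken()
            while (tokenizer.incrementToken()) {
                terms.add(term.toString());
            }
            tokenizer.end();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("quick brown fox", WhitespaceTokenizerFactory.RULE_JAVA, 256));
    }
}
```

Passing the rule through the RULE_JAVA or RULE_UNICODE constants rather than string literals keeps the call site aligned with the values the factory accepts.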
-
-