Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenizerFactory
org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
Factory for WhitespaceTokenizer.
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
  </analyzer>
</fieldType>

Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; must be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It rarely needs to be changed; if omitted, the default is CharTokenizer::DEFAULT_MAX_TOKEN_LEN
- Since: 3.1
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service): "whitespace"
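The difference between the two rule values, and the effect of maxTokenLen, can be illustrated with a small standalone sketch. It does not use Lucene: the class WhitespaceRuleSketch and its tokenize helper are hypothetical names introduced only for illustration. The "java" rule is modelled with Character.isWhitespace, and the "unicode" rule is approximated with the Unicode White_Space property from java.util.regex, which is assumed to be close to what UnicodeWhitespaceTokenizer uses. The sketch also assumes that a token reaching maxTokenLen is cut and the remainder continues as a new token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Lucene.
final class WhitespaceRuleSketch {
    // Mirrors the documented upper bound MAX_TOKEN_LENGTH_LIMIT (1024*1024).
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;

    // "java" rule: whitespace as defined by Character.isWhitespace(int),
    // which excludes non-breaking spaces such as U+00A0.
    static final IntPredicate JAVA_RULE = Character::isWhitespace;

    // "unicode" rule, approximated (assumption) by the Unicode White_Space
    // binary property, which includes U+00A0.
    private static final Pattern UNICODE_WS = Pattern.compile("\\p{IsWhite_Space}");
    static final IntPredicate UNICODE_RULE =
        cp -> UNICODE_WS.matcher(new String(Character.toChars(cp))).matches();

    // Splits text at code points accepted by isWhitespace. A token that
    // reaches maxTokenLen chars is emitted and a new token is started
    // (assumed splitting behavior, labeled as such in the lead-in).
    static List<String> tokenize(String text, IntPredicate isWhitespace, int maxTokenLen) {
        if (maxTokenLen <= 0 || maxTokenLen > MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException(
                "maxTokenLen must be greater than 0 and at most "
                    + MAX_TOKEN_LENGTH_LIMIT + ", passed: " + maxTokenLen);
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isWhitespace.test(cp)) {
                flush(tokens, current);
            } else {
                current.appendCodePoint(cp);
                if (current.length() >= maxTokenLen) {
                    flush(tokens, current);
                }
            }
        }
        flush(tokens, current);
        return tokens;
    }

    // Emits the pending token, if any, and resets the buffer.
    private static void flush(List<String> tokens, StringBuilder current) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "a\u00A0b c"; // contains a non-breaking space
        System.out.println(tokenize(text, JAVA_RULE, 255).size());    // 2
        System.out.println(tokenize(text, UNICODE_RULE, 255).size()); // 3
    }
}
```

In this sketch the non-breaking space stays inside a token under the "java" rule but acts as a separator under the "unicode" rule, which is the kind of input where the choice of rule is likely to matter.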
Field Summary
Modifier and Type      Field          Description
static final String    NAME           SPI name
static final String    RULE_JAVA
static final String    RULE_UNICODE
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructor                                            Description
WhitespaceTokenizerFactory()                           Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(Map<String,String> args)    Creates a new WhitespaceTokenizerFactory
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details
NAME
public static final String NAME
SPI name
RULE_JAVA
public static final String RULE_JAVA
RULE_UNICODE
public static final String RULE_UNICODE
Constructor Details
WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory(Map<String,String> args)
Creates a new WhitespaceTokenizerFactory
WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory()
Default ctor for compatibility with SPI
Method Details
create
Specified by: create in class TokenizerFactory