Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenizerFactory
      - org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
public class WhitespaceTokenizerFactory extends TokenizerFactory
Factory for WhitespaceTokenizer.
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
  </analyzer>
</fieldType>
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It rarely needs changing; when omitted, it defaults to CharTokenizer::DEFAULT_MAX_TOKEN_LEN
- Since:
- 3.1
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "whitespace"
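The two values of the rule option differ in which code points count as whitespace. The following is a minimal, Lucene-free sketch of that difference. It assumes that "java" follows Character.isWhitespace(int) and that "unicode" follows the Unicode White_Space property; the class name WhitespaceRuleSketch and the hard-coded code point list are illustrative and are not part of this API.

```java
final class WhitespaceRuleSketch {

    // Assumption: the "java" rule treats a code point as whitespace
    // exactly when Character.isWhitespace(int) does.
    static boolean isJavaWhitespace(int cp) {
        return Character.isWhitespace(cp);
    }

    // Assumption: the "unicode" rule follows the Unicode White_Space
    // property. The code points are listed by hand for illustration.
    static boolean isUnicodeWhitespace(int cp) {
        return (cp >= 0x0009 && cp <= 0x000D)
            || cp == 0x0020 || cp == 0x0085 || cp == 0x00A0
            || cp == 0x1680
            || (cp >= 0x2000 && cp <= 0x200A)
            || cp == 0x2028 || cp == 0x2029
            || cp == 0x202F || cp == 0x205F || cp == 0x3000;
    }

    public static void main(String[] args) {
        // Space, tab, no-break space, next line, narrow no-break space,
        // ideographic space, and the unit separator control character.
        int[] samples = {0x0020, 0x0009, 0x00A0, 0x0085, 0x202F, 0x3000, 0x001F};
        for (int cp : samples) {
            System.out.printf("U+%04X java=%b unicode=%b%n",
                cp, isJavaWhitespace(cp), isUnicodeWhitespace(cp));
        }
    }
}
```

Under these assumptions, text that contains no-break spaces (U+00A0) is the most common case where the choice of rule changes the token boundaries.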
Field Summary
Fields
Modifier and Type    Field           Description
static String        NAME            SPI name
static String        RULE_JAVA
static String        RULE_UNICODE
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor                                            Description
WhitespaceTokenizerFactory()                           Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(Map<String,String> args)    Creates a new WhitespaceTokenizerFactory
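As a rough illustration of how a Map<String,String> of arguments such as rule and maxTokenLen might be read and validated, here is a self-contained sketch. It is not the Lucene implementation. The class WhitespaceArgsSketch, the default rule of "java", the numeric default of 255, and the rejection of leftover keys are all assumptions made for the example; the upper bound is treated as exclusive only because the option description says "less than".

```java
import java.util.HashMap;
import java.util.Map;

final class WhitespaceArgsSketch {
    // Assumed values, taken from the option names in the description.
    static final String RULE_JAVA = "java";
    static final String RULE_UNICODE = "unicode";
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;
    // Hypothetical default; this page does not state the numeric value.
    static final int ASSUMED_DEFAULT_MAX_TOKEN_LEN = 255;

    final String rule;
    final int maxTokenLen;

    WhitespaceArgsSketch(Map<String, String> args) {
        // Work on a copy so the caller's map is left untouched.
        Map<String, String> remaining = new HashMap<>(args);

        String ruleArg = remaining.remove("rule");
        rule = (ruleArg == null) ? RULE_JAVA : ruleArg; // assumed default
        if (!rule.equals(RULE_JAVA) && !rule.equals(RULE_UNICODE)) {
            throw new IllegalArgumentException(
                "rule must be \"java\" or \"unicode\", got: " + rule);
        }

        String lenArg = remaining.remove("maxTokenLen");
        maxTokenLen = (lenArg == null)
            ? ASSUMED_DEFAULT_MAX_TOKEN_LEN
            : Integer.parseInt(lenArg);
        // Bounds follow the wording "greater than 0 and less than the limit".
        if (maxTokenLen <= 0 || maxTokenLen >= MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException(
                "maxTokenLen out of range: " + maxTokenLen);
        }

        // Assumption: unrecognized keys are an error rather than ignored.
        if (!remaining.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + remaining);
        }
    }
}
```

Validating in the constructor means a misconfigured field type fails when the factory is built rather than when the first document is analyzed.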
-
Method Summary
All Methods    Instance Methods    Concrete Methods
Modifier and Type    Method                              Description
Tokenizer            create(AttributeFactory factory)
-
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
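The SPI name is documented as case-insensitive on lookup. One simple way to get that behavior is to normalize names to lower case when registering and when looking up. The sketch below shows the idea only; SpiNameRegistrySketch is a hypothetical class and does not reflect how the Lucene service loader is implemented.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SpiNameRegistrySketch {
    // Keys are stored lower-cased so lookups can ignore case.
    private final Map<String, Class<?>> byName = new HashMap<>();

    void register(String name, Class<?> type) {
        // Locale.ROOT avoids locale-specific case mappings.
        byName.put(name.toLowerCase(Locale.ROOT), type);
    }

    Class<?> lookup(String name) {
        Class<?> type = byName.get(name.toLowerCase(Locale.ROOT));
        if (type == null) {
            throw new IllegalArgumentException(
                "No service registered under name '" + name + "'");
        }
        return type;
    }
}
```

With this scheme, a factory registered as "whitespace" is also found under "Whitespace" or "WHITESPACE".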
-
RULE_JAVA
public static final String RULE_JAVA
- See Also:
- Constant Field Values
-
RULE_UNICODE
public static final String RULE_UNICODE
- See Also:
- Constant Field Values
-
-
Method Detail
-
create
public Tokenizer create(AttributeFactory factory)
- Specified by:
create in class TokenizerFactory
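To make the effect of the whitespace rule and maxTokenLen on the produced tokens concrete, the following Lucene-free sketch splits text at separator code points and cuts any run that reaches the maximum length. It rests on two assumptions not stated on this page: that over-long runs are split into several tokens rather than dropped, and that length is counted in code points. ChunkingTokenizerSketch and its tokenize method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class ChunkingTokenizerSketch {

    // Splits text at separator code points; a run of non-separators is
    // additionally cut every maxTokenLen code points (assumed behavior).
    static List<String> tokenize(String text, IntPredicate isSeparator, int maxTokenLen) {
        if (maxTokenLen <= 0) {
            throw new IllegalArgumentException("maxTokenLen must be greater than 0");
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int count = 0; // code points in the current token
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isSeparator.test(cp)) {
                flush(current, tokens);
                count = 0;
            } else {
                current.appendCodePoint(cp);
                count++;
                if (count >= maxTokenLen) {
                    flush(current, tokens);
                    count = 0;
                }
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending token, if any, and clears the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }
}
```

Passing Character::isWhitespace as the predicate approximates the "java" rule; a predicate based on the Unicode White_Space property would approximate the "unicode" rule.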
-
-