Class WikipediaTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenizerFactory
      - org.apache.lucene.analysis.wikipedia.WikipediaTokenizerFactory
public class WikipediaTokenizerFactory extends TokenizerFactory
Factory for WikipediaTokenizer.
<fieldType name="text_wiki" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WikipediaTokenizerFactory"/>
  </analyzer>
</fieldType>
- Since: 3.1
- SPI Name: "wikipedia" (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service)
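The case-insensitive SPI name rule above can be pictured with a small standalone model. This is only a sketch: `SpiNameLookupSketch`, `register`, and `lookup` are hypothetical names, the code does not use any Lucene classes, and normalizing with `Locale.ROOT` is an assumption about how such a registry would typically be built, not a statement of how Lucene's loader is implemented.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Standalone model of a case-insensitive name registry; not Lucene's loader. */
public class SpiNameLookupSketch {
    // Keys are stored lower-cased so that lookups ignore case.
    private final Map<String, String> classByName = new HashMap<>();

    /** Registers an implementation class name under an SPI name. */
    public void register(String spiName, String className) {
        classByName.put(normalize(spiName), className);
    }

    /** Returns the registered class name, or null when the name is unknown. */
    public String lookup(String spiName) {
        return classByName.get(normalize(spiName));
    }

    // Locale.ROOT avoids locale-specific case rules (assumed normalization).
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        SpiNameLookupSketch registry = new SpiNameLookupSketch();
        registry.register("wikipedia",
            "org.apache.lucene.analysis.wikipedia.WikipediaTokenizerFactory");
        // Both spellings resolve to the same registered entry.
        System.out.println(registry.lookup("Wikipedia"));
        System.out.println(registry.lookup("WIKIPEDIA"));
    }
}
```

With Lucene on the classpath, the real lookup goes through the inherited static `forName` method listed under TokenizerFactory on this page, passing the SPI name and the factory arguments.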
Field Summary
Fields
Modifier and Type        Field                Description
static String            NAME                 SPI name
static String            TOKEN_OUTPUT
protected int            tokenOutput
static String            UNTOKENIZED_TYPES
protected Set<String>    untokenizedTypes
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor                                           Description
WikipediaTokenizerFactory()                           Default ctor for compatibility with SPI
WikipediaTokenizerFactory(Map<String,String> args)    Creates a new WikipediaTokenizerFactory
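As a rough illustration of how a `Map<String,String>` constructor can fill fields such as `tokenOutput` and `untokenizedTypes`, the standalone sketch below consumes two keys from an argument map. Everything in it is an assumption made for illustration: the class `FactoryArgsSketch` is hypothetical, the key strings, the default of 0, the comma/whitespace separator, and the rejection of leftover keys are not taken from this page, and no Lucene classes are used. The actual key values are listed on the Constant Field Values page.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Standalone model of a Map-based factory constructor; not the Lucene class. */
public class FactoryArgsSketch {
    // Assumed key strings; the real values are on the Constant Field Values page.
    public static final String TOKEN_OUTPUT = "tokenOutput";
    public static final String UNTOKENIZED_TYPES = "untokenizedTypes";

    public final int tokenOutput;
    public final Set<String> untokenizedTypes;

    public FactoryArgsSketch(Map<String, String> args) {
        // Work on a copy so the caller's map is not modified.
        Map<String, String> remaining = new HashMap<>(args);

        // Assumed default of 0 when the key is absent.
        String output = remaining.remove(TOKEN_OUTPUT);
        tokenOutput = (output == null) ? 0 : Integer.parseInt(output.trim());

        // Assumed format: type names separated by commas and/or whitespace.
        String types = remaining.remove(UNTOKENIZED_TYPES);
        if (types == null || types.trim().isEmpty()) {
            untokenizedTypes = Collections.emptySet();
        } else {
            Set<String> parsed = new LinkedHashSet<>();
            for (String t : types.trim().split("[,\\s]+")) {
                if (!t.isEmpty()) {
                    parsed.add(t);
                }
            }
            untokenizedTypes = Collections.unmodifiableSet(parsed);
        }

        // Leftover keys are treated as configuration mistakes in this sketch.
        if (!remaining.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + remaining);
        }
    }
}
```

Consuming each key with `remove` and then checking for leftovers makes a misspelled parameter fail fast instead of being silently ignored.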
-
Method Summary
Modifier and Type     Method                             Description
WikipediaTokenizer    create(AttributeFactory factory)
-
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
-
TOKEN_OUTPUT
public static final String TOKEN_OUTPUT
- See Also:
- Constant Field Values
-
UNTOKENIZED_TYPES
public static final String UNTOKENIZED_TYPES
- See Also:
- Constant Field Values
-
tokenOutput
protected final int tokenOutput
-
-
Method Detail
-
create
public WikipediaTokenizer create(AttributeFactory factory)
- Specified by: create in class TokenizerFactory