Class WikipediaTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.util.AbstractAnalysisFactory
    - org.apache.lucene.analysis.util.TokenizerFactory
      - org.apache.lucene.analysis.wikipedia.WikipediaTokenizerFactory
public class WikipediaTokenizerFactory extends TokenizerFactory
Factory for WikipediaTokenizer.
<fieldType name="text_wiki" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WikipediaTokenizerFactory"/>
  </analyzer>
</fieldType>
- Since:
- 3.1
- SPI Name (Note: This is case-insensitive. e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):
- "wikipedia"
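The SPI name allows the factory to be obtained without referencing the class directly. The following is a minimal sketch, assuming Lucene 8.x with `lucene-core` and `lucene-analyzers-common` on the classpath; the class `WikipediaSpiLookup` and its `lookup` helper are illustrative names and not part of Lucene.

```java
import java.util.HashMap;
import org.apache.lucene.analysis.util.TokenizerFactory;

public class WikipediaSpiLookup {
    // Hypothetical helper: resolves a tokenizer factory by its SPI name.
    static TokenizerFactory lookup(String spiName) {
        // The factory removes the entries it understands from the args map,
        // so a fresh mutable map is passed on every call.
        return TokenizerFactory.forName(spiName, new HashMap<String, String>());
    }

    public static void main(String[] args) {
        // Both spellings should resolve to the same factory class,
        // because the lookup is case-insensitive.
        System.out.println(lookup("wikipedia").getClass().getName());
        System.out.println(lookup("Wikipedia").getClass().getName());
    }
}
```

The lookup relies on the service registration shipped in the analyzers jar, so it only works when that jar (not just the compiled classes) is on the classpath.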
Field Summary
Fields (Modifier and Type, Field, Description):
- static String NAME: SPI name
- static String TOKEN_OUTPUT
- protected int tokenOutput
- static String UNTOKENIZED_TYPES
- protected Set<String> untokenizedTypes
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors (Constructor, Description):
- WikipediaTokenizerFactory(Map<String,String> args): Creates a new WikipediaTokenizerFactory
Method Summary
Methods (Modifier and Type, Method, Description):
- WikipediaTokenizer create(AttributeFactory factory): Creates a TokenStream of the specified input using the given AttributeFactory
Methods inherited from class org.apache.lucene.analysis.util.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
TOKEN_OUTPUT
public static final String TOKEN_OUTPUT
- See Also:
- Constant Field Values
UNTOKENIZED_TYPES
public static final String UNTOKENIZED_TYPES
- See Also:
- Constant Field Values
tokenOutput
protected final int tokenOutput
Method Detail
create
public WikipediaTokenizer create(AttributeFactory factory)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input using the given AttributeFactory
- Specified by:
- create in class TokenizerFactory
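To illustrate how the constructor and create(AttributeFactory) fit together, here is a minimal sketch, assuming Lucene 8.x with `lucene-core` and `lucene-analyzers-common` on the classpath. The class `WikipediaFactorySketch` and its `tokenize` helper are illustrative names, not part of Lucene, and the factory is built with an empty argument map so that its defaults apply.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.wikipedia.WikipediaTokenizerFactory;
import org.apache.lucene.util.AttributeFactory;

public class WikipediaFactorySketch {
    // Hypothetical helper: runs the tokenizer over text and collects the terms.
    static List<String> tokenize(String text) {
        // The constructor consumes the entries it recognizes, so the map must be mutable.
        Map<String, String> args = new HashMap<>();
        WikipediaTokenizerFactory factory = new WikipediaTokenizerFactory(args);

        List<String> terms = new ArrayList<>();
        try (Tokenizer tokenizer = factory.create(AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY)) {
            tokenizer.setReader(new StringReader(text));
            CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
            tokenizer.reset();                 // required before the first incrementToken()
            while (tokenizer.incrementToken()) {
                terms.add(term.toString());
            }
            tokenizer.end();                   // records the final offset state
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello world"));
    }
}
```

The reset(), incrementToken(), end(), close() sequence follows the usual TokenStream consumer workflow; skipping reset() typically results in an exception at the first incrementToken() call.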