NGramTokenizerFactory (Solr 3.6.0 API)

java.lang.Object
- org.apache.solr.analysis.BaseTokenizerFactory
- - org.apache.solr.analysis.NGramTokenizerFactory

All Implemented Interfaces: TokenizerFactory

public class NGramTokenizerFactory
extends BaseTokenizerFactory

Factory for NGramTokenizer.

 <fieldType name="text_ngrm" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.NGramTokenizerFactory" minGramSize="1" maxGramSize="2"/>
   </analyzer>
 </fieldType>
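
To make the effect of `minGramSize="1"` and `maxGramSize="2"` concrete, here is a minimal sketch in plain Java. It does not call Solr or Lucene: `NGramSketch` and `ngrams` are hypothetical names invented for this illustration, and the order in which the real tokenizer emits its tokens may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Solr or Lucene. */
class NGramSketch {

    /** Returns every substring of text whose length lies between minGram and maxGram, inclusive. */
    static List<String> ngrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        // Shorter grams first, then longer ones; this ordering is a choice made for the sketch.
        for (int size = minGram; size <= maxGram; size++) {
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Mirrors minGramSize="1" maxGramSize="2" from the fieldType example.
        System.out.println(ngrams("solr", 1, 2)); // prints [s, o, l, r, so, ol, lr]
    }
}
```

With these settings a four-character input yields four 1-grams and three 2-grams, so widening the gram range quickly increases the number of indexed terms.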

Version: $Id$

Field Summary

Fields
Modifier and Type	Field and Description
`protected Map<String,String>`	`args` The init args
`protected Version`	`luceneMatchVersion` the luceneVersion arg

Fields inherited from class org.apache.solr.analysis.BaseTokenizerFactory
log

Constructor Summary

Constructors
Constructor and Description
`NGramTokenizerFactory()`

Method Summary

Methods
Modifier and Type	Method and Description
`protected void`	`assureMatchVersion()` This method can be called in the `TokenizerFactory.create(java.io.Reader)` or `TokenFilterFactory.create(org.apache.lucene.analysis.TokenStream)` methods to inform the user that this factory requires a `luceneMatchVersion`.
`NGramTokenizer`	`create(Reader input)` Creates the `TokenStream` of n-grams from the given `Reader`.
`Map<String,String>`	`getArgs()`
`protected boolean`	`getBoolean(String name, boolean defaultVal)`
`protected boolean`	`getBoolean(String name, boolean defaultVal, boolean useDefault)`
`protected int`	`getInt(String name)`
`protected int`	`getInt(String name, int defaultVal)`
`protected int`	`getInt(String name, int defaultVal, boolean useDefault)`
`protected CharArraySet`	`getSnowballWordSet(ResourceLoader loader, String wordFiles, boolean ignoreCase)` Same as `getWordSet(ResourceLoader, String, boolean)`, except the input is in Snowball format.
`protected CharArraySet`	`getWordSet(ResourceLoader loader, String wordFiles, boolean ignoreCase)`
`void`	`init(Map<String,String> args)` Initializes the n-gram min and max sizes and the side from which one should start tokenizing.
`protected void`	`warnDeprecated(String message)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.solr.analysis.TokenizerFactory
getArgs

Field Detail

args
```
protected Map<String,String> args
```
The init args

luceneMatchVersion
```
protected Version luceneMatchVersion
```
the luceneVersion arg

Constructor Detail
NGramTokenizerFactory
```
public NGramTokenizerFactory()
```

Method Detail

init
```
public void init(Map<String,String> args)
```
Initializes the n-gram min and max sizes and the side from which one should start tokenizing.

Specified by: `init` in interface `TokenizerFactory`
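
The following sketch shows one way an `init` step could read gram sizes from the init args map. It is not the Solr source: `GramSizeArgsSketch`, `intArg`, and the two default constants are hypothetical, and the defaults are simply assumed to match the configuration example. That example sets only `minGramSize` and `maxGramSize`, so the sketch reads just those two arguments.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a factory's init step; not the Solr implementation. */
class GramSizeArgsSketch {
    // Assumed defaults, chosen to match the fieldType example; the real defaults may differ.
    static final int ASSUMED_DEFAULT_MIN = 1;
    static final int ASSUMED_DEFAULT_MAX = 2;

    int minGramSize;
    int maxGramSize;

    /** Reads both gram sizes from the init args, falling back to the assumed defaults. */
    void init(Map<String, String> args) {
        minGramSize = intArg(args, "minGramSize", ASSUMED_DEFAULT_MIN);
        maxGramSize = intArg(args, "maxGramSize", ASSUMED_DEFAULT_MAX);
    }

    /** Parses an integer argument, or returns defaultVal when the key is absent. */
    static int intArg(Map<String, String> args, String name, int defaultVal) {
        String raw = args.get(name);
        return raw == null ? defaultVal : Integer.parseInt(raw);
    }

    public static void main(String[] unused) {
        GramSizeArgsSketch factory = new GramSizeArgsSketch();
        Map<String, String> args = new HashMap<>();
        args.put("minGramSize", "1");
        args.put("maxGramSize", "2");
        factory.init(args);
        System.out.println(factory.minGramSize + ".." + factory.maxGramSize);
    }
}
```

Attribute values arrive as strings because they come from the XML configuration, which is why the sketch parses them rather than expecting integers.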

create
```
public NGramTokenizer create(Reader input)
```
Creates the TokenStream of n-grams from the given Reader.

getArgs
```
public Map<String,String> getArgs()
```

assureMatchVersion
```
protected final void assureMatchVersion()
```
This method can be called in the TokenizerFactory.create(java.io.Reader) or TokenFilterFactory.create(org.apache.lucene.analysis.TokenStream) methods to inform the user that this factory requires a luceneMatchVersion.

warnDeprecated

`protected final void warnDeprecated(String message)`

getInt
```
protected int getInt(String name)
```

getInt

`protected int getInt(String name, int defaultVal)`

getInt

`protected int getInt(String name, int defaultVal, boolean useDefault)`

getBoolean

`protected boolean getBoolean(String name, boolean defaultVal)`

getBoolean

`protected boolean getBoolean(String name, boolean defaultVal, boolean useDefault)`

getWordSet

`protected CharArraySet getWordSet(ResourceLoader loader, String wordFiles, boolean ignoreCase) throws IOException`

Throws: IOException

getSnowballWordSet

`protected CharArraySet getSnowballWordSet(ResourceLoader loader, String wordFiles, boolean ignoreCase) throws IOException`

Same as `getWordSet(ResourceLoader, String, boolean)`, except the input is in Snowball format.

Throws: IOException
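
As a rough illustration of what "Snowball format" means for a word file, the sketch below parses lines under two assumptions: a vertical bar (`|`) starts a comment that runs to the end of the line, and a line may hold several whitespace-separated words. `SnowballWordListSketch` and `parse` are hypothetical names. The sketch returns a plain `List<String>` rather than a `CharArraySet`, and it ignores the `ignoreCase` flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for illustration only; not the Solr or Lucene loader. */
class SnowballWordListSketch {

    /** Collects the words from Snowball-style lines, dropping '|' comments. */
    static List<String> parse(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            // Assumption: everything from the first '|' onward is a comment.
            int comment = line.indexOf('|');
            String content = comment >= 0 ? line.substring(0, comment) : line;
            // Assumption: several words may share one line, separated by whitespace.
            for (String word : content.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add(" | a stop-word list");
        lines.add("the and   | articles and conjunctions");
        lines.add("");
        lines.add("of");
        System.out.println(parse(lines)); // prints [the, and, of]
    }
}
```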
