org.apache.lucene.analysis.shingle
Class ShingleAnalyzerWrapper
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.AnalyzerWrapper
org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
- All Implemented Interfaces:
- Closeable
public final class ShingleAnalyzerWrapper
- extends AnalyzerWrapper
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
A shingle is another name for a token-based n-gram.
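To make the term concrete, the following is a minimal sketch of token-based n-grams in plain Java. It is an illustrative model only, not Lucene's ShingleFilter code: the class name ShingleSketch and the method shingles are hypothetical, and the sketch ignores unigram output, position increments, and offsets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; this is not Lucene's ShingleFilter implementation.
final class ShingleSketch {

    // Returns every run of minSize..maxSize adjacent tokens, joined with separator.
    static List<String> shingles(List<String> tokens, int minSize, int maxSize,
                                 String separator) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            // Stop growing the shingle once it would run past the last token.
            for (int size = minSize;
                 size <= maxSize && start + size <= tokens.size();
                 size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        // Shingles of two and three tokens, joined by a single space.
        System.out.println(shingles(tokens, 2, 3, " "));
    }
}
```

Run as written, main prints [please divide, please divide this, divide this, divide this sentence, this sentence].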
Constructor Summary
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, String fillerToken)
- Creates a new ShingleAnalyzerWrapper
ShingleAnalyzerWrapper(Version matchVersion)
- Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(Version matchVersion, int minShingleSize, int maxShingleSize)
- Wraps StandardAnalyzer.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
int maxShingleSize)
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
int minShingleSize,
int maxShingleSize)
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer delegate,
int minShingleSize,
int maxShingleSize,
String tokenSeparator,
boolean outputUnigrams,
boolean outputUnigramsIfNoShingles,
String fillerToken)
- Creates a new ShingleAnalyzerWrapper
- Parameters:
delegate - Analyzer whose TokenStream is to be filtered
minShingleSize - Min shingle (token ngram) size
maxShingleSize - Max shingle size
tokenSeparator - Used to separate input stream tokens in output shingles
outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false for those times when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). Note that if outputUnigrams==true, then unigrams are always output, regardless of whether any shingles are available.
fillerToken - Filler token to use when positionIncrement is more than 1
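How the two unigram flags and the filler token interact can be sketched in plain Java. This is a simplified model of the behavior described in the parameter list, not Lucene's ShingleFilter code: the class ShingleOptionsSketch, the method analyze, and the use of a parallel array of position increments are all assumptions made for illustration, and edge cases such as shingles made up entirely of filler tokens may differ in the real filter.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the documented options; not Lucene's ShingleFilter code.
final class ShingleOptionsSketch {

    // tokens[i] arrives with position increment increments[i]; an increment
    // greater than 1 means (increment - 1) positions were skipped before it.
    static List<String> analyze(String[] tokens, int[] increments,
                                int minSize, int maxSize, String separator,
                                boolean outputUnigrams,
                                boolean outputUnigramsIfNoShingles,
                                String fillerToken) {
        // Lay the tokens out by position; null marks a skipped position.
        List<String> slots = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            for (int gap = 1; gap < increments[i]; gap++) {
                slots.add(null);
            }
            slots.add(tokens[i]);
        }

        // No shingle can be formed when there are fewer positions than minSize.
        boolean noShingles = slots.size() < minSize;
        boolean emitUnigrams =
                outputUnigrams || (outputUnigramsIfNoShingles && noShingles);

        List<String> out = new ArrayList<>();
        for (int start = 0; start < slots.size(); start++) {
            // Only real tokens are emitted as unigrams, never skipped positions.
            if (emitUnigrams && slots.get(start) != null) {
                out.add(slots.get(start));
            }
            for (int size = minSize;
                 size <= maxSize && start + size <= slots.size();
                 size++) {
                StringBuilder shingle = new StringBuilder();
                for (int k = start; k < start + size; k++) {
                    if (k > start) {
                        shingle.append(separator);
                    }
                    String t = slots.get(k);
                    // A skipped position is rendered as the filler token.
                    shingle.append(t == null ? fillerToken : t);
                }
                out.add(shingle.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] one = {"please"};
        String[] three = {"please", "divide", "sentence"};
        // Single token, bigrams only: nothing is emitted unless the fallback flag is set.
        System.out.println(analyze(one, new int[] {1}, 2, 2, " ", false, false, "_"));
        System.out.println(analyze(one, new int[] {1}, 2, 2, " ", false, true, "_"));
        // "sentence" has increment 2, so the skipped position becomes the filler.
        System.out.println(analyze(three, new int[] {1, 1, 2}, 2, 2, " ", true, false, "_"));
    }
}
```

In this model, the first call yields an empty list, the second yields [please], and the third yields [please, please divide, divide, divide _, _ sentence, sentence]. The separator " " and filler "_" are passed explicitly here; the values a given Lucene release uses by default should be checked against its ShingleFilter documentation.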
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Version matchVersion)
- Wraps StandardAnalyzer.
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Version matchVersion,
int minShingleSize,
int maxShingleSize)
- Wraps StandardAnalyzer.
getMaxShingleSize
public int getMaxShingleSize()
- The max shingle (token ngram) size
- Returns:
- The max shingle (token ngram) size
getMinShingleSize
public int getMinShingleSize()
- The min shingle (token ngram) size
- Returns:
- The min shingle (token ngram) size
getTokenSeparator
public String getTokenSeparator()
isOutputUnigrams
public boolean isOutputUnigrams()
isOutputUnigramsIfNoShingles
public boolean isOutputUnigramsIfNoShingles()
getFillerToken
public String getFillerToken()
getWrappedAnalyzer
public final Analyzer getWrappedAnalyzer(String fieldName)
- Specified by: getWrappedAnalyzer in class AnalyzerWrapper
wrapComponents
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName,
Analyzer.TokenStreamComponents components)
- Overrides: wrapComponents in class AnalyzerWrapper
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.