public final class CommonGramsFilter
extends org.apache.lucene.analysis.TokenFilter
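Constructs bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. This is achieved through the use of PositionIncrementAttribute.setPositionIncrement(int). Bigrams have a type of GRAM_TYPE.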
Example:
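The overlay can be illustrated with a small plain-Java sketch that does not use Lucene. It is a simplified illustration, not the filter's implementation. The class CommonGramsSketch, its Tok type, and the overlay method are hypothetical names. The "_" separator, the "word" and "gram" type strings, and the rule that a pair is joined when either of its words is common are assumptions about the filter's behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CommonGramsSketch {
    /** One emitted token: its text, position increment, and type (hypothetical stand-in for Lucene attributes). */
    static final class Tok {
        final String term;
        final int posInc;
        final String type;

        Tok(String term, int posInc, String type) {
            this.term = term;
            this.posInc = posInc;
            this.type = type;
        }

        @Override
        public String toString() {
            return term + " (posInc=" + posInc + ", type=" + type + ")";
        }
    }

    /**
     * Emits every word as a unigram and, when the word or its successor is
     * common, overlays a bigram at the same position (position increment 0).
     */
    static List<Tok> overlay(List<String> words, Set<String> common) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String cur = words.get(i);
            out.add(new Tok(cur, 1, "word")); // the single term is always kept
            if (i + 1 < words.size()) {
                String next = words.get(i + 1);
                if (common.contains(cur) || common.contains(next)) {
                    // assumed separator and type; increment 0 stacks the bigram on the unigram
                    out.add(new Tok(cur + "_" + next, 0, "gram"));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> common = new HashSet<>(Arrays.asList("the"));
        for (Tok t : overlay(Arrays.asList("the", "quick", "brown", "fox"), common)) {
            System.out.println(t);
        }
    }
}
```

In this sketch the bigram for "the quick" shares a position with "the", so a phrase query can match the bigram while single-term queries still match the unigrams.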
Constructor and Description |
---|
CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, Set<?> commonWords) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, Set<?> commonWords, boolean ignoreCase) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, String[] commonWords) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, String[] commonWords, boolean ignoreCase) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set, boolean) instead. |
CommonGramsFilter(org.apache.lucene.util.Version matchVersion, org.apache.lucene.analysis.TokenStream input, Set<?> commonWords) Construct a token stream filtering the given input using a Set of common words to create bigrams. |
CommonGramsFilter(org.apache.lucene.util.Version matchVersion, org.apache.lucene.analysis.TokenStream input, Set<?> commonWords, boolean ignoreCase) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() Inserts bigrams for common words into a token stream. |
static org.apache.lucene.analysis.CharArraySet | makeCommonSet(String[] commonWords) Deprecated. Create a CharArraySet with the CharArraySet constructor instead. |
static org.apache.lucene.analysis.CharArraySet | makeCommonSet(String[] commonWords, boolean ignoreCase) Deprecated. Create a CharArraySet with the CharArraySet constructor instead. |
void | reset() |
Methods inherited from class org.apache.lucene.util.AttributeSource:
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
@Deprecated public CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, Set<?> commonWords)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.

@Deprecated public CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, Set<?> commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.

public CommonGramsFilter(org.apache.lucene.util.Version matchVersion, org.apache.lucene.analysis.TokenStream input, Set<?> commonWords)
Construct a token stream filtering the given input using a Set of common words to create bigrams.
Parameters:
input - TokenStream input in filter chain.
commonWords - The set of common words.

@Deprecated public CommonGramsFilter(org.apache.lucene.util.Version matchVersion, org.apache.lucene.analysis.TokenStream input, Set<?> commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.
If commonWords is an instance of CharArraySet (true if makeCommonSet() was used to construct the set), it is used directly and ignoreCase is ignored, since CharArraySet directly controls case sensitivity.
If commonWords is not an instance of CharArraySet, a new CharArraySet is constructed and ignoreCase specifies the case sensitivity of that set.
Parameters:
input - TokenStream input in filter chain.
commonWords - The set of common words.
ignoreCase - Ignore case when constructing bigrams for common words.

@Deprecated public CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, String[] commonWords)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.
Parameters:
input - TokenStream in filter chain.
commonWords - Words to be used in constructing bigrams.

@Deprecated public CommonGramsFilter(org.apache.lucene.analysis.TokenStream input, String[] commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set, boolean) instead.
Parameters:
input - TokenStream in filter chain.
commonWords - Words to be used in constructing bigrams.
ignoreCase - Ignore case when constructing bigrams for common words.

@Deprecated public static org.apache.lucene.analysis.CharArraySet makeCommonSet(String[] commonWords)
Parameters:
commonWords - Array of common words which will be converted into the CharArraySet.
See Also: makeCommonSet(String[], boolean), passing false to ignoreCase.
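The case-handling rules documented for the ignoreCase constructor and for makeCommonSet can be sketched in plain Java without Lucene. This is an illustration under stated assumptions, not the library's code: CaseAwareSet is a hypothetical stand-in for CharArraySet, and resolve is a hypothetical helper that mirrors the documented rule (a set that already controls its own case sensitivity is used as-is, otherwise a new one is built from ignoreCase).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class CommonSetSketch {
    /** Hypothetical stand-in for CharArraySet: the set itself decides case sensitivity. */
    static final class CaseAwareSet extends HashSet<String> {
        private static final long serialVersionUID = 1L;
        final boolean ignoreCase;

        CaseAwareSet(Collection<?> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (Object word : source) {
                add(String.valueOf(word));
            }
        }

        private String normalize(Object o) {
            String s = String.valueOf(o);
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s; // lowercase first when ignoring case
        }

        @Override
        public boolean add(String s) {
            return super.add(normalize(s));
        }

        @Override
        public boolean contains(Object o) {
            return super.contains(normalize(o));
        }
    }

    /** Sketch of makeCommonSet(String[], boolean). */
    static CaseAwareSet makeCommonSet(String[] commonWords, boolean ignoreCase) {
        return new CaseAwareSet(Arrays.asList(commonWords), ignoreCase);
    }

    /** Sketch of makeCommonSet(String[]): same as passing false to ignoreCase. */
    static CaseAwareSet makeCommonSet(String[] commonWords) {
        return makeCommonSet(commonWords, false);
    }

    /** Reuse a case-aware set directly (ignoreCase is ignored); otherwise build one. */
    static CaseAwareSet resolve(Set<?> commonWords, boolean ignoreCase) {
        if (commonWords instanceof CaseAwareSet) {
            return (CaseAwareSet) commonWords;
        }
        return new CaseAwareSet(commonWords, ignoreCase);
    }
}
```

The practical consequence in this sketch is that a set built once with makeCommonSet can be cached and passed to many filters, and the ignoreCase argument given later has no effect on it.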
@Deprecated public static org.apache.lucene.analysis.CharArraySet makeCommonSet(String[] commonWords, boolean ignoreCase)
Parameters:
commonWords - Array of common words which will be converted into the CharArraySet.
ignoreCase - If true, all words are lowercased first.

public boolean incrementToken() throws IOException
Inserts bigrams for common words into a token stream.
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException
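Because bigrams carry a position increment of 0, a consumer that accumulates increments places each bigram at the same position as the unigram before it. A minimal plain-Java sketch of that bookkeeping follows. PositionSketch and absolutePositions are hypothetical names, and the int array stands in for the increments a real consumer would read from PositionIncrementAttribute after each incrementToken() call.

```java
import java.util.Arrays;

final class PositionSketch {
    /** Turns per-token position increments into absolute positions, starting before the first token. */
    static int[] absolutePositions(int[] increments) {
        int[] positions = new int[increments.length];
        int position = -1; // so that a first increment of 1 yields position 0
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        // Assumed increments for: unigram, overlaid bigram, then three more unigrams
        int[] increments = {1, 0, 1, 1, 1};
        System.out.println(Arrays.toString(absolutePositions(increments)));
    }
}
```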
public void reset() throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenFilter
Throws: IOException