public final class CommonGramsFilter extends TokenFilter
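Construct bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. This is achieved through the use of PositionIncrementAttribute.setPositionIncrement(int). Bigrams have a type of GRAM_TYPE.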
Example:
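The overlay can be illustrated with a minimal sketch in plain Java that does not call Lucene. The `Tok` class, the `"gram"` and `"word"` type strings, the `_` separator, and the rule "emit a bigram when either neighbour is a common word" are all assumptions made for illustration; they are not taken from this page and may differ from what the filter actually emits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CommonGramsSketch {
    /** Hypothetical stand-in for a token: term text, type, position increment. */
    static final class Tok {
        final String term;
        final String type;
        final int posInc;

        Tok(String term, String type, int posInc) {
            this.term = term;
            this.type = type;
            this.posInc = posInc;
        }

        @Override
        public String toString() {
            return term + "/" + type + "/+" + posInc;
        }
    }

    static final String GRAM = "gram"; // assumed value standing in for GRAM_TYPE
    static final String WORD = "word"; // assumed type for ordinary terms
    static final char SEP = '_';       // assumed bigram separator

    /** Emits every single term, plus an overlaid bigram wherever a common word is involved. */
    static List<Tok> commonGrams(List<String> terms, Set<String> common) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            String cur = terms.get(i);
            out.add(new Tok(cur, WORD, 1)); // the single term advances the position
            if (i + 1 < terms.size()) {
                String next = terms.get(i + 1);
                if (common.contains(cur) || common.contains(next)) {
                    // Position increment 0 places the bigram at the same position as cur.
                    out.add(new Tok(cur + SEP + next, GRAM, 0));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> common = new HashSet<>(Arrays.asList("the"));
        List<String> terms = Arrays.asList("the", "quick", "brown", "fox");
        // Prints: the/word/+1, the_quick/gram/+0, quick/word/+1, brown/word/+1, fox/word/+1
        for (Tok t : commonGrams(terms, common)) {
            System.out.println(t);
        }
    }
}
```

Because the bigram carries a position increment of 0, a phrase query can match either the single term or the bigram at that position.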
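Nested classes/interfaces inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State
Fields inherited from class TokenFilter: input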
Constructor and Description |
---|
CommonGramsFilter(TokenStream input, Set<?> commonWords) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(TokenStream input, Set<?> commonWords, boolean ignoreCase) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(TokenStream input, String[] commonWords) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(TokenStream input, String[] commonWords, boolean ignoreCase) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set, boolean) instead. |
CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords) Construct a token stream filtering the given input using a Set of common words to create bigrams. |
CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords, boolean ignoreCase) Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() Inserts bigrams for common words into a token stream. |
static CharArraySet | makeCommonSet(String[] commonWords) Deprecated. Construct a CharArraySet directly instead. |
static CharArraySet | makeCommonSet(String[] commonWords, boolean ignoreCase) Deprecated. Construct a CharArraySet directly instead. |
void | reset() |
Methods inherited from class TokenFilter: close, end
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
@Deprecated public CommonGramsFilter(TokenStream input, Set<?> commonWords)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.

@Deprecated public CommonGramsFilter(TokenStream input, Set<?> commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.

public CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords)
Construct a token stream filtering the given input using a Set of common words to create bigrams.
Parameters:
input - TokenStream input in filter chain
commonWords - The set of common words.

@Deprecated public CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.
If commonWords is an instance of CharArraySet (true if makeCommonSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity.
If commonWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
input - TokenStream input in filter chain.
commonWords - The set of common words.
ignoreCase - Ignore case when constructing bigrams for common words.

@Deprecated public CommonGramsFilter(TokenStream input, String[] commonWords)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.
Parameters:
input - TokenStream in filter chain
commonWords - words to be used in constructing bigrams

@Deprecated public CommonGramsFilter(TokenStream input, String[] commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set, boolean) instead.
Parameters:
input - TokenStream in filter chain
commonWords - words to be used in constructing bigrams
ignoreCase - Ignore case when constructing bigrams for common words.

@Deprecated public static CharArraySet makeCommonSet(String[] commonWords)
Parameters:
commonWords - Array of common words which will be converted into the CharArraySet
See Also:
makeCommonSet(String[], boolean), passing false to ignoreCase
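The rule described above for the constructors that take both a Set and ignoreCase (reuse an existing CharArraySet as-is, otherwise build a new set with the requested case sensitivity) can be sketched in plain Java. `CaseAwareSet` and `resolve` are hypothetical names standing in for CharArraySet and the constructor logic; this is an illustration under those assumptions, not Lucene's implementation.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

final class CommonWordSetSketch {
    /** Hypothetical stand-in for CharArraySet: the set itself owns the case policy. */
    static final class CaseAwareSet extends AbstractSet<String> {
        private final Set<String> words = new HashSet<>();
        private final boolean ignoreCase;

        CaseAwareSet(Collection<?> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (Object o : source) {
                words.add(normalize(o.toString()));
            }
        }

        // Lower-cases only when the set was built as case-insensitive.
        private String normalize(String s) {
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
        }

        @Override
        public boolean contains(Object o) {
            return o != null && words.contains(normalize(o.toString()));
        }

        @Override
        public Iterator<String> iterator() {
            return words.iterator();
        }

        @Override
        public int size() {
            return words.size();
        }
    }

    /** Reuses a case-aware set unchanged; otherwise builds one honouring ignoreCase. */
    static CaseAwareSet resolve(Set<?> commonWords, boolean ignoreCase) {
        if (commonWords instanceof CaseAwareSet) {
            return (CaseAwareSet) commonWords; // ignoreCase is deliberately ignored here
        }
        return new CaseAwareSet(commonWords, ignoreCase);
    }

    public static void main(String[] args) {
        Set<String> plain = new HashSet<>(Arrays.asList("The", "of"));

        CaseAwareSet built = resolve(plain, true);
        System.out.println(built.contains("THE"));   // true: plain set, ignoreCase applied

        CaseAwareSet sensitive = new CaseAwareSet(plain, false);
        CaseAwareSet reused = resolve(sensitive, true);
        System.out.println(reused == sensitive);     // true: the existing set is used directly
        System.out.println(reused.contains("the"));  // false: the set's own policy wins
    }
}
```

The practical consequence is that callers who need case-insensitive matching should decide that when the set is built, since passing ignoreCase later has no effect on a set that already controls its own case sensitivity.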
@Deprecated public static CharArraySet makeCommonSet(String[] commonWords, boolean ignoreCase)
Parameters:
commonWords - Array of common words which will be converted into the CharArraySet
ignoreCase - If true, all words are lower cased first.

public boolean incrementToken() throws IOException
Inserts bigrams for common words into a token stream.
Specified by:
incrementToken in class TokenStream
Throws:
IOException

public void reset() throws IOException
Overrides:
reset in class TokenFilter
Throws:
IOException