@Deprecated
public class Lucene43DictionaryCompoundWordTokenFilter extends Lucene43CompoundWordTokenFilterBase

TokenFilter that decomposes compound words found in many Germanic languages, using pre-4.4 behavior.

Nested classes inherited from Lucene43CompoundWordTokenFilterBase: Lucene43CompoundWordTokenFilterBase.CompoundToken

Nested classes inherited from AttributeSource: AttributeSource.State

Fields inherited from Lucene43CompoundWordTokenFilterBase: DEFAULT_MAX_SUBWORD_SIZE, DEFAULT_MIN_SUBWORD_SIZE, DEFAULT_MIN_WORD_SIZE, dictionary, maxSubwordSize, minSubwordSize, minWordSize, offsetAtt, onlyLongestMatch, termAtt, tokens

Fields inherited from TokenFilter: input

Fields inherited from TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |
|---|
| Lucene43DictionaryCompoundWordTokenFilter(TokenStream input, CharArraySet dictionary) Deprecated. Creates a new Lucene43DictionaryCompoundWordTokenFilter. |
| Lucene43DictionaryCompoundWordTokenFilter(TokenStream input, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatch) Deprecated. Creates a new Lucene43DictionaryCompoundWordTokenFilter. |

| Modifier and Type | Method and Description |
|---|---|
| protected void | decompose() Deprecated. Decomposes the current Lucene43CompoundWordTokenFilterBase.termAtt and places Lucene43CompoundWordTokenFilterBase.CompoundToken instances in the Lucene43CompoundWordTokenFilterBase.tokens list. |
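
To make the interplay of the constructor parameters concrete, here is a minimal, self-contained sketch of dictionary-based decomposition. It is not the filter's actual source and does not use Lucene: the class name DecomposeSketch, the use of a plain Set<String> in place of CharArraySet, and the returned list of strings in place of CompoundToken instances are all assumptions made for illustration. The sketch also treats minWordSize, minSubwordSize and maxSubwordSize as inclusive bounds, which is an assumption as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filter's decomposition step; not Lucene code.
class DecomposeSketch {

    // Returns every dictionary subword found in `word`, scanning each start
    // position and each candidate length between the two subword bounds.
    static List<String> decompose(String word, Set<String> dictionary,
                                  int minWordSize, int minSubwordSize,
                                  int maxSubwordSize, boolean onlyLongestMatch) {
        List<String> subwords = new ArrayList<>();
        // Assumption: words shorter than minWordSize are not decomposed at all.
        if (word.length() < minWordSize) {
            return subwords;
        }
        for (int start = 0; start <= word.length() - minSubwordSize; start++) {
            String longest = null;
            for (int len = minSubwordSize; len <= maxSubwordSize; len++) {
                if (start + len > word.length()) {
                    break; // candidate would run past the end of the word
                }
                String candidate = word.substring(start, start + len);
                if (!dictionary.contains(candidate)) {
                    continue;
                }
                if (onlyLongestMatch) {
                    // len only grows, so the last match seen is the longest one
                    longest = candidate;
                } else {
                    subwords.add(candidate);
                }
            }
            if (longest != null) {
                subwords.add(longest);
            }
        }
        return subwords;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(
                Arrays.asList("donau", "dampf", "schiff", "dampfschiff"));
        // prints [donau, dampf, dampfschiff, schiff]
        System.out.println(decompose("donaudampfschiff", dict, 5, 2, 15, false));
        // prints [donau, dampfschiff, schiff]
        System.out.println(decompose("donaudampfschiff", dict, 5, 2, 15, true));
    }
}
```

With onlyLongestMatch set to true, only one subword per start position survives, which is why "dampf" disappears from the second result while "dampfschiff" remains. The original word itself is never added to the returned list, mirroring the note that the original token is passed through by the filter rather than placed in the tokens list.
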

Methods inherited from Lucene43CompoundWordTokenFilterBase: incrementToken, reset

Methods inherited from TokenFilter: close, end

Methods inherited from AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public Lucene43DictionaryCompoundWordTokenFilter(TokenStream input, CharArraySet dictionary)

Creates a new Lucene43DictionaryCompoundWordTokenFilter.

Parameters:
input - the TokenStream to process
dictionary - the word dictionary to match against.

public Lucene43DictionaryCompoundWordTokenFilter(TokenStream input, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatch)

Creates a new Lucene43DictionaryCompoundWordTokenFilter.

Parameters:
input - the TokenStream to process
dictionary - the word dictionary to match against.
minWordSize - only words longer than this get processed
minSubwordSize - only subwords longer than this get to the output stream
maxSubwordSize - only subwords shorter than this get to the output stream
onlyLongestMatch - Add only the longest matching subword to the stream

protected void decompose()

Description copied from class: Lucene43CompoundWordTokenFilterBase

Decomposes the current Lucene43CompoundWordTokenFilterBase.termAtt and places Lucene43CompoundWordTokenFilterBase.CompoundToken instances in the Lucene43CompoundWordTokenFilterBase.tokens list.
The original token may not be placed in the list, as it is automatically passed through this filter.

Specified by: decompose in class Lucene43CompoundWordTokenFilterBase

Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.