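@Deprecated
public class Lucene43HyphenationCompoundWordTokenFilter
extends Lucene43CompoundWordTokenFilterBase

A TokenFilter that decomposes compound words found in many Germanic languages, using pre-4.4 behavior.

Nested classes/interfaces inherited from class Lucene43CompoundWordTokenFilterBase: Lucene43CompoundWordTokenFilterBase.CompoundToken

Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State

Fields inherited from class Lucene43CompoundWordTokenFilterBase: DEFAULT_MAX_SUBWORD_SIZE, DEFAULT_MIN_SUBWORD_SIZE, DEFAULT_MIN_WORD_SIZE, dictionary, maxSubwordSize, minSubwordSize, minWordSize, offsetAtt, onlyLongestMatch, termAtt, tokens

Fields inherited from class TokenFilter: input

Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |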
|---|
| Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator) Deprecated. Create a HyphenationCompoundWordTokenFilter with no dictionary. |
| Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary) Deprecated. Creates a new Lucene43HyphenationCompoundWordTokenFilter instance. |
| Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatch) Deprecated. Creates a new Lucene43HyphenationCompoundWordTokenFilter instance. |
| Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator, int minWordSize, int minSubwordSize, int maxSubwordSize) Deprecated. Create a HyphenationCompoundWordTokenFilter with no dictionary. |
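
The size limits and the onlyLongestMatch flag accepted by the constructors above are easier to follow with a small model. The sketch below is not Lucene code and does not call Lucene: `SubwordFilterSketch` and `candidates` are hypothetical names, the hyphenation points are supplied by hand instead of coming from a HyphenationTree, the dictionary is a plain `Set<String>` standing in for a CharArraySet, and the size bounds are treated as inclusive, which is an assumption. It only illustrates how such parameters could narrow the set of emitted subwords.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SubwordFilterSketch {

    /**
     * Hypothetical helper, not part of Lucene. Returns the substrings that lie
     * between two hand-supplied hyphenation points and pass the size and
     * dictionary checks. A null dictionary means "accept every candidate".
     */
    static List<String> candidates(String word, int[] points, Set<String> dictionary,
                                   int minWordSize, int minSubwordSize, int maxSubwordSize,
                                   boolean onlyLongestMatch) {
        List<String> out = new ArrayList<>();
        if (word.length() < minWordSize) {
            return out; // assumption: words below the limit are not decomposed
        }
        for (int i = 0; i < points.length; i++) {
            String longest = null; // best match starting at points[i]
            for (int j = i + 1; j < points.length; j++) {
                int len = points[j] - points[i];
                if (len > maxSubwordSize) {
                    break; // later end points only make the part longer
                }
                if (len < minSubwordSize) {
                    continue; // too short, try the next end point
                }
                String part = word.substring(points[i], points[j]);
                if (dictionary != null && !dictionary.contains(part)) {
                    continue; // not a known word
                }
                if (!onlyLongestMatch) {
                    out.add(part);
                } else if (longest == null || part.length() > longest.length()) {
                    longest = part;
                }
            }
            if (longest != null) {
                out.add(longest);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("donau", "dampf", "schiff", "dampfschiff"));
        int[] points = {0, 5, 10, 16}; // hand-picked split points, including start and end
        System.out.println(candidates("donaudampfschiff", points, dict, 5, 2, 15, false));
        System.out.println(candidates("donaudampfschiff", points, dict, 5, 2, 15, true));
    }
}
```

With onlyLongestMatch set to false, the sketch yields donau, dampf, dampfschiff and schiff; with it set to true, only the longest match per start position remains (donau, dampfschiff, schiff). Passing a null dictionary mirrors the "no dictionary" constructors in the sense that every candidate within the size limits is kept.
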
| Modifier and Type | Method and Description |
|---|---|
| protected void | decompose() Deprecated. Decomposes the current Lucene43CompoundWordTokenFilterBase.termAtt and places Lucene43CompoundWordTokenFilterBase.CompoundToken instances in the Lucene43CompoundWordTokenFilterBase.tokens list. |
| static HyphenationTree | getHyphenationTree(InputSource hyphenationSource) Deprecated. Create a hyphenator tree |
| static HyphenationTree | getHyphenationTree(String hyphenationFilename) Deprecated. Create a hyphenator tree |
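
One of the getHyphenationTree overloads takes an org.xml.sax.InputSource pointing at the XML grammar. As a rough illustration of preparing such an argument with the JDK alone, the sketch below wraps a file path in an InputSource. `HyphenationSourceSketch`, `openGrammar` and the file name `hyph_example.xml` are hypothetical, and the Lucene call appears only in a comment because the sketch deliberately avoids a Lucene dependency.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import org.xml.sax.InputSource;

class HyphenationSourceSketch {

    /**
     * Hypothetical helper: wraps a grammar file in an InputSource by setting
     * its system identifier to the file's URI. Nothing is read at this point.
     */
    static InputSource openGrammar(Path grammarFile) {
        return new InputSource(grammarFile.toUri().toASCIIString());
    }

    public static void main(String[] args) {
        // Hypothetical file name; substitute the path of a real grammar file.
        InputSource source = openGrammar(Paths.get("hyph_example.xml"));
        System.out.println(source.getSystemId());
        // With Lucene on the classpath, the source would then be handed to
        // Lucene43HyphenationCompoundWordTokenFilter.getHyphenationTree(source),
        // which is declared to throw IOException.
    }
}
```

Using the system identifier rather than an open stream leaves opening the file to the XML parser; whether that or a pre-opened stream is preferable depends on the caller.
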

Methods inherited from class Lucene43CompoundWordTokenFilterBase: incrementToken, reset

Methods inherited from class TokenFilter: close, end

Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary)

Creates a new Lucene43HyphenationCompoundWordTokenFilter instance.

Parameters:
input - the TokenStream to process
hyphenator - the hyphenation pattern tree to use for hyphenation
dictionary - the word dictionary to match against.

public Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatch)

Creates a new Lucene43HyphenationCompoundWordTokenFilter instance.

Parameters:
input - the TokenStream to process
hyphenator - the hyphenation pattern tree to use for hyphenation
dictionary - the word dictionary to match against.
minWordSize - only words longer than this get processed
minSubwordSize - only subwords longer than this get to the output stream
maxSubwordSize - only subwords shorter than this get to the output stream
onlyLongestMatch - Add only the longest matching subword to the stream

public Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator, int minWordSize, int minSubwordSize, int maxSubwordSize)

Create a HyphenationCompoundWordTokenFilter with no dictionary.

public Lucene43HyphenationCompoundWordTokenFilter(TokenStream input, HyphenationTree hyphenator)

Create a HyphenationCompoundWordTokenFilter with no dictionary.

public static HyphenationTree getHyphenationTree(String hyphenationFilename) throws IOException

Create a hyphenator tree

Parameters:
hyphenationFilename - the filename of the XML grammar to load

Throws:
IOException - If there is a low-level I/O error.

public static HyphenationTree getHyphenationTree(InputSource hyphenationSource) throws IOException

Create a hyphenator tree

Parameters:
hyphenationSource - the InputSource pointing to the XML grammar

Throws:
IOException - If there is a low-level I/O error.

protected void decompose()

Description copied from class: Lucene43CompoundWordTokenFilterBase

Decomposes the current Lucene43CompoundWordTokenFilterBase.termAtt and places Lucene43CompoundWordTokenFilterBase.CompoundToken instances in the Lucene43CompoundWordTokenFilterBase.tokens list. The original token may not be placed in the list, as it is automatically passed through this filter.

Specified by:
decompose in class Lucene43CompoundWordTokenFilterBase

Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.