public class MorfologikFilter extends TokenFilter
A TokenFilter that uses the Morfologik library to transform input tokens into lemma and
morphosyntactic (POS) tokens. Applies to Polish only.
MorfologikFilter contains a MorphosyntacticTagsAttribute, which provides morphosyntactic
annotations for produced lemmas. See the Morfologik documentation for details.
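Because one surface form can map to several lemmas, a filter of this kind may emit more than one output token per input token. The following is a minimal, self-contained sketch of that idea and is not Lucene or Morfologik code: the class `LemmaExpansionSketch`, its `term()` and `drain` helpers, and the in-memory dictionary are all hypothetical. It also assumes that words missing from the dictionary pass through unchanged.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model of a lemmatizing token filter (hypothetical; not the Lucene class). */
public class LemmaExpansionSketch {
    private final List<String> input;
    // Hypothetical dictionary: surface form -> candidate lemmas.
    private final Map<String, List<String>> dictionary;
    // Lemmas of the current input token that have not been emitted yet.
    private final Deque<String> pendingLemmas = new ArrayDeque<>();
    private Iterator<String> tokens;
    private String current;

    public LemmaExpansionSketch(List<String> input, Map<String, List<String>> dictionary) {
        this.input = input;
        this.dictionary = dictionary;
        reset();
    }

    /** Advances to the next output token; returns false once the input is exhausted. */
    public boolean incrementToken() {
        if (pendingLemmas.isEmpty()) {
            if (!tokens.hasNext()) {
                return false;
            }
            String surface = tokens.next();
            List<String> lemmas = dictionary.get(surface);
            if (lemmas == null || lemmas.isEmpty()) {
                // Assumption of this sketch: unknown words pass through unchanged.
                current = surface;
                return true;
            }
            pendingLemmas.addAll(lemmas);
        }
        // Emit the next lemma queued for the current input token.
        current = pendingLemmas.poll();
        return true;
    }

    /** Returns the text of the current output token. */
    public String term() {
        return current;
    }

    /** Clears the lemma accumulator and rewinds the input. */
    public void reset() {
        pendingLemmas.clear();
        tokens = input.iterator();
        current = null;
    }

    /** Collects all remaining output tokens into a list. */
    public static List<String> drain(LemmaExpansionSketch stream) {
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        return out;
    }
}
```

With a hypothetical dictionary that maps "mam" to the two lemmas "mama" and "mie\u0107", the input tokens "mam" and "xyz" would produce three output tokens: both lemmas of the first word, followed by the unchanged second word.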
Inherited nested classes and fields: AttributeSource.State, DEFAULT_TOKEN_ATTRIBUTE_FACTORY, DEFAULT_ATTRIBUTE_FACTORY

| Constructor and Description |
|---|
| MorfologikFilter(TokenStream in, String dict, Version version): Creates a filter with a given dictionary resource. |
| MorfologikFilter(TokenStream in, Version version): Creates a filter with the default (Polish) dictionary. |

| Modifier and Type | Method and Description |
|---|---|
| boolean | incrementToken(): Retrieves the next token (possibly from the list of lemmas). |
| void | reset(): Resets stems accumulator and hands over to superclass. |

Methods inherited from class TokenFilter: close, end
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString

public MorfologikFilter(TokenStream in, Version version)

public MorfologikFilter(TokenStream in, String dict, Version version)
Parameters:
in - input token stream.
dict - Dictionary resource from classpath.
version - Lucene version compatibility for lowercasing.

public final boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException

public void reset() throws IOException
Overrides: reset in class TokenFilter
Throws: IOException

Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.