public class MorfologikFilter extends TokenFilter
A TokenFilter using the Morfologik library to transform input tokens into lemma and morphosyntactic (POS) tokens. Applies to Polish only.
MorfologikFilter contains a MorphosyntacticTagsAttribute, which provides morphosyntactic annotations for produced lemmas. See the Morfologik documentation for details.
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State
| Constructor and Description |
|---|
| MorfologikFilter(TokenStream in, String dict, Version version): Creates a filter with a given dictionary resource. |
| MorfologikFilter(TokenStream in, Version version): Creates a filter with the default (Polish) dictionary. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | incrementToken(): Retrieves the next token (possibly from the list of lemmas). |
| void | reset(): Resets stems accumulator and hands over to superclass. |
Methods inherited from class TokenFilter: close, end
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
public MorfologikFilter(TokenStream in, Version version)

public MorfologikFilter(TokenStream in, String dict, Version version)
Parameters:
in - input token stream.
dict - Dictionary resource from classpath.
version - Lucene version compatibility for lowercasing.

public final boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException

public void reset() throws IOException
Overrides: reset in class TokenFilter
Throws: IOException
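The summaries of incrementToken() ("possibly from the list of lemmas") and reset() ("resets stems accumulator") suggest a buffering pattern: one input token may yield several lemma tokens, which are handed out one per call. The following is a minimal, dependency-free sketch of that pattern, not the Lucene implementation. LemmaFilterSketch, Analysis, toyDictionary, analyze and the term/tags fields are hypothetical names; the dictionary entries are toy data with simplified placeholder tags; and passing unknown words through unchanged with empty tags is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a lemmatizing token filter; does not use Lucene. */
final class LemmaFilterSketch {
    /** One dictionary analysis: a lemma plus a (simplified) tag string. */
    static final class Analysis {
        final String lemma;
        final String tags;
        Analysis(String lemma, String tags) { this.lemma = lemma; this.tags = tags; }
    }

    private final List<String> input;
    private final Map<String, List<Analysis>> dictionary;
    private final Deque<Analysis> pending = new ArrayDeque<>(); // the "stems accumulator"
    private Iterator<String> source;

    // Stand-ins for the term and morphosyntactic-tags attributes of the current token.
    String term = "";
    String tags = "";

    LemmaFilterSketch(List<String> input, Map<String, List<Analysis>> dictionary) {
        this.input = input;
        this.dictionary = dictionary;
        reset();
    }

    /** Advances to the next output token; returns false when the input is exhausted. */
    boolean incrementToken() {
        if (pending.isEmpty()) {
            if (!source.hasNext()) return false;
            String surface = source.next();
            List<Analysis> found = dictionary.get(surface);
            if (found == null || found.isEmpty()) {
                // Assumption of this sketch: unknown words pass through with no tags.
                term = surface;
                tags = "";
                return true;
            }
            pending.addAll(found);
        }
        Analysis next = pending.poll(); // emit one buffered lemma per call
        term = next.lemma;
        tags = next.tags;
        return true;
    }

    /** Drops any buffered lemmas and rewinds the input. */
    void reset() {
        pending.clear();
        source = input.iterator();
    }

    /** Toy data only; the tag strings are placeholders, not real dictionary output. */
    static Map<String, List<Analysis>> toyDictionary() {
        Map<String, List<Analysis>> d = new HashMap<>();
        d.put("lata", Arrays.asList(new Analysis("lato", "subst"), new Analysis("rok", "subst")));
        d.put("piec", Arrays.asList(new Analysis("piec", "subst"), new Analysis("piec", "verb")));
        return d;
    }

    /** Typical consumer loop: reset once, then call incrementToken until it returns false. */
    static List<String> analyze(List<String> tokens) {
        LemmaFilterSketch filter = new LemmaFilterSketch(tokens, toyDictionary());
        List<String> out = new ArrayList<>();
        filter.reset();
        while (filter.incrementToken()) out.add(filter.term + "/" + filter.tags);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze(Arrays.asList("lata", "xyz", "piec")));
    }
}
```

With the real class, the consumer loop has the same shape, but the lemma and its tags are read from attributes obtained from the stream (such as MorphosyntacticTagsAttribute) rather than from plain fields.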
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.