org.apache.lucene.analysis.fr
Class ElisionFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.fr.ElisionFilter
- All Implemented Interfaces:
- Closeable
public final class ElisionFilter
- extends TokenFilter
Removes elisions from a TokenStream. For example, "l'avion" (the plane) will be
tokenized as "avion" (plane).
Note that StandardTokenizer
sees " ' " as a space, and cuts it out.
- See Also:
- Elision in Wikipedia
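The behaviour described above can be sketched in plain Java. This is a minimal illustration, not Lucene's code: the class `ElisionSketch`, the method `stripElision`, and the sample article set are hypothetical names chosen for this example, and the case-insensitive match is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-alone sketch; it does not use or reproduce Lucene's classes.
class ElisionSketch {

    // Assumed sample articles; the filter's actual default set is not listed on this page.
    static final Set<String> ARTICLES =
            new HashSet<>(Arrays.asList("l", "d", "j", "qu"));

    // Returns the term without a leading "article + apostrophe" prefix, if it has one.
    static String stripElision(String term, Set<String> articles) {
        int apostrophe = term.indexOf('\'');
        if (apostrophe > 0) {
            // Assumption of this sketch: articles are matched case-insensitively.
            String prefix = term.substring(0, apostrophe).toLowerCase(Locale.ROOT);
            if (articles.contains(prefix)) {
                return term.substring(apostrophe + 1);
            }
        }
        return term; // no apostrophe, or the prefix is not a known article
    }

    public static void main(String[] args) {
        System.out.println(stripElision("l'avion", ARTICLES));     // avion
        System.out.println(stripElision("avion", ARTICLES));       // avion
        System.out.println(stripElision("aujourd'hui", ARTICLES)); // aujourd'hui
    }
}
```

Only a prefix found in the article set is removed, which is why "aujourd'hui" passes through unchanged in this sketch.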
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString
ElisionFilter
protected ElisionFilter(TokenStream input)
- Constructs an elision filter with standard stop words
ElisionFilter
public ElisionFilter(TokenStream input, Set<?> articles)
- Constructs an elision filter with a Set of stop words
ElisionFilter
public ElisionFilter(TokenStream input, String[] articles)
- Constructs an elision filter with an array of stop words
setArticles
public void setArticles(Set<?> articles)
incrementToken
public final boolean incrementToken()
throws IOException
- Advances the TokenStream to the next token and strips any leading elided article from its TermAttribute.
- Specified by:
incrementToken in class TokenStream
- Returns:
- false for end of stream; true otherwise
- Throws:
IOException
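The return contract of incrementToken (false at end of stream, true otherwise) lends itself to a simple consumer loop. The sketch below illustrates that pattern with hypothetical stand-in types (`SimpleStream`, `ListStream`, `StripPrefixStream`); none of them are Lucene classes, and the wrapping filter is deliberately simplified so that it drops everything up to the first apostrophe without consulting a set of articles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Lucene types.
class StreamContractSketch {

    // Minimal pull-style stream: advance first, then read the current term.
    interface SimpleStream {
        boolean incrementToken(); // false at end of stream, true otherwise
        String term();
    }

    // Source stream backed by a fixed list of terms.
    static class ListStream implements SimpleStream {
        private final Iterator<String> it;
        private String current;
        ListStream(List<String> terms) { this.it = terms.iterator(); }
        public boolean incrementToken() {
            if (!it.hasNext()) return false;
            current = it.next();
            return true;
        }
        public String term() { return current; }
    }

    // Filter wrapping another stream; simplified to cut at the first apostrophe.
    static class StripPrefixStream implements SimpleStream {
        private final SimpleStream input;
        private String current;
        StripPrefixStream(SimpleStream input) { this.input = input; }
        public boolean incrementToken() {
            if (!input.incrementToken()) return false; // propagate end of stream
            String t = input.term();
            int apostrophe = t.indexOf('\'');
            current = apostrophe >= 0 ? t.substring(apostrophe + 1) : t;
            return true;
        }
        public String term() { return current; }
    }

    // Drains a stream with the usual while (incrementToken()) loop.
    static List<String> drain(SimpleStream stream) {
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        return out;
    }

    public static void main(String[] args) {
        SimpleStream s = new StripPrefixStream(
                new ListStream(Arrays.asList("l'avion", "vole")));
        System.out.println(drain(s)); // [avion, vole]
    }
}
```

The filter returns false as soon as its input does, so a consumer only needs to check the outermost stream.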
Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.