java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.TokenFilter
        org.apache.lucene.analysis.fr.ElisionFilter

public class ElisionFilter extends TokenFilter
Removes elisions from a TokenStream. For example, "l'avion" (the plane) will be tokenized as "avion" (plane).
Note that StandardTokenizer sees " ' " as a space, and cuts it out.
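The elision removal described above can be sketched in plain Java. This is only an illustration of the idea, not the filter's actual implementation: the class `ElisionSketch`, the method `stripElision`, and the article set are hypothetical, and only the ASCII apostrophe is handled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: illustrates removing an elided article.
final class ElisionSketch {

    // Assumed article set for illustration only; this page does not list the
    // filter's standard set.
    static final Set<String> ARTICLES =
            new HashSet<>(Arrays.asList("l", "m", "t", "qu", "n", "s", "j"));

    // Returns the term without a leading "<article>'" prefix, or the term unchanged
    // when there is no apostrophe or the prefix is not a known article.
    static String stripElision(String term, Set<String> articles) {
        int apostrophe = term.indexOf('\'');
        if (apostrophe <= 0) {
            return term; // no apostrophe, or nothing before it
        }
        String prefix = term.substring(0, apostrophe).toLowerCase(Locale.ROOT);
        return articles.contains(prefix) ? term.substring(apostrophe + 1) : term;
    }

    public static void main(String[] args) {
        System.out.println(stripElision("l'avion", ARTICLES));     // prints "avion"
        System.out.println(stripElision("aujourd'hui", ARTICLES)); // prints "aujourd'hui"
    }
}
```

The second call shows why the prefix is checked against a set of articles: an apostrophe inside an ordinary word should not cause truncation.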
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
AttributeSource.AttributeFactory, AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.TokenFilter |
---|
input |
Constructor Summary | |
---|---|
protected | ElisionFilter(TokenStream input): Constructs an elision filter with standard stop words |
| ElisionFilter(TokenStream input, Set articles): Constructs an elision filter with a Set of stop words |
| ElisionFilter(TokenStream input, String[] articles): Constructs an elision filter with an array of stop words |
Method Summary | |
---|---|
boolean | incrementToken(): Advances the TokenStream to the next token and strips any leading elided article from its TermAttribute. |
Token | next(): Deprecated. Will be removed in Lucene 3.0. This method is final, as it should not be overridden. Delegates to the backwards compatibility layer. |
Token | next(Token reusableToken): Deprecated. Will be removed in Lucene 3.0. This method is final, as it should not be overridden. Delegates to the backwards compatibility layer. |
void | setArticles(Set articles) |
Methods inherited from class org.apache.lucene.analysis.TokenFilter |
---|
close, end, reset |
Methods inherited from class org.apache.lucene.analysis.TokenStream |
---|
getOnlyUseNewAPI, setOnlyUseNewAPI |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
protected ElisionFilter(TokenStream input)
public ElisionFilter(TokenStream input, Set articles)
public ElisionFilter(TokenStream input, String[] articles)
Method Detail |
---|
public void setArticles(Set articles)
public final boolean incrementToken() throws IOException
Advances the TokenStream to the next token and strips any leading elided article from its TermAttribute.
Overrides: incrementToken in class TokenStream
Note that this method will be defined abstract in Lucene 3.0.
Throws: IOException
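A minimal sketch of the `incrementToken()` pattern, assuming a simplified stream model: the filter asks its input to advance, then edits the shared term in place and reports whether a token is available. All types here (`SketchStream`, `ListStream`, `ElisionSketchFilter`, `IncrementTokenDemo`) are hypothetical stand-ins written for this example and are not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a token stream; not a Lucene type.
abstract class SketchStream {
    // Shared mutable term buffer, playing the role of a term attribute.
    final StringBuilder term;
    SketchStream(StringBuilder term) { this.term = term; }
    // Advances to the next token; returns false at end of stream.
    abstract boolean incrementToken();
}

// Source stream that emits the words of a list one at a time.
final class ListStream extends SketchStream {
    private final Iterator<String> it;
    ListStream(List<String> words) {
        super(new StringBuilder());
        this.it = words.iterator();
    }
    @Override boolean incrementToken() {
        if (!it.hasNext()) return false;
        term.setLength(0);
        term.append(it.next());
        return true;
    }
}

// Filter that strips a leading "<article>'" from each term of its input.
final class ElisionSketchFilter extends SketchStream {
    private final SketchStream input;
    private final Set<String> articles;
    ElisionSketchFilter(SketchStream input, Set<String> articles) {
        super(input.term); // share the upstream term buffer
        this.input = input;
        this.articles = articles;
    }
    @Override boolean incrementToken() {
        if (!input.incrementToken()) return false;
        int apostrophe = term.indexOf("'");
        if (apostrophe > 0
                && articles.contains(term.substring(0, apostrophe).toLowerCase(Locale.ROOT))) {
            term.delete(0, apostrophe + 1); // drop the article and the apostrophe
        }
        return true;
    }
}

final class IncrementTokenDemo {
    // Drains a stream, copying each term because the buffer is reused.
    static List<String> collect(SketchStream stream) {
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) out.add(stream.term.toString());
        return out;
    }
    public static void main(String[] args) {
        Set<String> articles = new HashSet<>(Arrays.asList("l", "d"));
        SketchStream s = new ElisionSketchFilter(
                new ListStream(Arrays.asList("l'avion", "d'or", "vite")), articles);
        System.out.println(collect(s)); // prints [avion, or, vite]
    }
}
```

Sharing one buffer between the source and the filter is the design choice being illustrated: the filter never allocates a new token, it only edits the current one.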
public final Token next(Token reusableToken) throws IOException
Description copied from class: TokenStream
This method implicitly defines a "contract" between consumers (callers of this method) and producers (implementations of this method that are the source for tokens):
- A consumer must fully consume the previously returned Token before calling this method again.
- A producer must call Token.clear() before setting the fields in it and returning it.
The producer must also make no assumptions about a Token after it has been returned: the caller may arbitrarily change it. If the producer needs to hold onto the Token for subsequent calls, it must clone() it before storing it. Note that a TokenFilter is considered a consumer.
Overrides: next in class TokenStream
Parameters: reusableToken - a Token that may or may not be used to return; this parameter should never be null (the callee is not required to check for null before using it, but it is a good idea to assert that it is not null).
Returns: the next Token in the stream, or null if end-of-stream was hit
Throws: IOException
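The reuse contract can be illustrated with a small self-contained sketch. The types `ReusableTokenSketch`, `WordProducerSketch`, and `TokenContractDemo` are hypothetical and only mimic the shape of `next(Token reusableToken)`; they are not the Lucene classes. The producer clears the reusable token before filling it, and the consumer finishes with each token (here by cloning it) before asking for the next one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Hypothetical minimal token; stands in for a reusable token object.
final class ReusableTokenSketch implements Cloneable {
    String term = "";
    int startOffset;

    void clear() {
        term = "";
        startOffset = 0;
    }

    @Override public ReusableTokenSketch clone() {
        try {
            return (ReusableTokenSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

// Producer: the source of tokens.
final class WordProducerSketch {
    private final Iterator<String> words;
    private int offset;

    WordProducerSketch(List<String> words) {
        this.words = words.iterator();
    }

    // Returns the filled token, or null at end of stream.
    ReusableTokenSketch next(ReusableTokenSketch reusable) {
        Objects.requireNonNull(reusable, "reusable token must not be null");
        if (!words.hasNext()) return null;
        reusable.clear();              // producer clears before setting fields
        String word = words.next();
        reusable.term = word;
        reusable.startOffset = offset;
        offset += word.length() + 1;   // assume one separator character between words
        return reusable;
    }
}

// Consumer: fully consumes each token before requesting the next one.
final class TokenContractDemo {
    static List<ReusableTokenSketch> keepAll(WordProducerSketch producer) {
        List<ReusableTokenSketch> kept = new ArrayList<>();
        ReusableTokenSketch reusable = new ReusableTokenSketch();
        for (ReusableTokenSketch t = producer.next(reusable); t != null; t = producer.next(reusable)) {
            kept.add(t.clone()); // the same instance is overwritten by the next call
        }
        return kept;
    }

    public static void main(String[] args) {
        for (ReusableTokenSketch t : keepAll(new WordProducerSketch(Arrays.asList("le", "chat")))) {
            System.out.println(t.term + "@" + t.startOffset);
        }
    }
}
```

Without the `clone()` call, every entry in `kept` would reference the same instance and show only the last token's values.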
public final Token next() throws IOException
Description copied from class: TokenStream
Returns the next Token in the stream, or null at EOS.
Overrides: next in class TokenStream
Throws: IOException