java.lang.Object
  org.apache.lucene.analysis.Analyzer
public abstract class Analyzer
An Analyzer builds TokenStreams, which analyze text. It thus represents a policy for extracting index terms from text.
Typical implementations first build a Tokenizer, which breaks the stream of characters from the Reader into raw Tokens. One or more TokenFilters may then be applied to the output of the Tokenizer.
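The chain described above (a tokenizer reading from the Reader, wrapped by one or more filters) can be sketched in plain Java. This is only an illustration of the shape of the pipeline: TermSource, WhitespaceSource and LowerCaseSource are hypothetical stand-ins written for this example, not Lucene classes, and the whitespace and lower-casing rules are assumptions chosen for brevity.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalysisChainSketch {
    /** Hypothetical stand-in for a token source: next term, or null when exhausted. */
    public interface TermSource {
        String next() throws IOException;
    }

    /** Plays the role of a Tokenizer: splits the Reader's characters on whitespace. */
    public static final class WhitespaceSource implements TermSource {
        private final Reader reader;

        public WhitespaceSource(Reader reader) { this.reader = reader; }

        @Override
        public String next() throws IOException {
            StringBuilder term = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (term.length() > 0) break;   // whitespace ends the current term
                } else {
                    term.append((char) c);
                }
            }
            return term.length() > 0 ? term.toString() : null;
        }
    }

    /** Plays the role of a TokenFilter: wraps another source and lower-cases each term. */
    public static final class LowerCaseSource implements TermSource {
        private final TermSource input;

        public LowerCaseSource(TermSource input) { this.input = input; }

        @Override
        public String next() throws IOException {
            String term = input.next();
            return term == null ? null : term.toLowerCase(Locale.ROOT);
        }
    }

    /** Plays the role of tokenStream(fieldName, reader): tokenizer first, then filters. */
    public static TermSource tokenStream(String fieldName, Reader reader) {
        return new LowerCaseSource(new WhitespaceSource(reader));
    }

    /** Drains the chain into a list so the resulting index terms can be inspected. */
    public static List<String> analyze(String fieldName, String text) {
        TermSource source = tokenStream(fieldName, new StringReader(text));
        List<String> terms = new ArrayList<>();
        try {
            for (String t = source.next(); t != null; t = source.next()) {
                terms.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("body", "The Quick  Brown FOX")); // [the, quick, brown, fox]
    }
}
```

Each filter only sees the output of the stage it wraps, so further filters can be added by nesting another wrapper in tokenStream without touching the tokenizer.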
Field Summary | |
---|---|
protected boolean | overridesTokenStreamMethod |
Constructor Summary | |
---|---|
Analyzer() | |
Method Summary | |
---|---|
void | close() - Frees persistent resources used by this Analyzer. |
int | getOffsetGap(Fieldable field) - Like getPositionIncrementGap(String), but for Token offsets rather than positions. |
int | getPositionIncrementGap(String fieldName) - Invoked before indexing a Fieldable instance if terms have already been added to that field. |
protected Object | getPreviousTokenStream() - Used by Analyzers that implement reusableTokenStream to retrieve previously saved TokenStreams for re-use by the same thread. |
TokenStream | reusableTokenStream(String fieldName, Reader reader) - Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method. |
protected void | setOverridesTokenStreamMethod(Class baseClass) - Deprecated. This is only present to preserve back-compat of classes that subclass a core analyzer and override tokenStream but not reusableTokenStream. |
protected void | setPreviousTokenStream(Object obj) - Used by Analyzers that implement reusableTokenStream to save a TokenStream for later re-use by the same thread. |
abstract TokenStream | tokenStream(String fieldName, Reader reader) - Creates a TokenStream which tokenizes all the text in the provided Reader. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
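The per-thread reuse described for reusableTokenStream, getPreviousTokenStream and setPreviousTokenStream can be sketched as follows. This is a minimal model rather than Lucene's implementation: ToyAnalyzer and ReusableStream are hypothetical classes, the saved object is held in a java.lang.ThreadLocal as an assumption, and a String stands in for the Reader.

```java
public class ReuseSketch {
    /** Hypothetical stand-in for a TokenStream that can be pointed at new text. */
    public static final class ReusableStream {
        private String text;

        public void reset(String newText) { this.text = newText; }

        public String text() { return text; }
    }

    /** Toy analyzer that keeps at most one saved stream per calling thread. */
    public static final class ToyAnalyzer {
        // Assumption for this sketch: per-thread storage backed by ThreadLocal.
        private final ThreadLocal<Object> previous = new ThreadLocal<>();

        protected Object getPreviousTokenStream() { return previous.get(); }

        protected void setPreviousTokenStream(Object obj) { previous.set(obj); }

        /** Always builds a fresh stream, as tokenStream is described to do. */
        public ReusableStream tokenStream(String fieldName, String text) {
            ReusableStream stream = new ReusableStream();
            stream.reset(text);
            return stream;
        }

        /** Reuses the calling thread's saved stream when one exists. */
        public ReusableStream reusableTokenStream(String fieldName, String text) {
            ReusableStream stream = (ReusableStream) getPreviousTokenStream();
            if (stream == null) {
                stream = new ReusableStream();      // first call on this thread
                setPreviousTokenStream(stream);     // save it for later calls
            }
            stream.reset(text);                     // point the saved stream at the new input
            return stream;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ToyAnalyzer analyzer = new ToyAnalyzer();
        ReusableStream first = analyzer.reusableTokenStream("body", "one");
        ReusableStream second = analyzer.reusableTokenStream("body", "two");
        System.out.println(first == second);        // true: same thread, same instance

        ReusableStream[] other = new ReusableStream[1];
        Thread worker = new Thread(() -> other[0] = analyzer.reusableTokenStream("body", "three"));
        worker.start();
        worker.join();
        System.out.println(other[0] == first);      // false: each thread has its own saved stream
    }
}
```

Because the second call hands back the same object, a caller in this model must finish consuming one stream before requesting the next on the same thread.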
Field Detail |
---|
protected boolean overridesTokenStreamMethod
Constructor Detail |
---|
public Analyzer()
Method Detail |
---|
public abstract TokenStream tokenStream(String fieldName, Reader reader)
public TokenStream reusableTokenStream(String fieldName, Reader reader) throws IOException
Throws: IOException
protected Object getPreviousTokenStream()
protected void setPreviousTokenStream(Object obj)
protected void setOverridesTokenStreamMethod(Class baseClass)
public int getPositionIncrementGap(String fieldName)
Parameters: fieldName - Fieldable name being indexed.
Returns: position increment gap, added to the next token emitted from tokenStream(String,Reader)
public int getOffsetGap(Fieldable field)
Just like getPositionIncrementGap(java.lang.String), but for Token offsets rather than positions. By default this returns 1 for tokenized fields, as if the fields were joined with an extra space character, and 0 for un-tokenized fields. This method is only called if the field produced at least one token for indexing.
Parameters: field - the field just indexed
Returns: offset gap, added to the next token emitted from tokenStream(String,Reader)
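To make the two gaps concrete, the sketch below assigns positions and offsets to the terms of several values that share one field name. It is a simplified model under stated assumptions, not Lucene's indexing code: FieldGapSketch, Term and index are hypothetical names, values are split on single spaces, every token has a position increment of 1, and the end offset of a value is taken to be its length.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldGapSketch {
    /** Toy record of one indexed term with its position and character offsets. */
    public static final class Term {
        public final String text;
        public final int position;
        public final int start;
        public final int end;

        Term(String text, int position, int start, int end) {
            this.text = text;
            this.position = position;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "@" + position + "[" + start + "," + end + ")";
        }
    }

    /** Indexes several values of one field, applying both gaps between values. */
    public static List<Term> index(String[] values, int positionIncrementGap, int offsetGap) {
        List<Term> terms = new ArrayList<>();
        int position = -1;      // the first token then lands on position 0
        int offsetBase = 0;     // added to every offset within the current value
        for (String value : values) {
            if (!terms.isEmpty()) {
                position += positionIncrementGap;   // terms were already added to this field
            }
            boolean anyToken = false;
            int i = 0;
            int n = value.length();
            while (i < n) {
                while (i < n && value.charAt(i) == ' ') i++;    // skip separators
                int start = i;
                while (i < n && value.charAt(i) != ' ') i++;    // consume one term
                if (i > start) {
                    position += 1;                              // assumed increment of 1 per token
                    terms.add(new Term(value.substring(start, i), position,
                            offsetBase + start, offsetBase + i));
                    anyToken = true;
                }
            }
            offsetBase += n;
            if (anyToken) {
                offsetBase += offsetGap;    // only applied if the value produced a token
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String[] values = {"foo bar", "baz"};
        System.out.println(index(values, 0, 1));    // [foo@0[0,3), bar@1[4,7), baz@2[8,11)]
        System.out.println(index(values, 100, 1));  // [foo@0[0,3), bar@1[4,7), baz@102[8,11)]
    }
}
```

With an offset gap of 1 the offsets of "baz" match those it would have in the joined string "foo bar baz". In this model, a larger position increment gap moves "baz" far from "bar" in position space without changing any offsets.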
public void close()