Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
org.apache.lucene.index | Code to maintain and access indices. |
org.apache.lucene.search | Code to search indices. |
org.apache.lucene.util | Some utility classes. |
org.apache.lucene.util.graph | Utility classes for working with token streams as graphs. |
Modifier and Type | Class and Description |
---|---|
class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class | FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens. |
class | GraphTokenFilter: An abstract TokenFilter that exposes its input stream as a graph. Call GraphTokenFilter.incrementBaseToken() to move the root of the graph to the next position in the TokenStream, GraphTokenFilter.incrementGraphToken() to move along the current graph, and GraphTokenFilter.incrementGraph() to reset to the next graph based at the current root. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | StopFilter: Removes stop words from a token stream. |
class | TokenFilter: A TokenFilter is a TokenStream whose input is another TokenStream. |
class | Tokenizer: A Tokenizer is a TokenStream whose input is a Reader. |
class | TokenStream |
Constructor and Description |
---|
TokenStream(AttributeSource input): A TokenStream that uses the same attributes as the supplied one. |
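The TokenFilter and TokenStream(AttributeSource input) entries above describe streams that wrap another stream and reuse its attributes. A minimal, dependency-free sketch of that idea in plain Java follows. Every class name in it (SketchStream, TermSlot, WhitespaceSource, LowerCaseSketch, StopSketch) is a hypothetical stand-in rather than a Lucene class, and the single shared TermSlot is an assumption that greatly simplifies Lucene's attribute mechanism.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Entry point: builds a source -> lower-case -> stop-word chain and drains it. */
final class FilterChainSketch {
    static List<String> analyze(String text, Set<String> stopWords) {
        SketchStream source = new WhitespaceSource(text);
        SketchStream chain = new StopSketch(new LowerCaseSketch(source), stopWords);
        List<String> out = new ArrayList<>();
        while (chain.incrementToken()) {
            out.add(chain.slot.term); // read the shared state through the outermost stream
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the"));
        System.out.println(analyze("The Quick brown FOX", stop)); // prints [quick, brown, fox]
    }
}

/** Hypothetical stand-in for one attribute: the text of the current token. */
final class TermSlot {
    String term;
}

/** Hypothetical, simplified stream model; NOT Lucene's TokenStream. */
abstract class SketchStream {
    final TermSlot slot;

    SketchStream() { this.slot = new TermSlot(); }               // source: owns fresh state
    SketchStream(SketchStream input) { this.slot = input.slot; } // filter: same state as its input

    /** Advances to the next token; returns false once the stream is exhausted. */
    abstract boolean incrementToken();
}

/** Source stream that splits its text on whitespace. */
final class WhitespaceSource extends SketchStream {
    private final Iterator<String> words;

    WhitespaceSource(String text) {
        this.words = Arrays.asList(text.trim().split("\\s+")).iterator();
    }

    @Override boolean incrementToken() {
        if (!words.hasNext()) return false;
        slot.term = words.next();
        return true;
    }
}

/** Filter that rewrites the shared term in place. */
final class LowerCaseSketch extends SketchStream {
    private final SketchStream input;

    LowerCaseSketch(SketchStream input) { super(input); this.input = input; }

    @Override boolean incrementToken() {
        if (!input.incrementToken()) return false;
        slot.term = slot.term.toLowerCase(Locale.ROOT);
        return true;
    }
}

/** Filter that skips tokens whose term is in the stop set. */
final class StopSketch extends SketchStream {
    private final SketchStream input;
    private final Set<String> stopWords;

    StopSketch(SketchStream input, Set<String> stopWords) {
        super(input);
        this.input = input;
        this.stopWords = stopWords;
    }

    @Override boolean incrementToken() {
        while (input.incrementToken()) {
            if (!stopWords.contains(slot.term)) return true;
        }
        return false;
    }
}
```

Because each filter is constructed from its input and keeps the same TermSlot, no token text is copied between stages; a filter only mutates or skips what the source wrote.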
Modifier and Type | Class and Description |
---|---|
class | StandardTokenizer: A grammar-based tokenizer constructed with JFlex. |
Modifier and Type | Method and Description |
---|---|
AttributeSource | BaseTermsEnum.attributes() |
AttributeSource | FilterLeafReader.FilterTermsEnum.attributes() |
AttributeSource | FilteredTermsEnum.attributes(): Returns the related attributes; the returned AttributeSource is shared with the delegate TermsEnum. |
abstract AttributeSource | TermsEnum.attributes(): Returns the related attributes. |
AttributeSource | FieldInvertState.getAttributeSource(): Returns the AttributeSource from the TokenStream that provided the indexed tokens for this field. |
Modifier and Type | Method and Description |
---|---|
protected TermsEnum | MultiTermQuery.RewriteMethod.getTermsEnum(MultiTermQuery query, Terms terms, AttributeSource atts): Returns the MultiTermQuery's TermsEnum. |
protected TermsEnum | AutomatonQuery.getTermsEnum(Terms terms, AttributeSource atts) |
protected TermsEnum | FuzzyQuery.getTermsEnum(Terms terms, AttributeSource atts) |
protected abstract TermsEnum | MultiTermQuery.getTermsEnum(Terms terms, AttributeSource atts): Construct the enumeration to be used, expanding the pattern term. |
Constructor and Description |
---|
FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, int maxEdits, int prefixLength, boolean transpositions): Constructor for enumeration of all terms from specified reader which share a prefix of length prefixLength with term and which have at most maxEdits edits. |
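The matching rule in the FuzzyTermsEnum description (a shared prefix of length prefixLength, and at most maxEdits edits) can be illustrated with a brute-force sketch. This is not how Lucene enumerates terms; it is a plain scan over a list, written only to make the three parameters concrete. The class FuzzyMatchSketch and its methods are hypothetical, the sketch counts Java chars rather than Unicode code points, and it assumes that with transpositions enabled a swap of two adjacent characters counts as one edit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical brute-force illustration of prefixLength / maxEdits / transpositions. */
final class FuzzyMatchSketch {

    /**
     * Edit distance between a and b. Insertions, deletions and substitutions cost 1.
     * When transpositions is true, swapping two adjacent characters also costs 1
     * (the "optimal string alignment" variant).
     */
    static int editDistance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                    d[i - 1][j - 1] + cost);
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    /** Keeps candidates that share the exact prefix and are within maxEdits of term. */
    static List<String> fuzzyMatches(List<String> terms, String term, int maxEdits,
                                     int prefixLength, boolean transpositions) {
        int p = Math.min(prefixLength, term.length());
        String prefix = term.substring(0, p);
        List<String> out = new ArrayList<>();
        for (String candidate : terms) {
            // The prefix must match exactly; edits are only allowed after it.
            if (candidate.startsWith(prefix)
                    && editDistance(candidate.substring(p), term.substring(p), transpositions) <= maxEdits) {
                out.add(candidate);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "luecne", "lucid", "pucene");
        System.out.println(fuzzyMatches(terms, "lucene", 1, 2, true));  // prints [lucene, lucent, luecne]
        System.out.println(fuzzyMatches(terms, "lucene", 1, 2, false)); // prints [lucene, lucent]
    }
}
```

In the usage above, "pucene" is one edit away from "lucene" but is rejected because the edit falls inside the required two-character prefix, and "luecne" is accepted only when transpositions are counted as a single edit.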
Modifier and Type | Method and Description |
---|---|
AttributeSource | AttributeSource.cloneAttributes(): Performs a clone of all AttributeImpl instances returned in a new AttributeSource instance. |
Modifier and Type | Method and Description |
---|---|
void | AttributeSource.copyTo(AttributeSource target): Copies the contents of this AttributeSource to the given target AttributeSource. |
Constructor and Description |
---|
AttributeSource(AttributeSource input): An AttributeSource that uses the same attributes as the supplied one. |
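The three entries above (cloneAttributes(), copyTo(AttributeSource target), and the AttributeSource(AttributeSource input) constructor) differ in whether state is shared or copied. The sketch below models that difference with a plain map. SketchAttributes is a hypothetical class, not Lucene's AttributeSource, and the rule that copyTo requires the target to already hold every attribute being copied is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Entry point demonstrating shared state versus a cloned snapshot. */
final class AttributeSharingSketch {
    public static void main(String[] args) {
        SketchAttributes original = new SketchAttributes();
        original.set("term", "fox");

        SketchAttributes shared = new SketchAttributes(original); // same state as original
        SketchAttributes snapshot = original.cloneAttributes();   // independent copy

        original.set("term", "dog");
        System.out.println(shared.get("term"));   // prints dog
        System.out.println(snapshot.get("term")); // prints fox

        original.copyTo(snapshot);                // overwrite the snapshot's values
        System.out.println(snapshot.get("term")); // prints dog
    }
}

/** Hypothetical model of an attribute container: attribute name -> current value. */
final class SketchAttributes {
    private final Map<String, String> state;

    /** Creates a container with its own, empty state. */
    SketchAttributes() { this.state = new LinkedHashMap<>(); }

    /** Creates a container that uses the same state object as the supplied one. */
    SketchAttributes(SketchAttributes input) { this.state = input.state; }

    void set(String name, String value) { state.put(name, value); }

    String get(String name) { return state.get(name); }

    /** Returns a new container holding a copy of every current value. */
    SketchAttributes cloneAttributes() {
        SketchAttributes copy = new SketchAttributes();
        copy.state.putAll(state);
        return copy;
    }

    /** Copies current values into target; assumed to require matching attribute names. */
    void copyTo(SketchAttributes target) {
        for (Map.Entry<String, String> e : state.entrySet()) {
            if (!target.state.containsKey(e.getKey())) {
                throw new IllegalArgumentException("target lacks attribute: " + e.getKey());
            }
            target.state.put(e.getKey(), e.getValue());
        }
    }
}
```

The shared container sees every later change to the original, while the clone keeps the values it had at clone time until copyTo overwrites them.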
Modifier and Type | Method and Description |
---|---|
List<AttributeSource> | GraphTokenStreamFiniteStrings.getTerms(int state): Returns the list of tokens that start at the provided state. |
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.