Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
org.apache.lucene.codecs | Codecs API: API for customization of the encoding and structure of the index. |
org.apache.lucene.document | The logical representation of a Document for indexing and searching. |
org.apache.lucene.index | Code to maintain and access indices. |
org.apache.lucene.util | Some utility classes. |
org.apache.lucene.util.graph | Utility classes for working with token streams as graphs. |
Modifier and Type | Class and Description |
---|---|
class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class | FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens. |
class | GraphTokenFilter: An abstract TokenFilter that exposes its input stream as a graph. Call GraphTokenFilter.incrementBaseToken() to move the root of the graph to the next position in the TokenStream, GraphTokenFilter.incrementGraphToken() to move along the current graph, and GraphTokenFilter.incrementGraph() to reset to the next graph based at the current root. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | StopFilter: Removes stop words from a token stream. |
class | TokenFilter: A TokenFilter is a TokenStream whose input is another TokenStream. |
class | Tokenizer: A Tokenizer is a TokenStream whose input is a Reader. |
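
The relationship described in the table above (a Tokenizer produces tokens from text, and each TokenFilter wraps another stream) is a decorator chain. The following is a minimal plain-Java sketch of that idea only; it does not use the Lucene classes, and every name in it (`SketchStream`, `SketchTokenizer`, `SketchLowerCase`, `SketchStop`, `SketchChain`) is hypothetical. It also assumes whitespace splitting, which is far simpler than what StandardTokenizer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a token stream: next() returns the next token,
// or null once the stream is exhausted.
interface SketchStream {
    String next();
}

// Source stage (plays the role of a Tokenizer): splits text on whitespace.
final class SketchTokenizer implements SketchStream {
    private final String[] parts;
    private int pos = 0;

    SketchTokenizer(String text) {
        String trimmed = text.trim();
        this.parts = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    @Override
    public String next() {
        return pos < parts.length ? parts[pos++] : null;
    }
}

// Filter stage that rewrites each token (plays the role of LowerCaseFilter).
final class SketchLowerCase implements SketchStream {
    private final SketchStream input;

    SketchLowerCase(SketchStream input) {
        this.input = input;
    }

    @Override
    public String next() {
        String token = input.next();
        return token == null ? null : token.toLowerCase(Locale.ROOT);
    }
}

// Filter stage that may remove tokens (plays the role of StopFilter).
final class SketchStop implements SketchStream {
    private final SketchStream input;
    private final Set<String> stopWords;

    SketchStop(SketchStream input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }

    @Override
    public String next() {
        String token;
        // Skip tokens until one is not a stop word or the input ends.
        while ((token = input.next()) != null) {
            if (!stopWords.contains(token)) {
                return token;
            }
        }
        return null;
    }
}

// Builds the chain tokenizer -> lower-case -> stop and drains it into a list.
final class SketchChain {
    static List<String> analyze(String text, Set<String> stopWords) {
        SketchStream chain =
            new SketchStop(new SketchLowerCase(new SketchTokenizer(text)), stopWords);
        List<String> tokens = new ArrayList<>();
        for (String t = chain.next(); t != null; t = chain.next()) {
            tokens.add(t);
        }
        return tokens;
    }
}
```

With a stop set containing only `the`, `SketchChain.analyze("The Quick brown THE fox", stopWords)` returns `[quick, brown, fox]`. The order of the stages matters: lower-casing runs before stop-word removal so that `The` and `THE` both match the stop set.
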
Modifier and Type | Field and Description |
---|---|
protected TokenStream | TokenFilter.input: The source of tokens for this filter. |
protected TokenStream | Analyzer.TokenStreamComponents.sink: Sink TokenStream, such as the outer TokenFilter decorating the chain. |
Modifier and Type | Method and Description |
---|---|
TokenStream | Analyzer.TokenStreamComponents.getTokenStream(): Returns the sink TokenStream. |
protected TokenStream | Analyzer.normalize(String fieldName, TokenStream in): Wrap the given TokenStream in order to apply normalization filters. |
protected TokenStream | AnalyzerWrapper.normalize(String fieldName, TokenStream in) |
TokenStream | Analyzer.tokenStream(String fieldName, Reader reader): Returns a TokenStream suitable for fieldName, tokenizing the contents of reader. |
TokenStream | Analyzer.tokenStream(String fieldName, String text): Returns a TokenStream suitable for fieldName, tokenizing the contents of text. |
protected TokenStream | DelegatingAnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in) |
protected TokenStream | AnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in): Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. |
Modifier and Type | Method and Description |
---|---|
protected TokenStream | Analyzer.normalize(String fieldName, TokenStream in): Wrap the given TokenStream in order to apply normalization filters. |
protected TokenStream | AnalyzerWrapper.normalize(String fieldName, TokenStream in) |
Automaton | TokenStreamToAutomaton.toAutomaton(TokenStream in): Pulls the graph (including PositionLengthAttribute) from the provided TokenStream, and creates the corresponding automaton where arcs are bytes (or Unicode code points if unicodeArcs = true) from each term. |
protected TokenStream | DelegatingAnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in) |
protected TokenStream | AnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in): Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. |
Constructor and Description |
---|
CachingTokenFilter(TokenStream input): Create a new CachingTokenFilter around input. |
FilteringTokenFilter(TokenStream in): Create a new FilteringTokenFilter. |
GraphTokenFilter(TokenStream input): Create a new GraphTokenFilter. |
LowerCaseFilter(TokenStream in): Create a new LowerCaseFilter, that normalizes token text to lower case. |
StopFilter(TokenStream in, CharArraySet stopWords): Constructs a filter which removes words from the input TokenStream that are named in the Set. |
TokenFilter(TokenStream input): Construct a token stream filtering the given input. |
TokenStreamComponents(Consumer<Reader> source, TokenStream result): Creates a new Analyzer.TokenStreamComponents instance. |
TokenStreamComponents(Tokenizer tokenizer, TokenStream result): Creates a new Analyzer.TokenStreamComponents instance. |
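
CachingTokenFilter is described in this page as a wrapper for streams whose tokens are "intended to be consumed more than once". The sketch below illustrates that idea in plain Java rather than with the Lucene class. `ReplayableTokens` is a hypothetical name, and the sketch assumes, as a simplification, that the whole input is buffered on the first read and that `reset()` simply rewinds to the start of the buffer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper that lets a one-shot token source be read repeatedly.
final class ReplayableTokens {
    private final Iterator<String> input; // one-shot source, drained at most once
    private List<String> cache;           // filled lazily on the first call to next()
    private int pos = 0;

    ReplayableTokens(Iterator<String> input) {
        this.input = input;
    }

    // Returns the next token, or null at the end of the stream.
    String next() {
        if (cache == null) {
            // First consumption: drain the underlying source into the cache.
            cache = new ArrayList<>();
            while (input.hasNext()) {
                cache.add(input.next());
            }
        }
        return pos < cache.size() ? cache.get(pos++) : null;
    }

    // Rewinds to the first cached token; the underlying source is not re-read.
    void reset() {
        pos = 0;
    }
}
```

After the first pass reaches `null`, calling `reset()` and reading again returns the same tokens in the same order, even though the underlying iterator is already exhausted.
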
Modifier and Type | Class and Description |
---|---|
class | StandardTokenizer: A grammar-based tokenizer constructed with JFlex. |
Modifier and Type | Method and Description |
---|---|
protected TokenStream | StandardAnalyzer.normalize(String fieldName, TokenStream in) |
Modifier and Type | Method and Description |
---|---|
protected TokenStream | StandardAnalyzer.normalize(String fieldName, TokenStream in) |
Modifier and Type | Method and Description |
---|---|
TokenStream | StoredFieldsWriter.MergeVisitor.tokenStream(Analyzer analyzer, TokenStream reuse) |
Modifier and Type | Method and Description |
---|---|
TokenStream | StoredFieldsWriter.MergeVisitor.tokenStream(Analyzer analyzer, TokenStream reuse) |
Modifier and Type | Field and Description |
---|---|
protected TokenStream | Field.tokenStream: Pre-analyzed TokenStream for indexed fields. This is separate from fieldsData because a field is allowed to have both; for example, a field may have a String value while you customize how it is tokenized. |
Modifier and Type | Method and Description |
---|---|
TokenStream | Field.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | FeatureField.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | Field.tokenStreamValue(): The TokenStream for this field to be used when indexing, or null. |
Modifier and Type | Method and Description |
---|---|
void | Field.setTokenStream(TokenStream tokenStream): Expert: sets the token stream to be used for indexing and causes isIndexed() and isTokenized() to return true. |
TokenStream | Field.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | FeatureField.tokenStream(Analyzer analyzer, TokenStream reuse) |
Constructor and Description |
---|
Field(String name, TokenStream tokenStream, IndexableFieldType type): Create field with TokenStream value. |
TextField(String name, TokenStream stream): Creates a new un-stored TextField with TokenStream value. |
Modifier and Type | Method and Description |
---|---|
TokenStream | IndexableField.tokenStream(Analyzer analyzer, TokenStream reuse): Creates the TokenStream used for indexing this field. |
Modifier and Type | Method and Description |
---|---|
TokenStream | IndexableField.tokenStream(Analyzer analyzer, TokenStream reuse): Creates the TokenStream used for indexing this field. |
Modifier and Type | Method and Description |
---|---|
protected Query | QueryBuilder.analyzeBoolean(String field, TokenStream stream): Creates a simple boolean query from the cached TokenStream contents. |
protected Query | QueryBuilder.analyzeGraphBoolean(String field, TokenStream source, BooleanClause.Occur operator): Creates a boolean query from a graph token stream. |
protected Query | QueryBuilder.analyzeGraphPhrase(TokenStream source, String field, int phraseSlop): Creates a graph phrase query from the TokenStream contents. |
protected Query | QueryBuilder.analyzeMultiBoolean(String field, TokenStream stream, BooleanClause.Occur operator): Creates a complex boolean query from the cached TokenStream contents. |
protected Query | QueryBuilder.analyzeMultiPhrase(String field, TokenStream stream, int slop): Creates a complex phrase query from the cached TokenStream contents. |
protected Query | QueryBuilder.analyzePhrase(String field, TokenStream stream, int slop): Creates a simple phrase query from the cached TokenStream contents. |
protected Query | QueryBuilder.analyzeTerm(String field, TokenStream stream): Creates a simple term query from the cached TokenStream contents. |
protected Query | QueryBuilder.createFieldQuery(TokenStream source, BooleanClause.Occur operator, String field, boolean quoted, int phraseSlop): Creates a query from a token stream. |
protected SpanQuery | QueryBuilder.createSpanQuery(TokenStream in, String field): Creates a span query from the TokenStream. |
Modifier and Type | Method and Description |
---|---|
Iterator<TokenStream> | GraphTokenStreamFiniteStrings.getFiniteStrings(): Get all finite strings from the automaton. |
Iterator<TokenStream> | GraphTokenStreamFiniteStrings.getFiniteStrings(int startState, int endState): Get all finite strings that start at startState and end at endState. |
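
"Get all finite strings that start at startState and end at endState" amounts to enumerating every path between two states of a token graph. The sketch below shows that enumeration in plain Java as a depth-first walk. It is not the Lucene implementation: `TokenGraphSketch` and its methods are hypothetical, the graph is assumed to be acyclic, and paths are returned as lists of terms rather than as an `Iterator<TokenStream>`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token graph: states are ints, each arc carries one term.
final class TokenGraphSketch {
    private static final class Arc {
        final int from;
        final int to;
        final String term;

        Arc(int from, int to, String term) {
            this.from = from;
            this.to = to;
            this.term = term;
        }
    }

    private final List<Arc> arcs = new ArrayList<>();

    // Adds an arc and returns this graph so calls can be chained.
    TokenGraphSketch addArc(int from, int to, String term) {
        arcs.add(new Arc(from, to, term));
        return this;
    }

    // Returns every term sequence leading from startState to endState.
    // Assumes the graph has no cycles; otherwise the walk would not terminate.
    List<List<String>> finiteStrings(int startState, int endState) {
        List<List<String>> result = new ArrayList<>();
        walk(startState, endState, new ArrayList<>(), result);
        return result;
    }

    private void walk(int state, int endState, List<String> path, List<List<String>> result) {
        if (state == endState) {
            result.add(new ArrayList<>(path)); // copy, because path is reused
            return;
        }
        for (Arc arc : arcs) {
            if (arc.from == state) {
                path.add(arc.term);
                walk(arc.to, endState, path, result);
                path.remove(path.size() - 1); // backtrack
            }
        }
    }
}
```

For a graph in which `wifi` spans states 0 to 2, `wi` and `fi` cover the same span through state 1, and `network` leads from 2 to 3, `finiteStrings(0, 3)` returns `[[wifi, network], [wi, fi, network]]`. This is the shape a multi-word synonym typically gives a token stream.
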
Constructor and Description |
---|
GraphTokenStreamFiniteStrings(TokenStream in) |
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.