Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
org.apache.lucene.codecs | Codecs API: API for customization of the encoding and structure of the index. |
org.apache.lucene.document | The logical representation of a Document for indexing and searching. |
org.apache.lucene.index | Code to maintain and access indices. |
org.apache.lucene.util | Some utility classes. |
org.apache.lucene.util.graph | Utility classes for working with token streams as graphs. |
Modifier and Type | Class and Description |
---|---|
class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class | FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | StopFilter: Removes stop words from a token stream. |
class | TokenFilter: A TokenFilter is a TokenStream whose input is another TokenStream. |
class | Tokenizer: A Tokenizer is a TokenStream whose input is a Reader. |
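The subclasses above compose into analysis chains: a Tokenizer produces tokens from a Reader, and TokenFilters wrap it to transform or drop them. A minimal sketch of such a chain, assuming Lucene 7+ core on the classpath (where LowerCaseFilter, StopFilter, and CharArraySet live in org.apache.lucene.analysis); the stop-word list here is purely illustrative:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class ChainSketch {
  /** Runs text through StandardTokenizer -> LowerCaseFilter -> StopFilter. */
  public static List<String> tokenize(String text, List<String> stopWords) throws IOException {
    Tokenizer source = new StandardTokenizer();
    source.setReader(new StringReader(text));
    TokenStream chain = new StopFilter(new LowerCaseFilter(source),
                                       new CharArraySet(stopWords, true));
    List<String> tokens = new ArrayList<>();
    CharTermAttribute term = chain.addAttribute(CharTermAttribute.class);
    chain.reset();                      // required before the first incrementToken()
    while (chain.incrementToken()) {
      tokens.add(term.toString());
    }
    chain.end();
    chain.close();
    return tokens;
  }
}
```

The nesting order matters: lower-casing before stop filtering means the stop set only needs lower-case entries (the `true` flag additionally makes the set case-insensitive).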
Modifier and Type | Field and Description |
---|---|
protected TokenStream | TokenFilter.input: The source of tokens for this filter. |
protected TokenStream | Analyzer.TokenStreamComponents.sink: The sink TokenStream, such as the outer TokenFilter decorating the chain. |
Modifier and Type | Method and Description |
---|---|
TokenStream | Analyzer.TokenStreamComponents.getTokenStream(): Returns the sink TokenStream. |
protected TokenStream | Analyzer.normalize(String fieldName, TokenStream in): Wrap the given TokenStream in order to apply normalization filters. |
protected TokenStream | AnalyzerWrapper.normalize(String fieldName, TokenStream in) |
TokenStream | Analyzer.tokenStream(String fieldName, Reader reader): Returns a TokenStream suitable for fieldName, tokenizing the contents of reader. |
TokenStream | Analyzer.tokenStream(String fieldName, String text): Returns a TokenStream suitable for fieldName, tokenizing the contents of text. |
protected TokenStream | DelegatingAnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in) |
protected TokenStream | AnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in): Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. |
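Analyzer.tokenStream() is the usual way to obtain a stream, and its result must be consumed under the standard workflow: reset(), then an incrementToken() loop, then end() and close(). A sketch, assuming StandardAnalyzer is available and using an illustrative field name:

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class ConsumeSketch {
  /** Counts the tokens the analyzer emits for the given field and text. */
  public static int countTokens(Analyzer analyzer, String field, String text) throws IOException {
    int n = 0;
    // The returned stream is reused across calls, so the consumer protocol
    // (reset -> incrementToken loop -> end -> close) must be followed exactly.
    try (TokenStream ts = analyzer.tokenStream(field, text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        n++;
      }
      ts.end();
    }
    return n;
  }
}
```

The try-with-resources block covers close(); skipping reset() or close() is a common source of IllegalStateException or stream-reuse bugs.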
Modifier and Type | Method and Description |
---|---|
protected TokenStream | Analyzer.normalize(String fieldName, TokenStream in): Wrap the given TokenStream in order to apply normalization filters. |
protected TokenStream | AnalyzerWrapper.normalize(String fieldName, TokenStream in) |
Automaton | TokenStreamToAutomaton.toAutomaton(TokenStream in): Pulls the graph (including PositionLengthAttribute) from the provided TokenStream, and creates the corresponding automaton where arcs are bytes (or Unicode code points if unicodeArcs = true) from each term. |
protected TokenStream | DelegatingAnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in) |
protected TokenStream | AnalyzerWrapper.wrapTokenStreamForNormalization(String fieldName, TokenStream in): Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. |
Constructor and Description |
---|
CachingTokenFilter(TokenStream input): Create a new CachingTokenFilter around input. |
FilteringTokenFilter(TokenStream in): Create a new FilteringTokenFilter. |
LowerCaseFilter(TokenStream in): Create a new LowerCaseFilter that normalizes token text to lower case. |
StopFilter(TokenStream in, CharArraySet stopWords): Constructs a filter which removes words from the input TokenStream that are named in the Set. |
TokenFilter(TokenStream input): Construct a token stream filtering the given input. |
TokenStreamComponents(Tokenizer source, TokenStream result): Creates a new Analyzer.TokenStreamComponents instance. |
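Of these constructors, CachingTokenFilter has the least obvious mechanics: the first pass over the filter records the wrapped stream's states, and later passes replay the cache. A sketch of consuming the same stream twice, under the assumption (matching my reading of the Lucene 7 sources) that reset() is forwarded to the inner stream only before the cache is filled:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.CachingTokenFilter;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class CachingSketch {
  /** Consumes one underlying token stream twice via CachingTokenFilter. */
  public static List<String> twoPasses(String text) throws IOException {
    Tokenizer source = new StandardTokenizer();
    source.setReader(new StringReader(text));
    try (CachingTokenFilter cached = new CachingTokenFilter(source)) {
      CharTermAttribute term = cached.addAttribute(CharTermAttribute.class);
      List<String> seen = new ArrayList<>();
      for (int pass = 0; pass < 2; pass++) {
        cached.reset();               // pass 1 fills the cache, pass 2 replays it
        while (cached.incrementToken()) {
          seen.add(term.toString());
        }
      }
      cached.end();
      return seen;
    }
  }
}
```

Without the caching wrapper, a plain TokenStream generally cannot be consumed a second time.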
Modifier and Type | Class and Description |
---|---|
class | StandardFilter: Deprecated. StandardFilter is a no-op and can be removed from code. |
class | StandardTokenizer: A grammar-based tokenizer constructed with JFlex. |
Modifier and Type | Method and Description |
---|---|
protected TokenStream | StandardAnalyzer.normalize(String fieldName, TokenStream in) |
Constructor and Description |
---|
StandardFilter(TokenStream in): Deprecated. Sole constructor. |
Modifier and Type | Method and Description |
---|---|
TokenStream | StoredFieldsWriter.MergeVisitor.tokenStream(Analyzer analyzer, TokenStream reuse) |
Modifier and Type | Field and Description |
---|---|
protected TokenStream | Field.tokenStream: Pre-analyzed tokenStream for indexed fields; this is separate from fieldsData because you are allowed to have both; e.g. the field may have a String value but you customize how it is tokenized. |
Modifier and Type | Method and Description |
---|---|
TokenStream | Field.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | FeatureField.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | Field.tokenStreamValue(): The TokenStream for this field to be used when indexing, or null. |
Modifier and Type | Method and Description |
---|---|
void | Field.setTokenStream(TokenStream tokenStream): Expert: sets the token stream to be used for indexing and causes isIndexed() and isTokenized() to return true. |
TokenStream | Field.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | FeatureField.tokenStream(Analyzer analyzer, TokenStream reuse) |
Constructor and Description |
---|
Field(String name, TokenStream tokenStream, IndexableFieldType type): Create field with TokenStream value. |
TextField(String name, TokenStream stream): Creates a new un-stored TextField with TokenStream value. |
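These TokenStream-valued constructors let a field carry a pre-analyzed stream instead of raw text, so the indexer consumes the stream directly. A sketch of building such a field (the field name "body" is illustrative):

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;

public class PreAnalyzedSketch {
  /** Builds an un-stored TextField whose value is a token stream, not a String. */
  public static TextField buildField(String text) {
    Tokenizer pre = new StandardTokenizer();
    pre.setReader(new StringReader(text));
    // The field holds the stream itself: tokenStreamValue() exposes it,
    // while stringValue() stays null because there is no string payload.
    TextField body = new TextField("body", pre);
    Document doc = new Document();   // typical usage: add the field to a Document
    doc.add(body);
    return body;
  }
}
```

This mirrors the Field.tokenStream field documented above: a pre-analyzed stream is kept separately from fieldsData.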
Modifier and Type | Method and Description |
---|---|
TokenStream | IndexableField.tokenStream(Analyzer analyzer, TokenStream reuse): Creates the TokenStream used for indexing this field. |
Modifier and Type | Method and Description |
---|---|
protected Query | QueryBuilder.analyzeBoolean(String field, TokenStream stream): Creates a simple boolean query from the cached tokenstream contents. |
protected Query | QueryBuilder.analyzeGraphBoolean(String field, TokenStream source, BooleanClause.Occur operator): Creates a boolean query from a graph token stream. |
protected Query | QueryBuilder.analyzeGraphPhrase(TokenStream source, String field, int phraseSlop): Creates a graph phrase query from the tokenstream contents. |
protected Query | QueryBuilder.analyzeMultiBoolean(String field, TokenStream stream, BooleanClause.Occur operator): Creates a complex boolean query from the cached tokenstream contents. |
protected Query | QueryBuilder.analyzeMultiPhrase(String field, TokenStream stream, int slop): Creates a complex phrase query from the cached tokenstream contents. |
protected Query | QueryBuilder.analyzePhrase(String field, TokenStream stream, int slop): Creates a simple phrase query from the cached tokenstream contents. |
protected Query | QueryBuilder.analyzeTerm(String field, TokenStream stream): Creates a simple term query from the cached tokenstream contents. |
protected Query | QueryBuilder.createFieldQuery(TokenStream source, BooleanClause.Occur operator, String field, boolean quoted, int phraseSlop): Creates a query from a token stream. |
protected SpanQuery | QueryBuilder.createSpanQuery(TokenStream in, String field): Creates a span query from the tokenstream. |
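The analyze*() methods above are protected hooks; in normal use you go through QueryBuilder's public entry points, which analyze the text, cache the resulting token stream, and dispatch to those hooks. A sketch, assuming StandardAnalyzer and an illustrative field name:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.QueryBuilder;

public class QueryBuilderSketch {
  private static final QueryBuilder BUILDER = new QueryBuilder(new StandardAnalyzer());

  /** Multiple analyzed terms become a BooleanQuery of SHOULD term clauses. */
  public static Query booleanFor(String field, String text) {
    return BUILDER.createBooleanQuery(field, text);
  }

  /** The same text as a PhraseQuery over consecutive analyzed positions. */
  public static Query phraseFor(String field, String text) {
    return BUILDER.createPhraseQuery(field, text);
  }
}
```

Because the builder runs the stream through createFieldQuery(), the shape of the result depends on the analysis: a single surviving term yields a TermQuery, multiple terms a BooleanQuery or PhraseQuery, and graph-producing filters route through the analyzeGraph*() variants.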
Modifier and Type | Method and Description |
---|---|
Iterator<TokenStream> | GraphTokenStreamFiniteStrings.getFiniteStrings(): Get all finite strings from the automaton. |
Iterator<TokenStream> | GraphTokenStreamFiniteStrings.getFiniteStrings(int startState, int endState): Get all finite strings that start at startState and end at endState. |

Constructor and Description |
---|
GraphTokenStreamFiniteStrings(TokenStream in) |
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.