Uses of Class
org.apache.lucene.analysis.TokenStream

Packages that use TokenStream
org.apache.lucene.analysis API and code to convert text into indexable/searchable tokens. 
org.apache.lucene.analysis.standard The org.apache.lucene.analysis.standard package contains three fast grammar-based tokenizers constructed with JFlex: 
org.apache.lucene.collation CollationKeyFilter converts each token into its binary CollationKey using the provided Collator, and then encodes the CollationKey as a String using IndexableBinaryStringTools, allowing it to be stored as an index term. 
org.apache.lucene.document The logical representation of a Document for indexing and searching. 
 

Uses of TokenStream in org.apache.lucene.analysis
 

Subclasses of TokenStream in org.apache.lucene.analysis
 class ASCIIFoldingFilter
          This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.
 class CachingTokenFilter
          This class can be used if the token attributes of a TokenStream are intended to be consumed more than once.
 class CharTokenizer
          An abstract base class for simple, character-oriented tokenizers.
 class FilteringTokenFilter
          Abstract base class for TokenFilters that may remove tokens.
 class ISOLatin1AccentFilter
          Deprecated. If you build a new index, use ASCIIFoldingFilter which covers a superset of Latin 1. This class is included for use with existing indexes and will be removed in a future release (possibly Lucene 4.0).
 class KeywordMarkerFilter
          Marks terms as keywords via the KeywordAttribute.
 class KeywordTokenizer
          Emits the entire input as a single token.
 class LengthFilter
          Removes words that are too long or too short from the stream.
 class LetterTokenizer
          A LetterTokenizer is a tokenizer that divides text at non-letters.
 class LimitTokenCountFilter
          This TokenFilter limits the number of tokens while indexing.
 class LowerCaseFilter
          Normalizes token text to lower case.
 class LowerCaseTokenizer
          LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
 class NumericTokenStream
          Expert: This class provides a TokenStream for indexing numeric values that can be used by NumericRangeQuery or NumericRangeFilter.
 class PorterStemFilter
          Transforms the token stream as per the Porter stemming algorithm.
 class StopFilter
          Removes stop words from a token stream.
 class TeeSinkTokenFilter
          This TokenFilter provides the ability to set aside attribute states that have already been analyzed.
static class TeeSinkTokenFilter.SinkTokenStream
           
 class TokenFilter
          A TokenFilter is a TokenStream whose input is another TokenStream.
 class Tokenizer
          A Tokenizer is a TokenStream whose input is a Reader.
 class WhitespaceTokenizer
          A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
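These subclasses compose into chains: a Tokenizer produces the stream and TokenFilters wrap it. As an illustrative sketch (not from this page, assuming Lucene 3.1's Version-aware constructors), a CachingTokenFilter lets the same tokens be consumed more than once:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.CachingTokenFilter;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class CachingSketch {
    public static void main(String[] args) throws Exception {
        // Tokenizer -> TokenFilter chain: split on whitespace, then lower-case.
        TokenStream chain = new LowerCaseFilter(Version.LUCENE_31,
                new WhitespaceTokenizer(Version.LUCENE_31, new StringReader("Foo BAR Baz")));
        // Cache the attribute states so the stream can be replayed.
        CachingTokenFilter cached = new CachingTokenFilter(chain);
        CharTermAttribute term = cached.addAttribute(CharTermAttribute.class);

        while (cached.incrementToken()) {          // first pass fills the cache
            System.out.println("first pass: " + term.toString());
        }
        cached.reset();                            // replay from the cache
        while (cached.incrementToken()) {
            System.out.println("second pass: " + term.toString());
        }
        cached.close();
    }
}
```

The class name `CachingSketch` and the sample text are made up for illustration; the cache is filled lazily on the first pass, so `reset()` afterwards rewinds over the cached states rather than re-tokenizing.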
 

Fields in org.apache.lucene.analysis declared as TokenStream
protected  TokenStream TokenFilter.input
          The source of tokens for this filter.
protected  TokenStream ReusableAnalyzerBase.TokenStreamComponents.sink
           
 

Methods in org.apache.lucene.analysis that return TokenStream
protected  TokenStream ReusableAnalyzerBase.TokenStreamComponents.getTokenStream()
          Returns the sink TokenStream.
 TokenStream ReusableAnalyzerBase.reusableTokenStream(String fieldName, Reader reader)
          This method uses ReusableAnalyzerBase.createComponents(String, Reader) to obtain an instance of ReusableAnalyzerBase.TokenStreamComponents.
 TokenStream PerFieldAnalyzerWrapper.reusableTokenStream(String fieldName, Reader reader)
           
 TokenStream LimitTokenCountAnalyzer.reusableTokenStream(String fieldName, Reader reader)
           
 TokenStream Analyzer.reusableTokenStream(String fieldName, Reader reader)
          Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method.
 TokenStream ReusableAnalyzerBase.tokenStream(String fieldName, Reader reader)
          This method uses ReusableAnalyzerBase.createComponents(String, Reader) to obtain an instance of ReusableAnalyzerBase.TokenStreamComponents and returns the sink of the components.
 TokenStream PerFieldAnalyzerWrapper.tokenStream(String fieldName, Reader reader)
           
 TokenStream LimitTokenCountAnalyzer.tokenStream(String fieldName, Reader reader)
           
abstract  TokenStream Analyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
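Whichever of these methods produces the stream, the consumer side follows the same cycle. A minimal sketch (assuming Lucene 3.1; the analyzer choice and field name are arbitrary):

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.util.Version;

public class ConsumeSketch {
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_31);
        // reusableTokenStream may hand back the same instance per thread.
        TokenStream stream = analyzer.reusableTokenStream("body",
                new StringReader("quick brown fox"));
        CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
        OffsetAttribute offset = stream.addAttribute(OffsetAttribute.class);

        stream.reset();                       // prepare the stream for consumption
        while (stream.incrementToken()) {     // advance token by token
            System.out.println(term.toString()
                    + " [" + offset.startOffset() + "," + offset.endOffset() + "]");
        }
        stream.end();                         // record the final offset state
        stream.close();
    }
}
```

The reset/incrementToken/end/close sequence is the standard TokenStream consumer workflow; attributes are fetched once up front and updated in place on each `incrementToken()`.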
 

Constructors in org.apache.lucene.analysis with parameters of type TokenStream
ASCIIFoldingFilter(TokenStream input)
           
CachingTokenFilter(TokenStream input)
           
FilteringTokenFilter(boolean enablePositionIncrements, TokenStream input)
           
ISOLatin1AccentFilter(TokenStream input)
          Deprecated.  
KeywordMarkerFilter(TokenStream in, CharArraySet keywordSet)
          Create a new KeywordMarkerFilter that marks the current token as a keyword, via the KeywordAttribute, if the token's term buffer is contained in the given set.
KeywordMarkerFilter(TokenStream in, Set<?> keywordSet)
          Create a new KeywordMarkerFilter that marks the current token as a keyword, via the KeywordAttribute, if the token's term buffer is contained in the given set.
LengthFilter(boolean enablePositionIncrements, TokenStream in, int min, int max)
          Build a filter that removes words that are too long or too short from the text.
LengthFilter(TokenStream in, int min, int max)
          Deprecated. Use LengthFilter.LengthFilter(boolean, TokenStream, int, int) instead.
LimitTokenCountFilter(TokenStream in, int maxTokenCount)
          Build a filter that only accepts tokens up to a maximum number.
LowerCaseFilter(TokenStream in)
          Deprecated. Use LowerCaseFilter.LowerCaseFilter(Version, TokenStream) instead.
LowerCaseFilter(Version matchVersion, TokenStream in)
          Create a new LowerCaseFilter, that normalizes token text to lower case.
PorterStemFilter(TokenStream in)
           
ReusableAnalyzerBase.TokenStreamComponents(Tokenizer source, TokenStream result)
          Creates a new ReusableAnalyzerBase.TokenStreamComponents instance.
StopFilter(boolean enablePositionIncrements, TokenStream in, Set<?> stopWords)
          Deprecated. Use StopFilter.StopFilter(Version, TokenStream, Set) instead.
StopFilter(boolean enablePositionIncrements, TokenStream input, Set<?> stopWords, boolean ignoreCase)
          Deprecated. Use StopFilter.StopFilter(Version, TokenStream, Set, boolean) instead.
StopFilter(Version matchVersion, TokenStream in, Set<?> stopWords)
          Constructs a filter which removes words from the input TokenStream that are named in the Set.
StopFilter(Version matchVersion, TokenStream input, Set<?> stopWords, boolean ignoreCase)
          Construct a token stream filtering the given input.
TeeSinkTokenFilter(TokenStream input)
          Instantiates a new TeeSinkTokenFilter.
TokenFilter(TokenStream input)
          Construct a token stream filtering the given input.
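As the constructors above suggest, each filter takes the stream it consumes as an argument, so chains are built inside-out. A hedged sketch using the non-deprecated Version-aware constructors (the sample text and Version.LUCENE_31 are assumptions):

```java
import java.io.StringReader;

import org.apache.lucene.analysis.LengthFilter;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class ChainSketch {
    public static void main(String[] args) throws Exception {
        TokenStream stream = new WhitespaceTokenizer(Version.LUCENE_31,
                new StringReader("The Quick Brown Fox"));
        stream = new LowerCaseFilter(Version.LUCENE_31, stream);
        // Drop common English stop words ("the"), keeping position increments.
        stream = new StopFilter(Version.LUCENE_31, stream,
                StopAnalyzer.ENGLISH_STOP_WORDS_SET);
        // Keep only tokens between 3 and 10 characters long.
        stream = new LengthFilter(true, stream, 3, 10);

        CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
        while (stream.incrementToken()) {
            System.out.println(term.toString());
        }
        stream.close();
    }
}
```

Each wrapper is itself a TokenStream, which is why the whole chain can be handed to any consumer that expects one.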
 

Uses of TokenStream in org.apache.lucene.analysis.standard
 

Subclasses of TokenStream in org.apache.lucene.analysis.standard
 class ClassicFilter
          Normalizes tokens extracted with ClassicTokenizer.
 class ClassicTokenizer
          A grammar-based tokenizer constructed with JFlex.
 class StandardFilter
          Normalizes tokens extracted with StandardTokenizer.
 class StandardTokenizer
          A grammar-based tokenizer constructed with JFlex.
 class UAX29URLEmailTokenizer
          This class implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs.
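The usual pairing in this package is a grammar-based tokenizer wrapped by its matching filter. A brief sketch (assuming Lucene 3.1; the input text is invented):

```java
import java.io.StringReader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class StandardSketch {
    public static void main(String[] args) throws Exception {
        // StandardTokenizer splits per the JFlex grammar; StandardFilter
        // normalizes the resulting tokens (e.g. possessive trimming).
        TokenStream stream = new StandardFilter(Version.LUCENE_31,
                new StandardTokenizer(Version.LUCENE_31,
                        new StringReader("O'Neill's laptop, I.B.M. stock")));
        CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
        while (stream.incrementToken()) {
            System.out.println(term.toString());
        }
        stream.close();
    }
}
```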
 

Constructors in org.apache.lucene.analysis.standard with parameters of type TokenStream
ClassicFilter(TokenStream in)
          Constructs a ClassicFilter that filters the given input stream.
StandardFilter(TokenStream in)
          Deprecated. Use StandardFilter.StandardFilter(Version, TokenStream) instead.
StandardFilter(Version matchVersion, TokenStream in)
           
 

Uses of TokenStream in org.apache.lucene.collation
 

Subclasses of TokenStream in org.apache.lucene.collation
 class CollationKeyFilter
           Converts each token into its CollationKey, and then encodes the CollationKey with IndexableBinaryStringTools, to allow it to be stored as an index term.
 

Methods in org.apache.lucene.collation that return TokenStream
 TokenStream CollationKeyAnalyzer.reusableTokenStream(String fieldName, Reader reader)
           
 TokenStream CollationKeyAnalyzer.tokenStream(String fieldName, Reader reader)
           
 

Constructors in org.apache.lucene.collation with parameters of type TokenStream
CollationKeyFilter(TokenStream input, Collator collator)
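A hedged sketch of wiring this constructor up (the locale and input text are arbitrary choices for illustration): the filter replaces each token's text with the binary-encoded CollationKey produced by the supplied Collator, so the encoded form can be indexed as a term.

```java
import java.io.StringReader;
import java.text.Collator;
import java.util.Locale;

import org.apache.lucene.analysis.KeywordTokenizer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.collation.CollationKeyFilter;

public class CollationSketch {
    public static void main(String[] args) throws Exception {
        Collator collator = Collator.getInstance(Locale.FRENCH);
        // Emit the whole input as one token, then replace its text with
        // the String-encoded CollationKey.
        TokenStream stream = new CollationKeyFilter(
                new KeywordTokenizer(new StringReader("résumé")), collator);
        CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
        while (stream.incrementToken()) {
            System.out.println(term.toString());   // encoded, not human-readable
        }
        stream.close();
    }
}
```

Because the Collator is locale-sensitive, the same Collator (or an equivalent one) must be used at query time for range queries and sorting to agree with the indexed terms.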
           
 

Uses of TokenStream in org.apache.lucene.document
 

Fields in org.apache.lucene.document declared as TokenStream
protected  TokenStream AbstractField.tokenStream
           
 

Methods in org.apache.lucene.document that return TokenStream
 TokenStream NumericField.tokenStreamValue()
          Returns a NumericTokenStream for indexing the numeric value.
 TokenStream Fieldable.tokenStreamValue()
          The TokenStream for this field to be used when indexing, or null.
 TokenStream Field.tokenStreamValue()
          The TokenStream for this field to be used when indexing, or null.
 

Methods in org.apache.lucene.document with parameters of type TokenStream
 void Field.setTokenStream(TokenStream tokenStream)
          Expert: sets the token stream to be used for indexing and causes isIndexed() and isTokenized() to return true.
 

Constructors in org.apache.lucene.document with parameters of type TokenStream
Field(String name, TokenStream tokenStream)
          Create a tokenized and indexed field that is not stored.
Field(String name, TokenStream tokenStream, Field.TermVector termVector)
          Create a tokenized and indexed field that is not stored, optionally with storing term vectors.
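A short sketch of these constructors in use (assuming Lucene 3.1; the field name and text are invented): a pre-built TokenStream is handed directly to the Field, and Lucene pulls tokens from it at indexing time instead of running an Analyzer over stored text.

```java
import java.io.StringReader;

import org.apache.lucene.analysis.LowerCaseTokenizer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.util.Version;

public class FieldSketch {
    public static void main(String[] args) {
        Document doc = new Document();
        // Tokenized and indexed, but not stored: the field's only content
        // is whatever the supplied stream produces.
        doc.add(new Field("contents",
                new LowerCaseTokenizer(Version.LUCENE_31,
                        new StringReader("Hello World"))));
        System.out.println(doc);
    }
}
```

Since the field is not stored, the original text cannot be retrieved from the index; only the tokens emitted by the stream are searchable.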
 



Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.