Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
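
To give a feel for what word-break tokenization produces, here is a minimal sketch that uses only the JDK's `java.text.BreakIterator`. It is not StandardTokenizer and does not call any Lucene API; the class name `WordBreakSketch` and the method `tokenize` are hypothetical, and the JDK's word-boundary analysis is not guaranteed to agree with StandardTokenizer on every input.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: JDK word boundaries, not Lucene's StandardTokenizer.
public class WordBreakSketch {

    /** Splits text at word boundaries and keeps segments that contain a letter or digit. */
    static List<String> tokenize(String text) {
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(text);

        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String segment = text.substring(start, end);
            // Whitespace and punctuation also come back as segments; skip them.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(segment);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("The Quick brown fox"));
    }
}
```

The sketch returns the word segments in order and leaves their case untouched; case normalization is a separate step in an analysis chain.
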
Modifier and Type | Class and Description |
---|---|
class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class | FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens. |
class | LowerCaseFilter: Normalizes token text to lower case. |
class | StopFilter: Removes stop words from a token stream. |
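
The effect of chaining a lower-casing step and a stop-word step can be sketched in plain Java. This is a conceptual illustration under stated assumptions, not the Lucene implementation: it works on a `List<String>` rather than a TokenStream, and the names `FilterChainSketch`, `lowerCase`, `removeStopWords`, and `analyze` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: mimics the behavior described for
// LowerCaseFilter and StopFilter without using any Lucene classes.
public class FilterChainSketch {

    /** Normalizes every token to lower case. */
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** Drops every token that appears in the stop-word set. */
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    /** Lower-cases first so that "The" matches the lower-case stop word "the". */
    static List<String> analyze(List<String> tokens, Set<String> stopWords) {
        return removeStopWords(lowerCase(tokens), stopWords);
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a"));
        System.out.println(analyze(Arrays.asList("The", "Quick", "Fox"), stopWords));
    }
}
```

In this sketch the order of the two steps matters: with a lower-case stop-word set, removing stop words before lower-casing would let "The" through.
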
Modifier and Type | Class and Description |
---|---|
class | StandardFilter: Normalizes tokens extracted with StandardTokenizer. |
Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.