| Package | Description |
|---|---|
| org.apache.lucene.analysis | Text analysis. |
| org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
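
To give a feel for what word-break tokenization produces, here is a minimal sketch that uses the JDK's `java.text.BreakIterator` rather than Lucene. This is an assumption-laden stand-in: `WordBreakSketch` and `words` are hypothetical names, and the JDK's word-boundary rules are not guaranteed to match those of StandardTokenizer.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this is NOT Lucene's StandardTokenizer.
final class WordBreakSketch {
    /** Splits text at JDK word boundaries and keeps segments that contain a letter or digit. */
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Drop segments made up only of whitespace or punctuation.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }
}
```

For example, `WordBreakSketch.words("Hello, world 42")` yields the three segments `Hello`, `world`, and `42`.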
| Modifier and Type | Class and Description |
|---|---|
| class | CachingTokenFilter: This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
| class | FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens. |
| class | LowerCaseFilter: Normalizes token text to lower case. |
| class | StopFilter: Removes stop words from a token stream. |
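
The filters listed above are typically stacked, each wrapping the stream before it. The following plain-Java sketch models that chaining idea only. It does not use the Lucene API: `TokenFilterSketch`, `TokenSource`, and every method name here are hypothetical, and the whitespace splitter is a deliberately crude substitute for a real tokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of a filter chain; none of these types belong to Lucene.
final class TokenFilterSketch {
    /** Stand-in for a token stream: returns the next token, or null when exhausted. */
    interface TokenSource {
        String next();
    }

    /** Splits on whitespace; far simpler than a grammar-based tokenizer. */
    static TokenSource whitespaceTokens(String text) {
        String trimmed = text.trim();
        Iterator<String> it = trimmed.isEmpty()
                ? Collections.<String>emptyIterator()
                : Arrays.asList(trimmed.split("\\s+")).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Models the role of a lower-casing filter: rewrites each token, removes none. */
    static TokenSource lowerCase(TokenSource in) {
        return () -> {
            String t = in.next();
            return t == null ? null : t.toLowerCase(Locale.ROOT);
        };
    }

    /** Models the role of a stop-word filter: skips tokens found in stopWords. */
    static TokenSource removeStopWords(TokenSource in, Set<String> stopWords) {
        return () -> {
            String t = in.next();
            while (t != null && stopWords.contains(t)) {
                t = in.next();
            }
            return t;
        };
    }

    /** Runs tokenizer -> lower-case -> stop-word removal and collects the result. */
    static List<String> analyze(String text, Set<String> stopWords) {
        TokenSource chain = removeStopWords(lowerCase(whitespaceTokens(text)), stopWords);
        List<String> out = new ArrayList<>();
        for (String t = chain.next(); t != null; t = chain.next()) {
            out.add(t);
        }
        return out;
    }
}
```

Order matters in this sketch: lower-casing runs before stop-word removal, so a stop set containing only `the` also removes `The` and `THE`.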
| Modifier and Type | Class and Description |
|---|---|
| class | StandardFilter: Normalizes tokens extracted with StandardTokenizer. |
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.