Package | Description |
---|---|
org.apache.lucene.analysis.charfilter | Normalization of text before the tokenizer. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams. |
org.apache.lucene.analysis.pattern | Set of components for pattern-based (regex) analysis. |

Modifier and Type | Class and Description |
---|---|
class | HTMLStripCharFilter: A CharFilter that wraps another Reader and attempts to strip out HTML constructs. |
class | MappingCharFilter: Simplistic CharFilter that applies the mappings contained in a NormalizeCharMap to the character stream and corrects the resulting changes to the offsets. |
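
To illustrate what "correcting the resulting changes to the offsets" means, here is a minimal, library-free sketch. It is not Lucene's implementation: the class `MappingSketch` and its methods are hypothetical names invented for this example. It applies the longest matching mapping at each position and records how far each output offset has drifted from the original text, so that an offset in the filtered text can be mapped back to the source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Lucene.
final class MappingSketch {
    private final Map<String, String> mappings = new LinkedHashMap<>();
    // Each entry is {outputOffset, cumulativeDiff}: from outputOffset onward,
    // originalOffset = outputOffset + cumulativeDiff.
    private final List<int[]> corrections = new ArrayList<>();

    MappingSketch add(String match, String replacement) {
        if (match.isEmpty()) {
            throw new IllegalArgumentException("match must not be empty");
        }
        mappings.put(match, replacement);
        return this;
    }

    // Applies the longest matching mapping at each position and records
    // an offset correction whenever a replacement changes the length.
    String apply(String input) {
        corrections.clear();
        StringBuilder out = new StringBuilder();
        int diff = 0;
        int i = 0;
        while (i < input.length()) {
            String best = null;
            for (String key : mappings.keySet()) {
                if (input.startsWith(key, i) && (best == null || key.length() > best.length())) {
                    best = key;
                }
            }
            if (best == null) {
                out.append(input.charAt(i++));
                continue;
            }
            String replacement = mappings.get(best);
            out.append(replacement);
            i += best.length();
            int delta = best.length() - replacement.length();
            if (delta != 0) {
                diff += delta;
                corrections.add(new int[] {out.length(), diff});
            }
        }
        return out.toString();
    }

    // Maps an offset in the filtered text back to an offset in the original text.
    int correctOffset(int outputOffset) {
        int diff = 0;
        for (int[] c : corrections) {
            if (c[0] > outputOffset) {
                break;
            }
            diff = c[1];
        }
        return outputOffset + diff;
    }

    public static void main(String[] args) {
        MappingSketch sketch = new MappingSketch().add("&amp;", "&");
        System.out.println(sketch.apply("a&amp;b"));
        System.out.println(sketch.correctOffset(2));
    }
}
```

In this sketch, offsets that fall inside a replacement that is longer than its match are only approximate; a production filter has to handle that case more carefully.
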
Modifier and Type | Class and Description |
---|---|
class | CJKWidthCharFilter: A CharFilter that normalizes CJK width differences. It folds fullwidth ASCII variants into the equivalent Basic Latin and folds halfwidth Katakana variants into the equivalent kana. |
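
The two folding rules above can be sketched with the JDK alone. This is an illustration of the idea, not the Lucene class: `WidthFoldingSketch` is a hypothetical name, the code ranges are taken from the Unicode Halfwidth and Fullwidth Forms block, and using NFKC normalization for the Katakana runs is an assumption chosen for brevity. Unlike a CharFilter, this sketch works on a whole string and does not track offsets.

```java
import java.text.Normalizer;

// Hypothetical illustration class; not part of Lucene.
final class WidthFoldingSketch {

    // Folds fullwidth ASCII variants (U+FF01..U+FF5E) to Basic Latin and
    // halfwidth Katakana variants (U+FF65..U+FF9F) to regular kana.
    static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char ch = input.charAt(i);
            if (ch >= 0xFF01 && ch <= 0xFF5E) {
                // Fullwidth ASCII sits at a fixed distance from Basic Latin.
                out.append((char) (ch - 0xFEE0));
                i++;
            } else if (isHalfwidthKatakana(ch)) {
                // Normalize a whole run at once so that a kana followed by a
                // halfwidth voiced sound mark composes into a single character.
                int start = i;
                while (i < input.length() && isHalfwidthKatakana(input.charAt(i))) {
                    i++;
                }
                out.append(Normalizer.normalize(input.substring(start, i), Normalizer.Form.NFKC));
            } else {
                out.append(ch);
                i++;
            }
        }
        return out.toString();
    }

    private static boolean isHalfwidthKatakana(char ch) {
        return ch >= 0xFF65 && ch <= 0xFF9F;
    }

    public static void main(String[] args) {
        // Fullwidth "ABC123" written with Unicode escapes.
        System.out.println(fold("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13"));
    }
}
```
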
Modifier and Type | Class and Description |
---|---|
class | PatternReplaceCharFilter: CharFilter that replaces text matching a regular expression with a replacement string. |
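
As a rough picture of regex-based replacement on a character stream, the following sketch uses only `java.util.regex` and `java.io`. It is not the Lucene class: `PatternReplaceSketch` and its `apply` method are hypothetical names, and the sketch buffers the whole input and returns a plain string instead of exposing a Reader with corrected offsets.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Lucene.
final class PatternReplaceSketch {

    // Reads the whole stream, then replaces every match of the pattern.
    // The replacement may use group references such as $1.
    static String apply(Pattern pattern, String replacement, Reader in) {
        StringBuilder buffer = new StringBuilder();
        char[] chunk = new char[1024];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pattern.matcher(buffer).replaceAll(replacement);
    }

    public static void main(String[] args) {
        Pattern phone = Pattern.compile("(\\d{3})-(\\d{4})");
        System.out.println(apply(phone, "$1$2", new StringReader("call 555-1234")));
    }
}
```

Applying the replacement before tokenization means the tokenizer only ever sees the rewritten text, which is why such a step is placed in a char filter and not in a token filter.
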
Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.