Packages that use TokenFilter | |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). |
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.compound | A filter that decomposes compound words you find in many Germanic languages into the word parts. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.fa | Analyzer for Persian. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens. |
org.apache.lucene.analysis.position | Filter for assigning position increments. |
org.apache.lucene.analysis.reverse | Filter to reverse token text. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.shingle | Word n-gram filters. |
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. |
org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.collation | CollationKeyFilter and ICUCollationKeyFilter convert each token into its binary CollationKey using the provided Collator, and then encode the CollationKey as a String using IndexableBinaryStringTools, to allow it to be stored as an index term. |
org.apache.lucene.wordnet | This package uses synonyms defined by WordNet. |
Uses of TokenFilter in org.apache.lucene.analysis |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis | |
---|---|
class |
ASCIIFoldingFilter
This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists. |
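As a rough illustration of the folding ASCIIFoldingFilter performs, here is a plain-Java sketch (not using Lucene) that strips accents via java.text.Normalizer. Note this is only an approximation: the real filter uses an explicit mapping table covering far more characters (ligatures, symbols, punctuation variants) than simple diacritic removal.

```java
import java.text.Normalizer;

public class AsciiFoldSketch {
    // Decompose to NFD, then strip combining marks (Unicode category M).
    // ASCIIFoldingFilter itself handles many more mappings than this.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("café résumé")); // cafe resume
    }
}
```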
class |
CachingTokenFilter
This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class |
ISOLatin1AccentFilter
Deprecated. If you build a new index, use ASCIIFoldingFilter
which covers a superset of Latin 1.
This class is included for use with existing
indexes and will be removed in a future release (possibly Lucene 4.0). |
class |
LengthFilter
Removes words that are too long or too short from the stream. |
class |
LowerCaseFilter
Normalizes token text to lower case. |
class |
PorterStemFilter
Transforms the token stream as per the Porter stemming algorithm. |
class |
StopFilter
Removes stop words from a token stream. |
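The effect of StopFilter on a token sequence can be sketched in plain Java as filtering tokens against a stop set. This omits details of the real filter, which operates on a TokenStream and can also adjust position increments to record the gaps left by removed words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StopFilterSketch {
    // Drop any token that appears in the stop-word set.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        return tokens.stream()
                     .filter(t -> !stopWords.contains(t))
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "a", "of"));
        System.out.println(removeStopWords(
                Arrays.asList("the", "quick", "fox"), stops)); // [quick, fox]
    }
}
```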
class |
TeeSinkTokenFilter
This TokenFilter provides the ability to set aside attribute states that have already been analyzed. |
Uses of TokenFilter in org.apache.lucene.analysis.ar |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.ar | |
---|---|
class |
ArabicNormalizationFilter
A TokenFilter that applies ArabicNormalizer to normalize the orthography. |
class |
ArabicStemFilter
A TokenFilter that applies ArabicStemmer to stem Arabic words. |
Uses of TokenFilter in org.apache.lucene.analysis.br |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.br | |
---|---|
class |
BrazilianStemFilter
A TokenFilter that applies BrazilianStemmer . |
Uses of TokenFilter in org.apache.lucene.analysis.cn |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.cn | |
---|---|
class |
ChineseFilter
A TokenFilter with a stop word table. |
Uses of TokenFilter in org.apache.lucene.analysis.cn.smart |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.cn.smart | |
---|---|
class |
WordTokenFilter
A TokenFilter that breaks sentences into words. |
Uses of TokenFilter in org.apache.lucene.analysis.compound |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.compound | |
---|---|
class |
CompoundWordTokenFilterBase
Base class for decomposition token filters. |
class |
DictionaryCompoundWordTokenFilter
A TokenFilter that decomposes compound words found in many Germanic languages. |
class |
HyphenationCompoundWordTokenFilter
A TokenFilter that decomposes compound words found in many Germanic languages. |
Uses of TokenFilter in org.apache.lucene.analysis.de |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.de | |
---|---|
class |
GermanStemFilter
A TokenFilter that stems German words. |
Uses of TokenFilter in org.apache.lucene.analysis.el |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.el | |
---|---|
class |
GreekLowerCaseFilter
Normalizes token text to lower case, removes some Greek diacritics, and standardizes final sigma to sigma. |
Uses of TokenFilter in org.apache.lucene.analysis.fa |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.fa | |
---|---|
class |
PersianNormalizationFilter
A TokenFilter that applies PersianNormalizer to normalize the orthography. |
Uses of TokenFilter in org.apache.lucene.analysis.fr |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.fr | |
---|---|
class |
ElisionFilter
Removes elisions from a TokenStream . |
class |
FrenchStemFilter
A TokenFilter that stems French words. |
Uses of TokenFilter in org.apache.lucene.analysis.ngram |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.ngram | |
---|---|
class |
EdgeNGramTokenFilter
Tokenizes the given token into n-grams of given size(s). |
class |
NGramTokenFilter
Tokenizes the input into n-grams of the given size(s). |
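The output of these two filters for a single token can be sketched in plain Java: NGramTokenFilter slides a fixed-size window across the token, while EdgeNGramTokenFilter keeps prefixes (or suffixes) anchored at one edge. This sketch shows front-anchored edge grams only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NGramSketch {
    // All character n-grams of a fixed size, sliding across the token.
    static List<String> nGrams(String token, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= token.length(); i++) {
            grams.add(token.substring(i, i + n));
        }
        return grams;
    }

    // Edge n-grams anchored at the front of the token, from minGram to maxGram.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= token.length(); n++) {
            grams.add(token.substring(0, n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(nGrams("fox", 2));        // [fo, ox]
        System.out.println(edgeNGrams("fox", 1, 3)); // [f, fo, fox]
    }
}
```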
Uses of TokenFilter in org.apache.lucene.analysis.nl |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.nl | |
---|---|
class |
DutchStemFilter
A TokenFilter that stems Dutch words. |
Uses of TokenFilter in org.apache.lucene.analysis.payloads |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.payloads | |
---|---|
class |
DelimitedPayloadTokenFilter
Characters before the delimiter are the "token", those after are the payload. |
class |
NumericPayloadTokenFilter
Assigns a payload to a token based on the Token.type(). |
class |
TokenOffsetPayloadTokenFilter
Stores the token's Token.setStartOffset(int) and Token.setEndOffset(int) values as its payload: the first 4 bytes are the start offset and the last 4 bytes are the end offset. |
class |
TypeAsPayloadTokenFilter
Makes the Token.type() a payload. |
Uses of TokenFilter in org.apache.lucene.analysis.position |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.position | |
---|---|
class |
PositionFilter
Sets the position increment of all tokens to a configured value, except the first token, which retains its original position increment. |
Uses of TokenFilter in org.apache.lucene.analysis.reverse |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.reverse | |
---|---|
class |
ReverseStringFilter
Reverse token string, for example "country" => "yrtnuoc". |
Uses of TokenFilter in org.apache.lucene.analysis.ru |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.ru | |
---|---|
class |
RussianLowerCaseFilter
Deprecated. Use LowerCaseFilter instead, which has the same functionality. This filter will be removed in Lucene 4.0. |
class |
RussianStemFilter
A TokenFilter that stems Russian words. |
Uses of TokenFilter in org.apache.lucene.analysis.shingle |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.shingle | |
---|---|
class |
ShingleFilter
A ShingleFilter constructs shingles (token n-grams) from a token stream. |
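The core transformation can be sketched in plain Java as joining adjacent tokens into fixed-size word n-grams. This is a simplification: the real ShingleFilter operates on a TokenStream and by default also emits the original unigrams alongside the shingles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShingleSketch {
    // Join each run of `size` adjacent tokens into one shingle.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(shingles(Arrays.asList("please", "divide", "this"), 2));
        // [please divide, divide this]
    }
}
```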
Uses of TokenFilter in org.apache.lucene.analysis.snowball |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.snowball | |
---|---|
class |
SnowballFilter
A filter that stems words using a Snowball-generated stemmer. |
Uses of TokenFilter in org.apache.lucene.analysis.standard |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.standard | |
---|---|
class |
StandardFilter
Normalizes tokens extracted with StandardTokenizer . |
Uses of TokenFilter in org.apache.lucene.analysis.th |
---|
Subclasses of TokenFilter in org.apache.lucene.analysis.th | |
---|---|
class |
ThaiWordFilter
A TokenFilter that uses BreakIterator to break each Token that is Thai into separate Tokens for each Thai word. |
Uses of TokenFilter in org.apache.lucene.collation |
---|
Subclasses of TokenFilter in org.apache.lucene.collation | |
---|---|
class |
CollationKeyFilter
Converts each token into its CollationKey, and then encodes the CollationKey with IndexableBinaryStringTools, to allow it to be stored as an index term. |
class |
ICUCollationKeyFilter
Converts each token into its CollationKey, and then encodes the CollationKey with IndexableBinaryStringTools, to allow it to be stored as an index term. |
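Why index collation keys at all? Byte-wise term order in the index does not match locale-sensitive order, but the bytes of a java.text.CollationKey do. The sketch below (plain JDK, not Lucene) shows a case where binary String order and collated order disagree; CollationKeyFilter fixes range queries and sorting by indexing the key bytes instead of the raw text.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationKeySketch {
    // Locale-aware "sorts before", comparing the same CollationKeys whose
    // bytes CollationKeyFilter would store as the indexed term.
    static boolean sortsBefore(String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        return collator.getCollationKey(a).compareTo(collator.getCollationKey(b)) < 0;
    }

    public static void main(String[] args) {
        // Binary order puts "côte" after "coz" ('ô' > 'o' by code point),
        // but the collator compares base letters first ('t' < 'z'),
        // so "côte" collates before "coz".
        System.out.println(sortsBefore("côte", "coz"));  // true
        System.out.println("côte".compareTo("coz") > 0); // true
    }
}
```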
Uses of TokenFilter in org.apache.lucene.wordnet |
---|
Subclasses of TokenFilter in org.apache.lucene.wordnet | |
---|---|
class |
SynonymTokenFilter
Injects additional tokens for synonyms of token terms fetched from the underlying child stream; the child stream must deliver lowercase tokens for synonyms to be found. |