Packages | |
---|---|
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.bg | Analyzer for Bulgarian. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.ca | Analyzer for Catalan. |
org.apache.lucene.analysis.charfilter | Normalization of text before the tokenizer. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams. |
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). |
org.apache.lucene.analysis.commongrams | Construct n-grams for frequently occurring terms and phrases. |
org.apache.lucene.analysis.compound | A filter that decomposes compound words, which are common in many Germanic languages, into their word parts. |
org.apache.lucene.analysis.compound.hyphenation | The code for the compound word hyphenation is taken from the Apache FOP project. |
org.apache.lucene.analysis.core | Basic, general-purpose analysis components. |
org.apache.lucene.analysis.cz | Analyzer for Czech. |
org.apache.lucene.analysis.da | Analyzer for Danish. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.en | Analyzer for English. |
org.apache.lucene.analysis.es | Analyzer for Spanish. |
org.apache.lucene.analysis.eu | Analyzer for Basque. |
org.apache.lucene.analysis.fa | Analyzer for Persian. |
org.apache.lucene.analysis.fi | Analyzer for Finnish. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.ga | Analyzer for Irish. |
org.apache.lucene.analysis.gl | Analyzer for Galician. |
org.apache.lucene.analysis.hi | Analyzer for Hindi. |
org.apache.lucene.analysis.hu | Analyzer for Hungarian. |
org.apache.lucene.analysis.hunspell | Stemming TokenFilter using a Java implementation of the Hunspell stemming algorithm. |
org.apache.lucene.analysis.hy | Analyzer for Armenian. |
org.apache.lucene.analysis.id | Analyzer for Indonesian. |
org.apache.lucene.analysis.in | Analysis components for Indian languages. |
org.apache.lucene.analysis.it | Analyzer for Italian. |
org.apache.lucene.analysis.lv | Analyzer for Latvian. |
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams. |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.no | Analyzer for Norwegian. |
org.apache.lucene.analysis.path | Analysis components for path-like strings such as filenames. |
org.apache.lucene.analysis.pattern | Set of components for pattern-based (regex) analysis. |
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens. |
org.apache.lucene.analysis.position | Filter for assigning position increments. |
org.apache.lucene.analysis.pt | Analyzer for Portuguese. |
org.apache.lucene.analysis.query | Automatically filter high-frequency stopwords. |
org.apache.lucene.analysis.reverse | Filter to reverse token text. |
org.apache.lucene.analysis.ro | Analyzer for Romanian. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.shingle | Word n-gram filters. |
org.apache.lucene.analysis.sinks | TeeSinkTokenFilter and implementations of TeeSinkTokenFilter.SinkFilter that might be useful. |
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizers. |
org.apache.lucene.analysis.standard.std31 | Backwards-compatible implementation to match Version.LUCENE_31. |
org.apache.lucene.analysis.standard.std34 | Backwards-compatible implementation to match Version.LUCENE_34. |
org.apache.lucene.analysis.standard.std36 | Backwards-compatible implementation to match Version.LUCENE_36. |
org.apache.lucene.analysis.sv | Analyzer for Swedish. |
org.apache.lucene.analysis.synonym | Analysis components for Synonyms. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.analysis.tr | Analyzer for Turkish. |
org.apache.lucene.analysis.util | Utility functions for text analysis. |
org.apache.lucene.analysis.wikipedia | Tokenizer that is aware of Wikipedia syntax. |
org.apache.lucene.collation | Unicode collation support. |
org.apache.lucene.collation.tokenattributes | Custom AttributeImpl for indexing collation keys as index terms. |
org.tartarus.snowball | Snowball stemmer API. |
org.tartarus.snowball.ext | Autogenerated snowball stemmer implementations. |
Analyzers for indexing content in different languages and domains.
For an introduction to Lucene's analysis API, see the org.apache.lucene.analysis
package documentation.
This module contains concrete components (CharFilters, Tokenizers, and
TokenFilters) for analyzing different types of content. It also provides
a number of Analyzers for different languages that you can use to get
started quickly.
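The division of labor between the three component types can be sketched in plain Java. This is an illustration only, not Lucene's API: the class `AnalysisChainSketch`, its methods, and the stop-word list are hypothetical names chosen for this example, and each stage is reduced to a simple string operation. It assumes a chain in which one step normalizes the raw text (the CharFilter role), one splits it into tokens (the Tokenizer role), and one transforms or drops tokens (the TokenFilter role).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only: none of these names are Lucene classes.
public class AnalysisChainSketch {

    // Hypothetical stop-word list, chosen for this example.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "and", "of"));

    // Stage 1 (CharFilter role): normalize the raw text before tokenization.
    // Here: replace anything that looks like a markup tag with a space.
    static String stripMarkup(String text) {
        return text.replaceAll("<[^>]*>", " ");
    }

    // Stage 2 (Tokenizer role): split the character stream into tokens.
    // Here: split on runs of characters that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Stage 3 (TokenFilter role): modify or remove tokens.
    // Here: lowercase every token and drop stop words.
    static List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                kept.add(lower);
            }
        }
        return kept;
    }

    // The full chain: character filtering, then tokenizing, then token filtering.
    static List<String> analyze(String text) {
        return filter(tokenize(stripMarkup(text)));
    }

    public static void main(String[] args) {
        // prints [analysis, text]
        System.out.println(analyze("<b>The</b> Analysis of Text"));
    }
}
```

The language-specific Analyzers listed above bundle a chain of this shape with choices suited to one language, which is why they are a convenient starting point.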