Lucene 4.0.0 analyzers-common API

Analyzers for indexing content in different languages and domains.


Packages
org.apache.lucene.analysis.ar
Analyzer for Arabic.
org.apache.lucene.analysis.bg
Analyzer for Bulgarian.
org.apache.lucene.analysis.br
Analyzer for Brazilian Portuguese.
org.apache.lucene.analysis.ca
Analyzer for Catalan.
org.apache.lucene.analysis.charfilter
Normalization of text before the tokenizer.
org.apache.lucene.analysis.cjk
Analyzer for Chinese, Japanese, and Korean, which indexes bigrams.
org.apache.lucene.analysis.cn
Analyzer for Chinese, which indexes unigrams (individual Chinese characters).
org.apache.lucene.analysis.commongrams
Construct n-grams for frequently occurring terms and phrases.
org.apache.lucene.analysis.compound
A filter that decomposes the compound words found in many Germanic languages into their constituent parts.
org.apache.lucene.analysis.compound.hyphenation
The code for the compound word hyphenation is taken from the Apache FOP project.
org.apache.lucene.analysis.core
Basic, general-purpose analysis components.
org.apache.lucene.analysis.cz
Analyzer for Czech.
org.apache.lucene.analysis.da
Analyzer for Danish.
org.apache.lucene.analysis.de
Analyzer for German.
org.apache.lucene.analysis.el
Analyzer for Greek.
org.apache.lucene.analysis.en
Analyzer for English.
org.apache.lucene.analysis.es
Analyzer for Spanish.
org.apache.lucene.analysis.eu
Analyzer for Basque.
org.apache.lucene.analysis.fa
Analyzer for Persian.
org.apache.lucene.analysis.fi
Analyzer for Finnish.
org.apache.lucene.analysis.fr
Analyzer for French.
org.apache.lucene.analysis.ga
Analyzer for Irish.
org.apache.lucene.analysis.gl
Analyzer for Galician.
org.apache.lucene.analysis.hi
Analyzer for Hindi.
org.apache.lucene.analysis.hu
Analyzer for Hungarian.
org.apache.lucene.analysis.hunspell
Stemming TokenFilter using a Java implementation of the Hunspell stemming algorithm.
org.apache.lucene.analysis.hy
Analyzer for Armenian.
org.apache.lucene.analysis.id
Analyzer for Indonesian.
org.apache.lucene.analysis.in
Analysis components for Indian languages.
org.apache.lucene.analysis.it
Analyzer for Italian.
org.apache.lucene.analysis.lv
Analyzer for Latvian.
org.apache.lucene.analysis.miscellaneous
Miscellaneous TokenStream and TokenFilter implementations.
org.apache.lucene.analysis.ngram
Character n-gram tokenizers and filters.
org.apache.lucene.analysis.nl
Analyzer for Dutch.
org.apache.lucene.analysis.no
Analyzer for Norwegian.
org.apache.lucene.analysis.path
Analysis components for path-like strings such as filenames.
org.apache.lucene.analysis.pattern
Set of components for pattern-based (regex) analysis.
org.apache.lucene.analysis.payloads
Provides various convenience classes for creating payloads on Tokens.
org.apache.lucene.analysis.position
Filter for assigning position increments.
org.apache.lucene.analysis.pt
Analyzer for Portuguese.
org.apache.lucene.analysis.query
Automatically filter high-frequency stopwords.
org.apache.lucene.analysis.reverse
Filter to reverse token text.
org.apache.lucene.analysis.ro
Analyzer for Romanian.
org.apache.lucene.analysis.ru
Analyzer for Russian.
org.apache.lucene.analysis.shingle
Word n-gram (shingle) filters.
org.apache.lucene.analysis.sinks
Implementations of the SinkTokenizer that might be useful.
org.apache.lucene.analysis.snowball
TokenFilter and Analyzer implementations that use Snowball stemmers.
org.apache.lucene.analysis.standard
Fast, general-purpose grammar-based tokenizers.
org.apache.lucene.analysis.standard.std31
Backwards-compatible implementation to match Version.LUCENE_31.
org.apache.lucene.analysis.standard.std34
Backwards-compatible implementation to match Version.LUCENE_34.
org.apache.lucene.analysis.standard.std36
Backwards-compatible implementation to match Version.LUCENE_36.
org.apache.lucene.analysis.sv
Analyzer for Swedish.
org.apache.lucene.analysis.synonym
Analysis components for Synonyms.
org.apache.lucene.analysis.th
Analyzer for Thai.
org.apache.lucene.analysis.tr
Analyzer for Turkish.
org.apache.lucene.analysis.util
Utility functions for text analysis.
org.apache.lucene.analysis.wikipedia
Tokenizer that is aware of Wikipedia syntax.
org.apache.lucene.collation
Unicode collation support.
org.apache.lucene.collation.tokenattributes
Custom AttributeImpl for indexing collation keys as index terms.
org.tartarus.snowball
Snowball stemmer API.
org.tartarus.snowball.ext
Autogenerated snowball stemmer implementations.
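The language-specific analyzers listed above all share the same consumption pattern: obtain a TokenStream from the Analyzer, attach a CharTermAttribute, then iterate with reset()/incrementToken()/end()/close(). A minimal sketch, assuming the Lucene 4.0 analyzers-common and core jars are on the classpath (the field name "body" and the sample text are illustrative only):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class AnalyzeDemo {
  // Collects the terms an Analyzer produces for the given text,
  // following the TokenStream workflow: reset, incrementToken, end, close.
  static List<String> analyze(Analyzer analyzer, String text) throws IOException {
    List<String> terms = new ArrayList<String>();
    TokenStream ts = analyzer.tokenStream("body", new StringReader(text));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    ts.reset();                          // must be called before incrementToken()
    while (ts.incrementToken()) {
      terms.add(term.toString());
    }
    ts.end();
    ts.close();
    return terms;
  }

  public static void main(String[] args) throws IOException {
    // EnglishAnalyzer (org.apache.lucene.analysis.en) lower-cases,
    // removes English stopwords, and applies Porter stemming.
    Analyzer analyzer = new EnglishAnalyzer(Version.LUCENE_40);
    System.out.println(analyze(analyzer, "The Quick Brown Foxes"));
  }
}
```

Any analyzer from the packages above (GermanAnalyzer, ThaiAnalyzer, and so on) can be swapped in for EnglishAnalyzer; the iteration code is unchanged.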

Copyright © 2000-2012 Apache Software Foundation. All Rights Reserved.