Lucene 3.3.0 contrib-analyzers API

Packages

org.apache.lucene.analysis.ar Analyzer for Arabic.
org.apache.lucene.analysis.bg Analyzer for Bulgarian.
org.apache.lucene.analysis.br Analyzer for Brazilian Portuguese.
org.apache.lucene.analysis.ca Analyzer for Catalan.
org.apache.lucene.analysis.cjk Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters).
org.apache.lucene.analysis.cn Analyzer for Chinese, which indexes unigrams (individual Chinese characters).
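
For the cjk package above, a minimal sketch of running CJKAnalyzer over a short string; the field name "body" and the sample text are illustrative only.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.cjk.CJKAnalyzer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.util.Version;

    public class CjkDemo {
      public static void main(String[] args) throws Exception {
        // CJKAnalyzer emits overlapping bigrams for runs of Han characters.
        CJKAnalyzer analyzer = new CJKAnalyzer(Version.LUCENE_33);
        TokenStream ts = analyzer.tokenStream("body", new StringReader("我爱北京"));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
          System.out.println(term.toString()); // 我爱, 爱北, 北京
        }
        ts.close();
      }
    }
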
org.apache.lucene.analysis.compound A filter that decomposes compound words found in many Germanic languages into their word parts.
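
A minimal sketch of dictionary-based decompounding with DictionaryCompoundWordTokenFilter from the compound package, assuming the 3.x String[] dictionary constructor; the three-word inline dictionary is a toy stand-in for a real word list.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceTokenizer;
    import org.apache.lucene.analysis.compound.DictionaryCompoundWordTokenFilter;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.util.Version;

    public class CompoundDemo {
      public static void main(String[] args) throws Exception {
        // Toy dictionary; real use would load a full list of simple words.
        String[] dictionary = { "fuss", "ball", "verein" };
        TokenStream ts = new DictionaryCompoundWordTokenFilter(
            Version.LUCENE_33,
            new WhitespaceTokenizer(Version.LUCENE_33, new StringReader("fussballverein")),
            dictionary);
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        while (ts.incrementToken()) {
          // The original token is kept and the matched subwords are added
          // at the same position: fussballverein, fuss, ball, verein
          System.out.println(term.toString());
        }
        ts.close();
      }
    }
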
org.apache.lucene.analysis.compound.hyphenation The code for the compound word hyphenation is taken from the Apache FOP project.
org.apache.lucene.analysis.cz Analyzer for Czech.
org.apache.lucene.analysis.da Analyzer for Danish.
org.apache.lucene.analysis.de Analyzer for German.
org.apache.lucene.analysis.el Analyzer for Greek.
org.apache.lucene.analysis.en Analyzer for English.
org.apache.lucene.analysis.es Analyzer for Spanish.
org.apache.lucene.analysis.eu Analyzer for Basque.
org.apache.lucene.analysis.fa Analyzer for Persian.
org.apache.lucene.analysis.fi Analyzer for Finnish.
org.apache.lucene.analysis.fr Analyzer for French.
org.apache.lucene.analysis.gl Analyzer for Galician.
org.apache.lucene.analysis.hi Analyzer for Hindi.
org.apache.lucene.analysis.hu Analyzer for Hungarian.
org.apache.lucene.analysis.hy Analyzer for Armenian.
org.apache.lucene.analysis.id Analyzer for Indonesian.
org.apache.lucene.analysis.in Analysis components for Indian languages.
org.apache.lucene.analysis.it Analyzer for Italian.
org.apache.lucene.analysis.lv Analyzer for Latvian.
org.apache.lucene.analysis.miscellaneous Miscellaneous TokenStreams.
org.apache.lucene.analysis.ngram Character n-gram tokenizers and filters.
org.apache.lucene.analysis.nl Analyzer for Dutch.
org.apache.lucene.analysis.no Analyzer for Norwegian.
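
For the ngram package above, a sketch of character n-gram tokenization; the input "cat" and the gram sizes 2 to 3 are arbitrary.

    import java.io.StringReader;

    import org.apache.lucene.analysis.ngram.NGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class NGramDemo {
      public static void main(String[] args) throws Exception {
        // Emit all character bigrams and trigrams of the input.
        NGramTokenizer tokenizer = new NGramTokenizer(new StringReader("cat"), 2, 3);
        CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
        while (tokenizer.incrementToken()) {
          System.out.println(term.toString()); // ca, at, cat
        }
        tokenizer.close();
      }
    }
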
org.apache.lucene.analysis.payloads Provides various convenience classes for creating payloads on Tokens.
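
A sketch of one convenience class from the payloads package, DelimitedPayloadTokenFilter, which splits each token on a delimiter and stores the trailing part as a payload; the "term|weight" input shown simply follows the filter's delimiter convention with a FloatEncoder.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceTokenizer;
    import org.apache.lucene.analysis.payloads.DelimitedPayloadTokenFilter;
    import org.apache.lucene.analysis.payloads.FloatEncoder;
    import org.apache.lucene.analysis.payloads.PayloadHelper;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
    import org.apache.lucene.util.Version;

    public class PayloadDemo {
      public static void main(String[] args) throws Exception {
        // Each token carries "term|weight"; the float after '|' becomes a payload.
        TokenStream ts = new DelimitedPayloadTokenFilter(
            new WhitespaceTokenizer(Version.LUCENE_33, new StringReader("hello|2.0 world|0.5")),
            '|', new FloatEncoder());
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        PayloadAttribute payload = ts.addAttribute(PayloadAttribute.class);
        while (ts.incrementToken()) {
          float weight = PayloadHelper.decodeFloat(payload.getPayload().getData());
          System.out.println(term.toString() + " -> " + weight); // hello -> 2.0, world -> 0.5
        }
        ts.close();
      }
    }
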
org.apache.lucene.analysis.position Filter for assigning position increments.
org.apache.lucene.analysis.pt Analyzer for Portuguese.
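
For the position package above, a sketch of PositionFilter, which by default gives every token after the first a position increment of zero so that all tokens stack at a single position.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceTokenizer;
    import org.apache.lucene.analysis.position.PositionFilter;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
    import org.apache.lucene.util.Version;

    public class PositionDemo {
      public static void main(String[] args) throws Exception {
        // All tokens after the first get a position increment of 0.
        TokenStream ts = new PositionFilter(
            new WhitespaceTokenizer(Version.LUCENE_33, new StringReader("one two three")));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        PositionIncrementAttribute posIncr = ts.addAttribute(PositionIncrementAttribute.class);
        while (ts.incrementToken()) {
          // one +1, two +0, three +0
          System.out.println(term.toString() + " +" + posIncr.getPositionIncrement());
        }
        ts.close();
      }
    }
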
org.apache.lucene.analysis.query Automatically filter high-frequency stopwords.
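
A sketch of how QueryAutoStopWordAnalyzer from the query package might be wired up; the wrapped StandardAnalyzer, the 40% threshold, and the already-open IndexReader are all illustrative choices.

    import org.apache.lucene.analysis.query.QueryAutoStopWordAnalyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.util.Version;

    public class StopWordDemo {
      // "reader" is assumed to be an IndexReader opened on an existing index.
      static QueryAutoStopWordAnalyzer wrap(IndexReader reader) throws Exception {
        QueryAutoStopWordAnalyzer analyzer = new QueryAutoStopWordAnalyzer(
            Version.LUCENE_33, new StandardAnalyzer(Version.LUCENE_33));
        // Terms occurring in more than 40% of documents become query-time stopwords.
        int count = analyzer.addStopWords(reader, 0.4f);
        System.out.println("Dropping " + count + " high-frequency terms from queries");
        return analyzer;
      }
    }
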
org.apache.lucene.analysis.reverse Filter to reverse token text.
org.apache.lucene.analysis.ro Analyzer for Romanian.
org.apache.lucene.analysis.ru Analyzer for Russian.
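
For the reverse package above, a sketch of ReverseStringFilter; reversed tokens are typically indexed to make leading-wildcard queries cheap, though that indexing setup is not shown here.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceTokenizer;
    import org.apache.lucene.analysis.reverse.ReverseStringFilter;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.util.Version;

    public class ReverseDemo {
      public static void main(String[] args) throws Exception {
        // Reverse the characters of each token.
        TokenStream ts = new ReverseStringFilter(Version.LUCENE_33,
            new WhitespaceTokenizer(Version.LUCENE_33, new StringReader("apache lucene")));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        while (ts.incrementToken()) {
          System.out.println(term.toString()); // ehcapa, enecul
        }
        ts.close();
      }
    }
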
org.apache.lucene.analysis.shingle Word n-gram filters.
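
A sketch of ShingleFilter from the shingle package, configured for word bigrams; by default the original unigrams are emitted alongside the shingles.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceTokenizer;
    import org.apache.lucene.analysis.shingle.ShingleFilter;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.util.Version;

    public class ShingleDemo {
      public static void main(String[] args) throws Exception {
        // Build word bigrams ("shingles") in addition to the single words.
        TokenStream ts = new ShingleFilter(
            new WhitespaceTokenizer(Version.LUCENE_33,
                new StringReader("please divide this sentence")), 2);
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        while (ts.incrementToken()) {
          // please, "please divide", divide, "divide this",
          // this, "this sentence", sentence
          System.out.println(term.toString());
        }
        ts.close();
      }
    }
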
org.apache.lucene.analysis.sinks Implementations of the SinkTokenizer that might be useful.
org.apache.lucene.analysis.snowball TokenFilter and Analyzer implementations that use Snowball stemmers.
org.apache.lucene.analysis.sv Analyzer for Swedish.
org.apache.lucene.analysis.th Analyzer for Thai.
org.apache.lucene.analysis.tr Analyzer for Turkish.
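
For the snowball package above, a sketch of SnowballAnalyzer with the English stemmer; note that in recent 3.x releases this class is deprecated in favor of the per-language analyzers listed on this page.

    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.util.Version;

    public class SnowballDemo {
      public static void main(String[] args) throws Exception {
        // "English" selects the English Snowball stemmer.
        SnowballAnalyzer analyzer = new SnowballAnalyzer(Version.LUCENE_33, "English");
        TokenStream ts = analyzer.tokenStream("body", new StringReader("running runs ran"));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
          System.out.println(term.toString()); // run, run, ran
        }
        ts.close();
      }
    }
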
org.apache.lucene.analysis.wikipedia Tokenizer that is aware of Wikipedia syntax.
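
A sketch of WikipediaTokenizer from the wikipedia package; it tokenizes English Wikipedia markup and marks tokens such as internal links with dedicated token types. The sample markup is illustrative only.

    import java.io.StringReader;

    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.analysis.tokenattributes.TypeAttribute;
    import org.apache.lucene.analysis.wikipedia.WikipediaTokenizer;

    public class WikipediaDemo {
      public static void main(String[] args) throws Exception {
        WikipediaTokenizer tokenizer = new WikipediaTokenizer(
            new StringReader("[[Apache Lucene|Lucene]] is '''fast'''."));
        CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
        TypeAttribute type = tokenizer.addAttribute(TypeAttribute.class);
        while (tokenizer.incrementToken()) {
          // The type attribute distinguishes links, bold/italic text, etc.
          System.out.println(term.toString() + " [" + type.type() + "]");
        }
        tokenizer.close();
      }
    }
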


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.