Lucene 3.0.3 contrib-analyzers API

Packages

org.apache.lucene.analysis.ar Analyzer for Arabic.
org.apache.lucene.analysis.br Analyzer for Brazilian Portuguese.
org.apache.lucene.analysis.cjk Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters).
org.apache.lucene.analysis.cn Analyzer for Chinese, which indexes unigrams (individual Chinese characters).
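To illustrate what the cjk package's bigram indexing means, here is a minimal plain-Java sketch (not Lucene code; the class and method names are illustrative only): every overlapping pair of adjacent characters becomes a term.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of CJK bigram tokenization (illustrative, not the
// Lucene implementation): emit every overlapping pair of adjacent characters.
public class CjkBigrams {
    static List<String> bigrams(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < text.length(); i++) {
            out.add(text.substring(i, i + 2));
        }
        return out;
    }

    public static void main(String[] args) {
        // "北京大学" -> [北京, 京大, 大学]
        System.out.println(bigrams("北京大学"));
    }
}
```

Indexing bigrams rather than whole "words" sidesteps the fact that CJK text has no whitespace word boundaries.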
org.apache.lucene.analysis.compound A filter that decomposes compound words you find in many Germanic languages into the word parts.
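The idea behind compound decomposition can be sketched in plain Java (this is an illustrative dictionary scan, not Lucene's actual filter; the names and the toy dictionary are assumptions): any dictionary word found inside the compound is emitted as a subword.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Conceptual sketch of dictionary-based compound word decomposition
// (illustrative only): scan the compound and emit every dictionary
// word of length >= 2 found inside it.
public class CompoundSplit {
    static List<String> subwords(String word, Set<String> dict) {
        List<String> parts = new ArrayList<>();
        String lower = word.toLowerCase();
        for (int i = 0; i < lower.length(); i++) {
            for (int j = i + 2; j <= lower.length(); j++) {
                if (dict.contains(lower.substring(i, j))) {
                    parts.add(lower.substring(i, j));
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("dampf", "schiff");
        // German "Dampfschiff" (steamship) -> [dampf, schiff]
        System.out.println(subwords("Dampfschiff", dict));
    }
}
```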
org.apache.lucene.analysis.compound.hyphenation The code for the compound word hyphenation is taken from the Apache FOP project.
org.apache.lucene.analysis.cz Analyzer for Czech.
org.apache.lucene.analysis.de Analyzer for German.
org.apache.lucene.analysis.el Analyzer for Greek.
org.apache.lucene.analysis.fa Analyzer for Persian.
org.apache.lucene.analysis.fr Analyzer for French.
org.apache.lucene.analysis.miscellaneous Miscellaneous TokenStreams
org.apache.lucene.analysis.ngram Character n-gram tokenizers and filters.
org.apache.lucene.analysis.nl Analyzer for Dutch.
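A minimal plain-Java sketch of character n-gram tokenization as the ngram package provides it (illustrative names, not Lucene's API): emit all substrings whose length falls between a minimum and maximum gram size.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of character n-gram tokenization (illustrative):
// emit every substring of length minGram..maxGram.
public class CharNgrams {
    static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int i = 0; i + n <= text.length(); i++) {
                out.add(text.substring(i, i + n));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "abc" with grams 1..2 -> [a, b, c, ab, bc]
        System.out.println(ngrams("abc", 1, 2));
    }
}
```

Character n-grams are a common basis for substring matching and spell-tolerant search.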
org.apache.lucene.analysis.payloads Provides various convenience classes for creating payloads on Tokens.
org.apache.lucene.analysis.position Filter for assigning position increments.
org.apache.lucene.analysis.query Automatically filter high-frequency stopwords.
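The query package's approach can be sketched loosely in plain Java (an assumption-laden illustration, not Lucene's implementation, which works from an index's document frequencies): treat any term occurring in more than a given fraction of documents as a stopword.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Conceptual sketch of automatic high-frequency stopword detection
// (illustrative only): a term appearing in more than maxDocFreq of all
// documents is considered a stopword.
public class AutoStopwords {
    static List<String> frequentTerms(List<List<String>> docs, double maxDocFreq) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        List<String> stops = new ArrayList<>();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            if ((double) e.getValue() / docs.size() > maxDocFreq) {
                stops.add(e.getKey());
            }
        }
        return stops;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("the", "cat"), List.of("the", "dog"), List.of("the", "fox"));
        // "the" appears in every document, so it exceeds the 0.5 threshold
        System.out.println(frequentTerms(docs, 0.5));
    }
}
```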
org.apache.lucene.analysis.reverse Filter to reverse token text.
org.apache.lucene.analysis.ru Analyzer for Russian.
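Reversing token text, as the reverse package does, is trivial but useful: indexing reversed tokens lets a leading-wildcard query such as *tion be rewritten as the efficient prefix query noit*. A plain-Java sketch (illustrative names):

```java
// Conceptual sketch of token reversal (illustrative, not Lucene's filter).
public class ReverseToken {
    static String reverse(String token) {
        return new StringBuilder(token).reverse().toString();
    }

    public static void main(String[] args) {
        // "information" -> "noitamrofni"
        System.out.println(reverse("information"));
    }
}
```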
org.apache.lucene.analysis.shingle Word n-gram filters
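Word n-grams ("shingles") can be sketched in plain Java (illustrative names, not Lucene's ShingleFilter API): every run of a fixed number of adjacent tokens is joined into one term.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of word n-grams / shingles (illustrative): emit every
// window of `size` adjacent tokens, joined with a space.
public class Shingles {
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        // [please, divide, this] with size 2 -> [please divide, divide this]
        System.out.println(shingles(List.of("please", "divide", "this"), 2));
    }
}
```

Shingles make phrase-like matching possible with ordinary term queries.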
org.apache.lucene.analysis.sinks Implementations of the SinkTokenizer that might be useful.
org.apache.lucene.analysis.th Analyzer for Thai.


Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.