Overview (Lucene 3.0.3 API)

Apache Lucene is a high-performance, full-featured text search engine library. A simple example of how to use Lucene for indexing and searching (using JUnit to check that the results are what we expect) appears at the end of this page, after the package listing.

Core
org.apache.lucene	Top-level package.
org.apache.lucene.analysis	API and code to convert text into indexable/searchable tokens.
org.apache.lucene.analysis.standard	A fast grammar-based tokenizer constructed with JFlex.
org.apache.lucene.analysis.tokenattributes
org.apache.lucene.document	The logical representation of a `Document` for indexing and searching.
org.apache.lucene.index	Code to maintain and access indices.
org.apache.lucene.messages	Native Language Support (NLS), a system for software internationalization.
org.apache.lucene.queryParser	A simple query parser implemented with JavaCC.
org.apache.lucene.search	Code to search indices.
org.apache.lucene.search.function	Programmatic control over document scores.
org.apache.lucene.search.payloads	The payloads package provides Query mechanisms for finding and using payloads.
org.apache.lucene.search.spans	The calculus of spans.
org.apache.lucene.store	Binary i/o API, used for all index data.
org.apache.lucene.util	Some utility classes.
org.apache.lucene.util.cache

Demo
org.apache.lucene.demo
org.apache.lucene.demo.html

contrib: Analysis
org.apache.lucene.analysis.ar	Analyzer for Arabic.
org.apache.lucene.analysis.br	Analyzer for Brazilian Portuguese.
org.apache.lucene.analysis.cjk	Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters).
org.apache.lucene.analysis.cn	Analyzer for Chinese, which indexes unigrams (individual Chinese characters).
org.apache.lucene.analysis.cn.smart	Analyzer for Simplified Chinese, which indexes words.
org.apache.lucene.analysis.cn.smart.hhmm	SmartChineseAnalyzer Hidden Markov Model package.
org.apache.lucene.analysis.compound	A filter that decomposes compound words, found in many Germanic languages, into their word parts.
org.apache.lucene.analysis.compound.hyphenation	The code for the compound word hyphenation is taken from the Apache FOP project.
org.apache.lucene.analysis.cz	Analyzer for Czech.
org.apache.lucene.analysis.de	Analyzer for German.
org.apache.lucene.analysis.el	Analyzer for Greek.
org.apache.lucene.analysis.fa	Analyzer for Persian.
org.apache.lucene.analysis.fr	Analyzer for French.
org.apache.lucene.analysis.miscellaneous	Miscellaneous TokenStreams
org.apache.lucene.analysis.ngram	Character n-gram tokenizers and filters.
org.apache.lucene.analysis.nl	Analyzer for Dutch.
org.apache.lucene.analysis.payloads	Provides various convenience classes for creating payloads on Tokens.
org.apache.lucene.analysis.position	Filter for assigning position increments.
org.apache.lucene.analysis.query	Automatically filter high-frequency stopwords.
org.apache.lucene.analysis.reverse	Filter to reverse token text.
org.apache.lucene.analysis.ru	Analyzer for Russian.
org.apache.lucene.analysis.shingle	Word n-gram filters
org.apache.lucene.analysis.sinks	Implementations of the SinkTokenizer that might be useful.
org.apache.lucene.analysis.th	Analyzer for Thai.
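
Several of the analysis packages above index n-grams: overlapping character bigrams for CJK text, character n-grams in general, and word n-grams (shingles). The following is a minimal plain-Java sketch of those two ideas. It does not use the Lucene TokenStream API; the class and method names (`NGramSketch`, `charNGrams`, `shingles`) are hypothetical and exist only for illustration. It also works on UTF-16 code units, so characters outside the Basic Multilingual Plane would need extra handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical helpers, not Lucene classes.
class NGramSketch {

    // Overlapping character n-grams, e.g. "abcd" with n=2 gives [ab, bc, cd].
    static List<String> charNGrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    // Word n-grams ("shingles"): n adjacent tokens joined by a single space.
    static List<String> shingles(List<String> tokens, int n) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(charNGrams("abcd", 2)); // [ab, bc, cd]
        System.out.println(shingles(Arrays.asList("please", "divide", "this"), 2));
    }
}
```

Input shorter than `n` yields an empty list in this sketch. Real filters typically make such edge cases configurable.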

contrib: Ant
org.apache.lucene.ant	Ant task to create Lucene indexes.

contrib: Benchmark
org.apache.lucene.benchmark	The benchmark contribution contains tools for benchmarking Lucene using standard, freely available corpora.
org.apache.lucene.benchmark.byTask	Benchmarking Lucene By Tasks.
org.apache.lucene.benchmark.byTask.feeds	Sources for benchmark inputs: documents and queries.
org.apache.lucene.benchmark.byTask.programmatic	Sample performance test written programmatically - no algorithm file is needed here.
org.apache.lucene.benchmark.byTask.stats	Statistics maintained when running benchmark tasks.
org.apache.lucene.benchmark.byTask.tasks	Extendable benchmark tasks.
org.apache.lucene.benchmark.byTask.utils	Utilities used for the benchmark, and for the reports.
org.apache.lucene.benchmark.quality	Search Quality Benchmarking.
org.apache.lucene.benchmark.quality.trec	Utilities for TREC-related quality benchmarking, feeding from TREC Topics and QRels inputs.
org.apache.lucene.benchmark.quality.utils	Miscellaneous utilities for search quality benchmarking: query parsing, submission reports.
org.apache.lucene.benchmark.stats
org.apache.lucene.benchmark.utils

contrib: Collation
org.apache.lucene.collation	`CollationKeyFilter` and `ICUCollationKeyFilter` convert each token into its binary `CollationKey` using the provided `Collator`, and then encode the `CollationKey` as a String using `IndexableBinaryStringTools`, to allow it to be stored as an index term.
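
The idea behind collation keys can be sketched with the JDK alone. A `Collator` turns each string into a binary `CollationKey`, and comparing the key bytes reproduces the collator's locale-aware order. The sketch below shows only that first step. The string-encoding step performed by `IndexableBinaryStringTools` is not reproduced, and the class and method names (`CollationKeySketch`, `keyBytes`, `compareUnsigned`) are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

// Illustration only: JDK classes, no Lucene filter involved.
class CollationKeySketch {

    // Binary collation key for a token under the given collator.
    static byte[] keyBytes(Collator collator, String token) {
        return collator.getCollationKey(token).toByteArray();
    }

    // Lexicographic comparison treating bytes as unsigned, most significant first.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // Plain String order puts "B" before "a"; the collator does not.
        System.out.println("String order:   " + "a".compareTo("B"));
        System.out.println("Collator order: " + collator.compare("a", "B"));
        System.out.println("Key byte order: "
            + compareUnsigned(keyBytes(collator, "a"), keyBytes(collator, "B")));
    }
}
```

Because the keys sort correctly as raw bytes, storing them as index terms lets term order follow the collator without a comparison call per term.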

contrib: DB
com.sleepycat.db
org.apache.lucene.store.db	Berkeley DB 4.3 based implementation of `Directory`.
org.apache.lucene.store.je	Berkeley DB Java Edition based implementation of `Directory`.

contrib: Fast Vector Highlighter
org.apache.lucene.search.vectorhighlight	This is another highlighter implementation.

contrib: Highlighter
org.apache.lucene.search.highlight	The highlight package contains classes to provide "keyword in context" features typically used to highlight search terms in the text of results pages.

contrib: Instantiated
org.apache.lucene.store.instantiated	InstantiatedIndex, alternative RAM store for small corpora.

contrib: Lucli
lucli	Lucene Command Line Interface

contrib: Memory
org.apache.lucene.index.memory	High-performance single-document main memory Apache Lucene fulltext search index.

contrib: Misc
org.apache.lucene.misc
org.apache.lucene.queryParser.analyzing	QueryParser that passes Fuzzy-, Prefix-, Range-, and WildcardQuerys through the given analyzer.
org.apache.lucene.queryParser.precedence	QueryParser designed to handle operator precedence in a more sensible fashion than the default QueryParser.

contrib: Queries
org.apache.lucene.search.similar	Document similarity query generators.

contrib: Query Parser
org.apache.lucene.queryParser.complexPhrase	QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*"
org.apache.lucene.queryParser.core	Contains the core classes of the flexible query parser framework
org.apache.lucene.queryParser.core.builders	Contains the necessary classes to implement query builders
org.apache.lucene.queryParser.core.config	Contains the base classes used to configure the query processing
org.apache.lucene.queryParser.core.messages	Contains messages usually used by query parser implementations
org.apache.lucene.queryParser.core.nodes	Contains query nodes that are commonly used by query parser implementations
org.apache.lucene.queryParser.core.parser	Contains the necessary interfaces to implement text parsers
org.apache.lucene.queryParser.core.processors	Interfaces and implementations used by query node processors
org.apache.lucene.queryParser.core.util	Utility classes to be used with the Query Parser
org.apache.lucene.queryParser.standard	Contains the implementation of the Lucene query parser using the flexible query parser frameworks
org.apache.lucene.queryParser.standard.builders	Standard Lucene Query Node Builders
org.apache.lucene.queryParser.standard.config	Standard Lucene Query Configuration
org.apache.lucene.queryParser.standard.nodes	Standard Lucene Query Nodes
org.apache.lucene.queryParser.standard.parser	Lucene Query Parser
org.apache.lucene.queryParser.standard.processors	Lucene Query Node Processors

contrib: RegEx
org.apache.lucene.search.regex	Regular expression Query.
org.apache.regexp	This package exists to allow access to useful package protected data within Jakarta Regexp.

contrib: Snowball
org.apache.lucene.analysis.snowball	`TokenFilter` and `Analyzer` implementations that use Snowball stemmers.

contrib: Spatial
org.apache.lucene.spatial.geohash	Support for Geohash encoding, decoding, and filtering.
org.apache.lucene.spatial.geometry
org.apache.lucene.spatial.geometry.shape
org.apache.lucene.spatial.tier	Support for filtering based upon geographic location.
org.apache.lucene.spatial.tier.projections
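
As background for the geohash package, here is a minimal sketch of the publicly documented geohash encoding: longitude and latitude intervals are halved alternately, each decision contributes one bit, and every five bits become one base-32 character. This is a standalone illustration of the general algorithm, not the code in `org.apache.lucene.spatial.geohash`; the class and method names (`GeohashSketch`, `encode`) are hypothetical.

```java
// Illustration only: a plain-Java geohash encoder, not the Lucene implementation.
class GeohashSketch {

    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encodes a latitude/longitude pair into a geohash of the requested length.
    static String encode(double lat, double lon, int length) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder hash = new StringBuilder();
        boolean useLon = true; // bits alternate, starting with longitude
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            if (useLon) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else { value <<= 1; lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else { value <<= 1; latHi = mid; }
            }
            useLon = !useLon;
            if (++bits == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5)); // u4pru
    }
}
```

Nearby points usually share a prefix, which is what makes prefix-based filtering possible. Points on opposite sides of a cell boundary can still receive very different hashes.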

contrib: SpellChecker
org.apache.lucene.search.spell	Suggest alternate spellings for words.

contrib: Surround Parser
org.apache.lucene.queryParser.surround.parser	This package contains the QueryParser.jj source file for the Surround parser.
org.apache.lucene.queryParser.surround.query	This package contains SrndQuery and its subclasses.

contrib: Swing
org.apache.lucene.swing.models	Decorators for JTable TableModel and JList ListModel encapsulating Lucene indexing and searching functionality.

contrib: Wikipedia
org.apache.lucene.wikipedia.analysis	Tokenizer that is aware of Wikipedia syntax.

contrib: WordNet
org.apache.lucene.wordnet	This package uses synonyms defined by WordNet.

contrib: XML Query Parser
org.apache.lucene.xmlparser	Parser that produces Lucene Query objects from XML streams.
org.apache.lucene.xmlparser.builders

Other Packages
org.tartarus.snowball
org.tartarus.snowball.ext


    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

    // Store the index in memory:
    Directory directory = new RAMDirectory();
    // To store an index on disk, use this instead (FSDirectory.open takes a java.io.File):
    //Directory directory = FSDirectory.open(new File("/tmp/testindex"));
    IndexWriter iwriter = new IndexWriter(directory, analyzer, true,
                                          new IndexWriter.MaxFieldLength(25000));
    Document doc = new Document();
    String text = "This is the text to be indexed.";
    doc.add(new Field("fieldname", text, Field.Store.YES,
        Field.Index.ANALYZED));
    iwriter.addDocument(doc);
    iwriter.close();

    // Now search the index:
    IndexSearcher isearcher = new IndexSearcher(directory, true); // read-only=true
    // Parse a simple query that searches for "text":
    QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "fieldname", analyzer);
    Query query = parser.parse("text");
    ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
    assertEquals(1, hits.length);
    // Iterate through the results:
    for (int i = 0; i < hits.length; i++) {
      Document hitDoc = isearcher.doc(hits[i].doc);
      assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
    }
    isearcher.close();
    directory.close();
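
For readers who want to see the index-then-search flow without the Lucene dependency, the following toy inverted index mirrors the steps of the example: tokenize the text, record which documents contain each term, then look a term up and fetch the stored text. It is a deliberately simplified sketch, not how Lucene is implemented. The class `ToyInvertedIndex` and its methods are hypothetical, and unlike `StandardAnalyzer` it removes no stop words and computes no scores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: a toy in-memory index, not a Lucene class.
class ToyInvertedIndex {

    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> stored = new ArrayList<>();

    // Lowercases the text and splits it on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Stores the text and adds its terms to the postings; returns the document id.
    int addDocument(String text) {
        int docId = stored.size();
        stored.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the ids of documents containing the single term, in ascending order.
    List<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    // Fetches the stored text of a document.
    String doc(int docId) {
        return stored.get(docId);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("This is the text to be indexed.");
        for (int docId : index.search("text")) {
            System.out.println(docId + ": " + index.doc(docId));
        }
    }
}
```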