| Core | |
|---|---|
| org.apache.lucene | Top-level package. | 
| org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. | 
| org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. | 
| org.apache.lucene.analysis.tokenattributes | |
| org.apache.lucene.document | The logical representation of a Document for indexing and searching. | 
| org.apache.lucene.index | Code to maintain and access indices. | 
| org.apache.lucene.messages | Native Language Support (NLS), a system for software internationalization. | 
| org.apache.lucene.queryParser | A simple query parser implemented with JavaCC. | 
| org.apache.lucene.search | Code to search indices. | 
| org.apache.lucene.search.function | Programmatic control over document scores. | 
| org.apache.lucene.search.payloads | The payloads package provides Query mechanisms for finding and using payloads.  | 
| org.apache.lucene.search.spans | The calculus of spans. | 
| org.apache.lucene.store | Binary i/o API, used for all index data. | 
| org.apache.lucene.util | Some utility classes. | 
| org.apache.lucene.util.cache | |
| Demo | |
|---|---|
| org.apache.lucene.demo | |
| org.apache.lucene.demo.html | |
| contrib: Analysis | |
|---|---|
| org.apache.lucene.analysis.ar | Analyzer for Arabic. | 
| org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. | 
| org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters). | 
| org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). | 
| org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. | 
| org.apache.lucene.analysis.cn.smart.hhmm | SmartChineseAnalyzer Hidden Markov Model package. | 
| org.apache.lucene.analysis.compound | A filter that decomposes compound words you find in many Germanic languages into the word parts. | 
| org.apache.lucene.analysis.compound.hyphenation | The code for the compound word hyphenation is taken from the Apache FOP project. | 
| org.apache.lucene.analysis.cz | Analyzer for Czech. | 
| org.apache.lucene.analysis.de | Analyzer for German. | 
| org.apache.lucene.analysis.el | Analyzer for Greek. | 
| org.apache.lucene.analysis.fa | Analyzer for Persian. | 
| org.apache.lucene.analysis.fr | Analyzer for French. | 
| org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams | 
| org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. | 
| org.apache.lucene.analysis.nl | Analyzer for Dutch. | 
| org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens.  | 
| org.apache.lucene.analysis.position | Filter for assigning position increments. | 
| org.apache.lucene.analysis.query | Automatically filter high-frequency stopwords. | 
| org.apache.lucene.analysis.reverse | Filter to reverse token text. | 
| org.apache.lucene.analysis.ru | Analyzer for Russian. | 
| org.apache.lucene.analysis.shingle | Word n-gram filters | 
| org.apache.lucene.analysis.sinks | Implementations of the SinkTokenizer that might be useful.  | 
| org.apache.lucene.analysis.th | Analyzer for Thai. | 
| contrib: Ant | |
|---|---|
| org.apache.lucene.ant | Ant task to create Lucene indexes. | 
| contrib: Benchmark | |
|---|---|
| org.apache.lucene.benchmark | The benchmark contribution contains tools for benchmarking Lucene using standard, freely available corpora. | 
| org.apache.lucene.benchmark.byTask | Benchmarking Lucene By Tasks. | 
| org.apache.lucene.benchmark.byTask.feeds | Sources for benchmark inputs: documents and queries. | 
| org.apache.lucene.benchmark.byTask.programmatic | Sample performance test written programmatically - no algorithm file is needed here. | 
| org.apache.lucene.benchmark.byTask.stats | Statistics maintained when running benchmark tasks. | 
| org.apache.lucene.benchmark.byTask.tasks | Extendable benchmark tasks. | 
| org.apache.lucene.benchmark.byTask.utils | Utilities used for the benchmark, and for the reports. | 
| org.apache.lucene.benchmark.quality | Search Quality Benchmarking. | 
| org.apache.lucene.benchmark.quality.trec | Utilities for TREC-related quality benchmarking, feeding from TREC Topics and QRels inputs. | 
| org.apache.lucene.benchmark.quality.utils | Miscellaneous utilities for search quality benchmarking: query parsing, submission reports. | 
| org.apache.lucene.benchmark.stats | |
| org.apache.lucene.benchmark.utils | |
| contrib: Collation | |
|---|---|
| org.apache.lucene.collation | CollationKeyFilter and ICUCollationKeyFilter convert each token into its binary CollationKey using the provided Collator, and then encode the CollationKey as a String using IndexableBinaryStringTools, to allow it to be stored as an index term. | 
| contrib: DB | |
|---|---|
| com.sleepycat.db | |
| org.apache.lucene.store.db | Berkeley DB 4.3 based implementation of Directory. | 
| org.apache.lucene.store.je | Berkeley DB Java Edition based implementation of Directory. | 
| contrib: Fast Vector Highlighter | |
|---|---|
| org.apache.lucene.search.vectorhighlight | This is another highlighter implementation. | 
| contrib: Highlighter | |
|---|---|
| org.apache.lucene.search.highlight | The highlight package contains classes to provide "keyword in context" features typically used to highlight search terms in the text of results pages. | 
| contrib: Instantiated | |
|---|---|
| org.apache.lucene.store.instantiated | InstantiatedIndex, alternative RAM store for small corpora. | 
| contrib: Lucli | |
|---|---|
| lucli | Lucene Command Line Interface | 
| contrib: Memory | |
|---|---|
| org.apache.lucene.index.memory | High-performance single-document main memory Apache Lucene fulltext search index. | 
| contrib: Misc | |
|---|---|
| org.apache.lucene.misc | |
| org.apache.lucene.queryParser.analyzing | QueryParser that passes Fuzzy-, Prefix-, Range-, and WildcardQuerys through the given analyzer. | 
| org.apache.lucene.queryParser.precedence | QueryParser designed to handle operator precedence in a more sensible fashion than the default QueryParser. | 
| contrib: Queries | |
|---|---|
| org.apache.lucene.search.similar | Document similarity query generators. | 
| contrib: Query Parser | |
|---|---|
| org.apache.lucene.queryParser.complexPhrase | QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*" | 
| org.apache.lucene.queryParser.core | Contains the core classes of the flexible query parser framework | 
| org.apache.lucene.queryParser.core.builders | Contains the necessary classes to implement query builders | 
| org.apache.lucene.queryParser.core.config | Contains the base classes used to configure the query processing | 
| org.apache.lucene.queryParser.core.messages | Contains messages usually used by query parser implementations | 
| org.apache.lucene.queryParser.core.nodes | Contains query nodes that are commonly used by query parser implementations | 
| org.apache.lucene.queryParser.core.parser | Contains the necessary interfaces to implement text parsers | 
| org.apache.lucene.queryParser.core.processors | Interfaces and implementations used by query node processors | 
| org.apache.lucene.queryParser.core.util | Utility classes to be used with the Query Parser | 
| org.apache.lucene.queryParser.standard | Contains the implementation of the Lucene query parser using the flexible query parser framework | 
| org.apache.lucene.queryParser.standard.builders | Standard Lucene Query Node Builders | 
| org.apache.lucene.queryParser.standard.config | Standard Lucene Query Configuration | 
| org.apache.lucene.queryParser.standard.nodes | Standard Lucene Query Nodes | 
| org.apache.lucene.queryParser.standard.parser | Lucene Query Parser | 
| org.apache.lucene.queryParser.standard.processors | Lucene Query Node Processors | 
| contrib: RegEx | |
|---|---|
| org.apache.lucene.search.regex | Regular expression Query. | 
| org.apache.regexp | This package exists to allow access to useful package protected data within Jakarta Regexp. | 
| contrib: Snowball | |
|---|---|
| org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. | 
| contrib: Spatial | |
|---|---|
| org.apache.lucene.spatial.geohash | Support for Geohash encoding, decoding, and filtering. | 
| org.apache.lucene.spatial.geometry | |
| org.apache.lucene.spatial.geometry.shape | |
| org.apache.lucene.spatial.tier | Support for filtering based upon geographic location. | 
| org.apache.lucene.spatial.tier.projections | |
| contrib: SpellChecker | |
|---|---|
| org.apache.lucene.search.spell | Suggest alternate spellings for words. | 
| contrib: Surround Parser | |
|---|---|
| org.apache.lucene.queryParser.surround.parser | This package contains the QueryParser.jj source file for the Surround parser. | 
| org.apache.lucene.queryParser.surround.query | This package contains SrndQuery and its subclasses. | 
| contrib: Swing | |
|---|---|
| org.apache.lucene.swing.models | Decorators for JTable TableModel and JList ListModel encapsulating Lucene indexing and searching functionality. | 
| contrib: Wikipedia | |
|---|---|
| org.apache.lucene.wikipedia.analysis | Tokenizer that is aware of Wikipedia syntax. | 
| contrib: WordNet | |
|---|---|
| org.apache.lucene.wordnet | This package uses synonyms defined by WordNet. | 
| contrib: XML Query Parser | |
|---|---|
| org.apache.lucene.xmlparser | Parser that produces Lucene Query objects from XML streams. | 
| org.apache.lucene.xmlparser.builders | |
| Other Packages | |
|---|---|
| org.tartarus.snowball | |
| org.tartarus.snowball.ext | |

Apache Lucene is a high-performance, full-featured text search engine library. Here's a simple example of how to use Lucene for indexing and searching (using JUnit to check that the results are what we expect):

    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

The Lucene API is divided into several packages, which are listed in the tables above. To try the demo programs, run something like:

    > java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexFiles rec.food.recipes/soups
    adding rec.food.recipes/soups/abalone-chowder
    [ ... ]

    > java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.SearchFiles
    Query: chowder
    Searching for: chowder
    34 total matching documents
    1. rec.food.recipes/soups/spam-chowder
    [ ... thirty-four documents contain the word "chowder" ... ]

    Query: "clam chowder" AND Manhattan
    Searching for: +"clam chowder" +manhattan
    2 total matching documents
    1. rec.food.recipes/soups/clam-chowder
    [ ... two documents contain the phrase "clam chowder" and the word "manhattan" ... ]
    [ Note: "+" and "-" are canonical, but "AND", "OR" and "NOT" may be used. ]

The IndexHTML demo is more sophisticated. It incrementally maintains an index of HTML files, adding new files as they appear, deleting old files as they disappear, and re-indexing files as they change.

    > java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML -create java/jdk1.1.6/docs/relnotes
    adding java/jdk1.1.6/docs/relnotes/SMICopyright.html
    [ ... create an index containing all the relnotes ]

    > rm java/jdk1.1.6/docs/relnotes/SMICopyright.html

    > java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML java/jdk1.1.6/docs/relnotes
    deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html
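
The incremental behavior described for IndexHTML (add new files, delete vanished files, re-index changed files) comes down to comparing what the index last recorded with what is on disk now. The following is a minimal sketch of that comparison in plain Java, with no Lucene classes involved. It is not the IndexHTML implementation: the class `IncrementalSync`, its `Plan` type, the use of last-modified timestamps as the change signal, and the file names `changes.html` and `new-page.html` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of Lucene or its demo package. */
public class IncrementalSync {

    /** The three kinds of work an incremental indexer has to do. */
    public static final class Plan {
        public final List<String> toAdd = new ArrayList<String>();
        public final List<String> toDelete = new ArrayList<String>();
        public final List<String> toReindex = new ArrayList<String>();
    }

    /**
     * Compares what the index recorded (path -> last-modified time)
     * with what is on disk now and decides what to do with each path.
     */
    public static Plan plan(Map<String, Long> indexed, Map<String, Long> onDisk) {
        Plan p = new Plan();
        // Copy into a TreeMap so paths are visited in sorted order.
        for (Map.Entry<String, Long> e : new TreeMap<String, Long>(onDisk).entrySet()) {
            Long seen = indexed.get(e.getKey());
            if (seen == null) {
                p.toAdd.add(e.getKey());        // new file appeared
            } else if (!seen.equals(e.getValue())) {
                p.toReindex.add(e.getKey());    // file changed since it was indexed
            }
        }
        for (String path : new TreeMap<String, Long>(indexed).keySet()) {
            if (!onDisk.containsKey(path)) {
                p.toDelete.add(path);           // file disappeared
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // State recorded by a previous indexing run (timestamps are made up).
        Map<String, Long> indexed = new TreeMap<String, Long>();
        indexed.put("relnotes/SMICopyright.html", 100L);
        indexed.put("relnotes/changes.html", 100L);

        // Current state of the directory: one file removed, one modified, one new.
        Map<String, Long> onDisk = new TreeMap<String, Long>();
        onDisk.put("relnotes/changes.html", 250L);
        onDisk.put("relnotes/new-page.html", 300L);

        Plan p = plan(indexed, onDisk);
        System.out.println("adding " + p.toAdd);
        System.out.println("deleting " + p.toDelete);
        System.out.println("re-indexing " + p.toReindex);
    }
}
```

A real indexer would then apply the plan to the index, typically treating a re-index as a delete followed by an add. How the demo itself detects changes is not stated here, so the timestamp comparison should be read as one plausible choice rather than a description of its behavior.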