Core | |
---|---|
org.apache.lucene | Top-level package. |
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. |
org.apache.lucene.analysis.tokenattributes | |
org.apache.lucene.document | The logical representation of a Document for indexing and searching. |
org.apache.lucene.index | Code to maintain and access indices. |
org.apache.lucene.messages | Native Language Support (NLS), a system for software internationalization. |
org.apache.lucene.queryParser | A simple query parser implemented with JavaCC. |
org.apache.lucene.search | Code to search indices. |
org.apache.lucene.search.function | Programmatic control over document scores. |
org.apache.lucene.search.payloads | The payloads package provides Query mechanisms for finding and using payloads. |
org.apache.lucene.search.spans | The calculus of spans. |
org.apache.lucene.store | Binary i/o API, used for all index data. |
org.apache.lucene.util | Some utility classes. |
org.apache.lucene.util.cache | |
Demo | |
---|---|
org.apache.lucene.demo | |
org.apache.lucene.demo.html | |
contrib: Analysis | |
---|---|
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters). |
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). |
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.cn.smart.hhmm | SmartChineseAnalyzer Hidden Markov Model package. |
org.apache.lucene.analysis.compound | A filter that decomposes the compound words found in many Germanic languages into their component parts. |
org.apache.lucene.analysis.compound.hyphenation | The code for the compound word hyphenation is taken from the Apache FOP project. |
org.apache.lucene.analysis.cz | Analyzer for Czech. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.fa | Analyzer for Persian. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens. |
org.apache.lucene.analysis.position | Filter for assigning position increments. |
org.apache.lucene.analysis.query | Automatically filter high-frequency stopwords. |
org.apache.lucene.analysis.reverse | Filter to reverse token text. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.shingle | Word n-gram filters |
org.apache.lucene.analysis.sinks | Implementations of the SinkTokenizer that might be useful. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
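
The CJK and Chinese analyzer entries above differ in whether they index bigrams or unigrams. The difference can be sketched in plain Java as follows. This is an illustration only: the class `HanNGramSketch` and its methods are hypothetical names, it does not use or reproduce the Lucene analyzer classes, and it ignores details a real analyzer handles (mixed scripts, punctuation, single-character input).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plain Java, not the Lucene CJK or Chinese analyzer classes.
final class HanNGramSketch {

    // Unigrams: one token per character (Unicode code point).
    static List<String> unigrams(String text) {
        return ngrams(text, 1);
    }

    // Bigrams: overlapping groups of two adjacent characters.
    static List<String> bigrams(String text) {
        return ngrams(text, 2);
    }

    // Slides a window of n code points over the text, one position at a time.
    // Returns an empty list when the text has fewer than n code points.
    private static List<String> ngrams(String text, int n) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + n <= codePoints.length; i++) {
            tokens.add(new String(codePoints, i, n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Four Han characters, written as Unicode escapes to avoid source-encoding issues.
        String text = "\u4e2d\u534e\u4eba\u6c11";
        System.out.println(unigrams(text)); // four single-character tokens
        System.out.println(bigrams(text));  // three overlapping two-character tokens
    }
}
```

A text of four characters yields four unigrams but only three bigrams, since each bigram shares one character with its neighbour.
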
contrib: Ant | |
---|---|
org.apache.lucene.ant | Ant task to create Lucene indexes. |
contrib: Benchmark | |
---|---|
org.apache.lucene.benchmark | The benchmark contribution contains tools for benchmarking Lucene using standard, freely available corpora. |
org.apache.lucene.benchmark.byTask | Benchmarking Lucene By Tasks. |
org.apache.lucene.benchmark.byTask.feeds | Sources for benchmark inputs: documents and queries. |
org.apache.lucene.benchmark.byTask.programmatic | Sample performance test written programmatically - no algorithm file is needed here. |
org.apache.lucene.benchmark.byTask.stats | Statistics maintained when running benchmark tasks. |
org.apache.lucene.benchmark.byTask.tasks | Extendable benchmark tasks. |
org.apache.lucene.benchmark.byTask.utils | Utilities used for the benchmark, and for the reports. |
org.apache.lucene.benchmark.quality | Search Quality Benchmarking. |
org.apache.lucene.benchmark.quality.trec | Utilities for TREC-related quality benchmarking, feeding from TREC Topics and QRels inputs. |
org.apache.lucene.benchmark.quality.utils | Miscellaneous utilities for search quality benchmarking: query parsing, submission reports. |
org.apache.lucene.benchmark.stats | |
org.apache.lucene.benchmark.utils | |
contrib: Collation | |
---|---|
org.apache.lucene.collation | CollationKeyFilter and ICUCollationKeyFilter convert each token into its binary CollationKey using the provided Collator, and then encode the CollationKey as a String using IndexableBinaryStringTools, to allow it to be stored as an index term. |
contrib: DB | |
---|---|
com.sleepycat.db | |
org.apache.lucene.store.db | Berkeley DB 4.3 based implementation of Directory. |
org.apache.lucene.store.je | Berkeley DB Java Edition based implementation of Directory. |
contrib: Fast Vector Highlighter | |
---|---|
org.apache.lucene.search.vectorhighlight | Another highlighter implementation. |
contrib: Highlighter | |
---|---|
org.apache.lucene.search.highlight | The highlight package contains classes to provide "keyword in context" features typically used to highlight search terms in the text of results pages. |
contrib: Instantiated | |
---|---|
org.apache.lucene.store.instantiated | InstantiatedIndex, alternative RAM store for small corpora. |
contrib: Lucli | |
---|---|
lucli | Lucene Command Line Interface |
contrib: Memory | |
---|---|
org.apache.lucene.index.memory | High-performance single-document main memory Apache Lucene fulltext search index. |
contrib: Misc | |
---|---|
org.apache.lucene.misc | |
org.apache.lucene.queryParser.analyzing | QueryParser that passes Fuzzy-, Prefix-, Range-, and WildcardQuerys through the given analyzer. |
org.apache.lucene.queryParser.precedence | QueryParser designed to handle operator precedence in a more sensible fashion than the default QueryParser. |
contrib: Queries | |
---|---|
org.apache.lucene.search.similar | Document similarity query generators. |
contrib: Query Parser | |
---|---|
org.apache.lucene.queryParser.complexPhrase | QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*" |
org.apache.lucene.queryParser.core | Contains the core classes of the flexible query parser framework |
org.apache.lucene.queryParser.core.builders | Contains the necessary classes to implement query builders |
org.apache.lucene.queryParser.core.config | Contains the base classes used to configure the query processing |
org.apache.lucene.queryParser.core.messages | Contains messages usually used by query parser implementations |
org.apache.lucene.queryParser.core.nodes | Contains query nodes that are commonly used by query parser implementations |
org.apache.lucene.queryParser.core.parser | Contains the necessary interfaces to implement text parsers |
org.apache.lucene.queryParser.core.processors | Interfaces and implementations used by query node processors |
org.apache.lucene.queryParser.core.util | Utility classes to be used with the Query Parser |
org.apache.lucene.queryParser.standard | Contains the implementation of the Lucene query parser using the flexible query parser framework |
org.apache.lucene.queryParser.standard.builders | Standard Lucene Query Node Builders |
org.apache.lucene.queryParser.standard.config | Standard Lucene Query Configuration |
org.apache.lucene.queryParser.standard.nodes | Standard Lucene Query Nodes |
org.apache.lucene.queryParser.standard.parser | Lucene Query Parser |
org.apache.lucene.queryParser.standard.processors | Lucene Query Node Processors |
contrib: RegEx | |
---|---|
org.apache.lucene.search.regex | Regular expression Query. |
org.apache.regexp | This package exists to allow access to useful package protected data within Jakarta Regexp. |
contrib: Snowball | |
---|---|
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. |
contrib: Spatial | |
---|---|
org.apache.lucene.spatial.geohash | Support for Geohash encoding, decoding, and filtering. |
org.apache.lucene.spatial.geometry | |
org.apache.lucene.spatial.geometry.shape | |
org.apache.lucene.spatial.tier | Support for filtering based upon geographic location. |
org.apache.lucene.spatial.tier.projections | |
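
For readers unfamiliar with the Geohash encoding mentioned in the table above, here is a minimal sketch of the standard public algorithm in plain Java. It is an illustration under stated assumptions, not the API of `org.apache.lucene.spatial.geohash`: the class name `GeohashSketch` and the `encode` signature are hypothetical, and decoding and filtering are not shown.

```java
// Illustrative only: a standalone geohash encoder, not the Lucene spatial API.
final class GeohashSketch {

    // The geohash base-32 alphabet (digits plus lowercase letters, without a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encodes a latitude/longitude pair into a geohash of the given length.
    static String encode(double lat, double lon, int precision) {
        double latLo = -90.0, latHi = 90.0;
        double lonLo = -180.0, lonHi = 180.0;
        StringBuilder hash = new StringBuilder(precision);
        boolean lonTurn = true; // bits alternate between axes, longitude first
        int bitCount = 0;
        int value = 0;

        while (hash.length() < precision) {
            if (lonTurn) {
                // Halve the longitude interval; emit 1 for the upper half.
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else            { value = value << 1;       lonHi = mid; }
            } else {
                // Halve the latitude interval; emit 1 for the upper half.
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else            { value = value << 1;       latHi = mid; }
            }
            lonTurn = !lonTurn;

            // Every five bits become one base-32 character.
            if (++bitCount == 5) {
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5)); // prints "u4pru"
    }
}
```

Because each character refines the same bisection, a shorter hash is always a prefix of a longer one for the same point, which is what makes prefix matching usable for coarse location filtering.
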
contrib: SpellChecker | |
---|---|
org.apache.lucene.search.spell | Suggest alternate spellings for words. |
contrib: Surround Parser | |
---|---|
org.apache.lucene.queryParser.surround.parser | This package contains the QueryParser.jj source file for the Surround parser. |
org.apache.lucene.queryParser.surround.query | This package contains SrndQuery and its subclasses. |
contrib: Swing | |
---|---|
org.apache.lucene.swing.models | Decorators for JTable TableModel and JList ListModel encapsulating Lucene indexing and searching functionality. |
contrib: Wikipedia | |
---|---|
org.apache.lucene.wikipedia.analysis | Tokenizer that is aware of Wikipedia syntax. |
contrib: WordNet | |
---|---|
org.apache.lucene.wordnet | This package uses synonyms defined by WordNet. |
contrib: XML Query Parser | |
---|---|
org.apache.lucene.xmlparser | Parser that produces Lucene Query objects from XML streams. |
org.apache.lucene.xmlparser.builders | |
Other Packages | |
---|---|
org.tartarus.snowball | |
org.tartarus.snowball.ext | |
Apache Lucene is a high-performance, full-featured text search engine library. Here's a simple example of how to use Lucene for indexing and searching (using JUnit to check if the results are what we expect):
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
The Lucene API is divided into several packages, summarized in the tables above.
To try the demo programs from the command line:

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexFiles rec.food.recipes/soups
adding rec.food.recipes/soups/abalone-chowder
[ ... ]

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.SearchFiles
Query: chowder
Searching for: chowder
34 total matching documents
1. rec.food.recipes/soups/spam-chowder
[ ... thirty-four documents contain the word "chowder" ... ]

Query: "clam chowder" AND Manhattan
Searching for: +"clam chowder" +manhattan
2 total matching documents
1. rec.food.recipes/soups/clam-chowder
[ ... two documents contain the phrase "clam chowder" and the word "manhattan" ... ]
[ Note: "+" and "-" are canonical, but "AND", "OR" and "NOT" may be used. ]

The IndexHTML demo is more sophisticated. It incrementally maintains an index of HTML files, adding new files as they appear, deleting old files as they disappear and re-indexing files as they change.

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML -create java/jdk1.1.6/docs/relnotes
adding java/jdk1.1.6/docs/relnotes/SMICopyright.html
[ ... create an index containing all the relnotes ]

> rm java/jdk1.1.6/docs/relnotes/SMICopyright.html

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML java/jdk1.1.6/docs/relnotes
deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html
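
The SearchFiles session above requires every "+" clause to match. As a rough mental model of that index-then-search flow, here is a toy inverted index in plain Java. It is a sketch under stated assumptions and does not use or mirror Lucene's classes: `ToyInvertedIndex`, `add` and `searchAll` are hypothetical names, tokenization is a simple lowercase split, and phrase queries such as "clam chowder" are not handled (only single required terms).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: a toy in-memory inverted index, not Lucene's index format or API.
final class ToyInvertedIndex {

    // term -> names of the documents that contain it
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Indexes a document: lowercase the text, split on non-alphanumerics,
    // and record each term against the document name.
    void add(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Returns the documents containing every required term,
    // loosely analogous to a query of the form "+term1 +term2".
    Set<String> searchAll(String... requiredTerms) {
        Set<String> result = null;
        for (String term : requiredTerms) {
            Set<String> docs = postings.getOrDefault(
                    term.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs); // intersect with the previous matches
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }
}
```

With two made-up documents, `add("clam-chowder", "Manhattan clam chowder")` and `add("spam-chowder", "Spam chowder")`, calling `searchAll("chowder")` returns both names, while `searchAll("chowder", "manhattan")` returns only `clam-chowder`.
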