Package | Description |
org.apache.lucene |
Top-level package.
org.apache.lucene.analysis |
API and code to convert text into indexable/searchable tokens.
org.apache.lucene.analysis.standard |
Standards-based analyzers implemented with JFlex.
org.apache.lucene.analysis.standard.std31 |
Backwards-compatible implementation to match
Version.LUCENE_31 |
org.apache.lucene.analysis.standard.std34 |
Backwards-compatible implementation to match
Version.LUCENE_34 |
org.apache.lucene.analysis.tokenattributes |
Attributes for text analysis. |
org.apache.lucene.document |
The logical representation of a
Document for indexing and searching. |
org.apache.lucene.index |
Code to maintain and access indices.
org.apache.lucene.messages |
Native Language Support (NLS), a system for software internationalization.
org.apache.lucene.queryParser |
A simple query parser implemented with JavaCC.
| |
Code to search indices.
| |
Programmatic control over document scores.
| |
The payloads package provides Query mechanisms for finding and using payloads.
| |
The calculus of spans.
| |
Binary i/o API, used for all index data.
org.apache.lucene.util |
Some utility classes.
org.apache.lucene.util.collections |
Various optimized Collections implementations.
org.apache.lucene.util.encoding |
Offers various encoders and decoders for integers, as well as the
mechanisms to create new ones.
org.apache.lucene.util.fst |
Finite state transducers
org.apache.lucene.util.packed |
The packed package provides random access capable arrays of positive longs.
Package | Description |
---|---| |
Analyzer for Arabic.
| |
Analyzer for Bulgarian.
| |
Analyzer for Brazilian Portuguese.
| |
Analyzer for Catalan.
org.apache.lucene.analysis.charfilter |
CharFilters: process text before the Tokenizer
org.apache.lucene.analysis.cjk |
Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters).
| |
Analyzer for Chinese, which indexes unigrams (individual Chinese characters).
| |
Analyzer for Simplified Chinese, which indexes words.
| |
SmartChineseAnalyzer Hidden Markov Model package.
org.apache.lucene.analysis.compound |
A filter that decomposes compound words, which are found in many Germanic
languages, into their word parts.
org.apache.lucene.analysis.compound.hyphenation |
The code for the compound word hyphenation is taken from the Apache FOP project.
| |
Analyzer for Czech.
org.apache.lucene.analysis.da |
Analyzer for Danish.
| |
Analyzer for German.
org.apache.lucene.analysis.el |
Analyzer for Greek.
org.apache.lucene.analysis.en |
Analyzer for English.
| |
Analyzer for Spanish.
| |
Analyzer for Basque.
org.apache.lucene.analysis.fa |
Analyzer for Persian.
| |
Analyzer for Finnish.
| |
Analyzer for French.
| |
Analysis for Irish.
| |
Analyzer for Galician.
org.apache.lucene.analysis.hi |
Analyzer for Hindi.
| |
Analyzer for Hungarian.
org.apache.lucene.analysis.hunspell |
Stemming TokenFilter using a Java implementation of the
Hunspell stemming algorithm.
org.apache.lucene.analysis.hy |
Analyzer for Armenian.
| |
Analysis components based on ICU
| |
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
| |
Additional ICU-specific Attributes for text analysis.
| |
Analyzer for Indonesian.
| |
Analysis components for Indian languages.
| |
Analyzer for Italian.
org.apache.lucene.analysis.ja |
Analyzer for Japanese.
org.apache.lucene.analysis.ja.dict |
Kuromoji dictionary implementation.
org.apache.lucene.analysis.ja.tokenattributes |
Additional Kuromoji-specific Attributes for text analysis.
org.apache.lucene.analysis.ja.util |
Kuromoji utility classes.
| |
Analyzer for Latvian.
org.apache.lucene.analysis.miscellaneous |
Miscellaneous TokenStreams
org.apache.lucene.analysis.ngram |
Character n-gram tokenizers and filters.
| |
Analyzer for Dutch.
| |
Analyzer for Norwegian.
org.apache.lucene.analysis.path |
Analysis components for path-like strings such as filenames.
org.apache.lucene.analysis.payloads |
Provides various convenience classes for creating payloads on Tokens.
org.apache.lucene.analysis.phonetic |
Analysis components for phonetic search.
| |
Analyzer for Polish.
org.apache.lucene.analysis.position |
Filter for assigning position increments.
| |
Analyzer for Portuguese.
org.apache.lucene.analysis.query |
Automatically filter high-frequency stopwords.
org.apache.lucene.analysis.reverse |
Filter to reverse token text.
| |
Analyzer for Romanian.
| |
Analyzer for Russian.
org.apache.lucene.analysis.shingle |
Word n-gram filters
org.apache.lucene.analysis.sinks |
Implementations of the SinkTokenizer that might be useful.
org.apache.lucene.analysis.snowball |
TokenFilter and Analyzer implementations that use Snowball
stemmers. |
org.apache.lucene.analysis.stempel |
Stempel: Algorithmic Stemmer
| |
Analyzer for Swedish.
org.apache.lucene.analysis.synonym |
Analysis components for Synonyms.
| |
Analyzer for Thai.
| |
Analyzer for Turkish.
org.apache.lucene.analysis.util |
Utility functions for text analysis.
org.apache.lucene.analysis.wikipedia |
Tokenizer that is aware of Wikipedia syntax.
org.egothor.stemmer |
Egothor stemmer API.
org.tartarus.snowball |
Snowball stemmer API.
org.tartarus.snowball.ext |
Autogenerated snowball stemmer implementations.
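The org.apache.lucene.analysis.cjk entry above describes indexing bigrams, that is, overlapping groups of two adjacent Han characters. The following plain-Java sketch illustrates only that idea; the class `BigramSketch` and the method `bigrams` are hypothetical names, not part of Lucene, and the real analyzer may treat non-Han text and single characters differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of overlapping bigrams; not Lucene code.
public class BigramSketch {

    // Returns every pair of adjacent code points in the input, in order.
    static List<String> bigrams(String text) {
        int[] codePoints = text.codePoints().toArray();
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < codePoints.length; i++) {
            // Build a string from the two code points starting at index i.
            result.add(new String(codePoints, i, 2));
        }
        return result;
    }

    public static void main(String[] args) {
        // Four Han characters, written as Unicode escapes.
        String text = "\u4e2d\u534e\u4eba\u6c11";
        // A four-character input yields three overlapping bigrams.
        System.out.println(bigrams(text));
    }
}
```

Because the groups overlap, an input of n characters produces n - 1 bigrams, so a query for any two adjacent characters can match without a dictionary-based word segmenter.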
Package | Description |
org.apache.lucene.benchmark |
The benchmark contribution contains tools for benchmarking Lucene using standard, freely available corpora.
org.apache.lucene.benchmark.byTask |
Benchmarking Lucene By Tasks.
org.apache.lucene.benchmark.byTask.feeds |
Sources for benchmark inputs: documents and queries.
org.apache.lucene.benchmark.byTask.feeds.demohtml |
Example HTML parser based on JavaCC
org.apache.lucene.benchmark.byTask.programmatic |
Sample performance test written programmatically - no algorithm file is needed here.
org.apache.lucene.benchmark.byTask.stats |
Statistics maintained when running benchmark tasks.
org.apache.lucene.benchmark.byTask.tasks |
Extendable benchmark tasks.
org.apache.lucene.benchmark.byTask.utils |
Utilities used for the benchmark, and for the reports.
org.apache.lucene.benchmark.quality |
Search Quality Benchmarking.
org.apache.lucene.benchmark.quality.trec |
Utilities for TREC-related quality benchmarking, feeding from TREC Topics and QRels inputs.
org.apache.lucene.benchmark.quality.utils |
Miscellaneous utilities for search quality benchmarking: query parsing, submission reports.
org.apache.lucene.benchmark.utils |
Benchmark Utility functions.
Package | Description |
org.apache.lucene.collation |
Converts each token into its binary CollationKey using the
provided Collator, and then encodes the CollationKey
as a String using
IndexableBinaryStringTools, to allow it to be
stored as an index term. |
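The collation entry above relies on the fact that a CollationKey can be turned into bytes whose order mirrors the Collator's order. The sketch below shows only that JDK-level step, using `java.text.Collator` and `java.text.CollationKey`; the class `CollationKeySketch` and its helper methods are hypothetical, and the Lucene-specific step of encoding the bytes as a String with IndexableBinaryStringTools is deliberately left out.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration of binary collation keys; not Lucene code.
public class CollationKeySketch {

    // Converts a token into the byte form of its CollationKey.
    static byte[] keyBytes(Collator collator, String token) {
        CollationKey key = collator.getCollationKey(token);
        return key.toByteArray();
    }

    // Compares two byte arrays as unsigned bytes, most significant byte first.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        byte[] first = keyBytes(collator, "apple");
        byte[] second = keyBytes(collator, "Banana");
        // The sign of the byte comparison should agree with the Collator itself.
        System.out.println(Integer.signum(compareUnsigned(first, second))
                == Integer.signum(collator.compare("apple", "Banana")));
    }
}
```

Storing the key rather than the raw token means that a plain binary comparison of index terms follows locale-aware order, at the cost of the original token text no longer being readable from the term.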
Package | Description |
org.apache.lucene.demo |
Demo applications for indexing and searching.
Package | Description |
org.apache.lucene.facet |
Provides faceted indexing and search capabilities.
org.apache.lucene.facet.enhancements |
Enhanced category features
Mechanisms for addition of enhanced category features.
org.apache.lucene.facet.enhancements.association |
Association category enhancements
for adding associations data to the index (categories with
AssociationProperty's). |
org.apache.lucene.facet.enhancements.params |
Enhanced category features
used by
for adding
CategoryEnhancement's
to the indexing parameters, and accessing them during indexing and search. |
org.apache.lucene.facet.index |
Indexing of document categories
Attachment of
CategoryPath's
or CategoryAttribute's
to a given document using a
Taxonomy. |
org.apache.lucene.facet.index.attributes |
Category attributes and their properties for indexing
Attributes for a
category,
possibly containing
category property's. |
org.apache.lucene.facet.index.categorypolicy |
Policies for indexing categories
There are two kinds of policies:
Path policies are based on the path of the category.
org.apache.lucene.facet.index.params |
Indexing-time specifications for handling facets
Parameters on how facets are to be written to the index,
such as which fields and terms are used to refer to the facets posting list.
org.apache.lucene.facet.index.streaming |
Expert: attributes streaming definition for indexing facets
Streaming of facet attributes is a low-level indexing interface with Lucene indexing.
| |
Faceted Search API
The API for faceted search has several interfaces: simple, top-level ones, adequate for most users,
and advanced, more complicated ones, for more advanced users.
| |
Aggregating Facets during Faceted Search
A facets aggregator is the parallel of Lucene's Collector.
| |
Association-based aggregators.
| |
Caching to speed up facets accumulation.
| |
Parameters for Faceted Search
| |
Association-based Parameters for Faceted Search.
| |
Results of Faceted Search
| |
Sampling for facets accumulation
org.apache.lucene.facet.taxonomy |
Taxonomy of Categories
Facets are defined using a hierarchy of categories, known as a Taxonomy.
For example, in a book store application, a Taxonomy could have the following hierarchy: Author Mark Twain J.
| |
Taxonomy implemented using a Lucene-Index
org.apache.lucene.facet.taxonomy.writercache |
Improves indexing time by caching a map from each CategoryPath to its Ordinal
org.apache.lucene.facet.taxonomy.writercache.cl2o |
Category->Ordinal caching implementation using optimized data structures
The internal map data structure consumes less memory (~30%) and is faster (~50%) compared to a
Java HashMap<String, Integer>.
org.apache.lucene.facet.taxonomy.writercache.lru |
An LRU cache implementation for the CategoryPath to Ordinal map
org.apache.lucene.facet.util |
Various utilities for faceted search
Package | Description |
---|---| |
This module enables search result grouping with Lucene, where hits
with the same value in the specified single-valued group field are
grouped together.
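As a rough illustration of the grouping behaviour described above, the following plain-Java sketch collects hits that share the same value in a single-valued group field. The class `GroupingSketch`, the method `groupByField`, and the representation of a hit as a `{docId, groupValue}` pair are all assumptions made for this example; they are not the grouping module's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of result grouping; not Lucene code.
public class GroupingSketch {

    // Each hit is a {docId, groupValue} pair, given in ranked order.
    // Groups appear in the order in which their first hit was seen.
    static Map<String, List<String>> groupByField(List<String[]> hits) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String[] hit : hits) {
            String docId = hit[0];
            String groupValue = hit[1];
            groups.computeIfAbsent(groupValue, k -> new ArrayList<>()).add(docId);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> hits = Arrays.asList(
                new String[] {"doc1", "alice"},
                new String[] {"doc2", "bob"},
                new String[] {"doc3", "alice"});
        // prints {alice=[doc1, doc3], bob=[doc2]}
        System.out.println(groupByField(hits));
    }
}
```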
Package | Description |
---|---| |
The highlight package contains classes to provide "keyword in context" features
typically used to highlight search terms in the text of results pages.
| |
This is another highlighter implementation.
Package | Description |
---|---| |
InstantiatedIndex, alternative RAM store for small corpora.
Package | Description |
---|---| |
This module supports index-time and query-time joins.
Package | Description |
org.apache.lucene.index.memory |
High-performance single-document main memory Apache Lucene fulltext search index.
Package | Description |
org.apache.lucene.misc |
Miscellaneous index tools.
Package | Description |
org.apache.lucene.index.pruning |
Static Index Pruning Tools
This package provides a framework for pruning an existing index into
a smaller index while retaining visible search quality as much as possible.
Package | Description |
---|---| |
Regular expression Query.
| |
Document similarity query generators.
Package | Description |
org.apache.lucene.queryParser.analyzing |
QueryParser that passes Fuzzy-, Prefix-, Range-, and WildcardQuerys through the given analyzer.
org.apache.lucene.queryParser.complexPhrase |
QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*"
org.apache.lucene.queryParser.core |
Contains the core classes of the flexible query parser framework
Flexible Query Parser
This package contains the necessary classes to implement a query parser.
| |
Contains the necessary classes to implement query builders
Query Parser Builders
The package contains the interface that
builders must implement. It also contains a utility,
QueryTreeBuilder, which walks the tree
and calls the Builder for each node in the tree. |
org.apache.lucene.queryParser.core.config |
Contains the base classes used to configure the query processing
Query Configuration Interfaces
The package org.apache.lucene.queryParser.core.config contains the abstract query configuration
handler class that all config handlers should extend.
org.apache.lucene.queryParser.core.messages |
Contains messages usually used by query parser implementations
Query Parser Messages
Messages for the Flexible Query Parser; they use the org.apache.lucene.messages.NLS API.
org.apache.lucene.queryParser.core.nodes |
Contains query nodes that are commonly used by query parser implementations
Query Nodes
The package org.apache.lucene.queryParser.core.nodes contains all the basic query nodes.
org.apache.lucene.queryParser.core.parser |
Contains the necessary interfaces to implement text parsers
The package org.apache.lucene.queryParser.core.parser contains interfaces
that should be implemented by the parsers.
org.apache.lucene.queryParser.core.processors |
Interfaces and implementations used by query node processors
Query Node Processors
The package org.apache.lucene.queryParser.core.processors contains interfaces
that should be implemented by every query node processor.
org.apache.lucene.queryParser.core.util |
Utility classes to be used with the Query Parser
This package contains utility classes used with the query parsers.
org.apache.lucene.queryParser.ext |
Extendable QueryParser provides a simple and flexible extension mechanism by overloading query field names.
org.apache.lucene.queryParser.precedence |
This package contains the Precedence Query Parser Implementation
Lucene Precedence Query Parser
The Precedence Query Parser extends the Standard Query Parser and enables
boolean precedence.
org.apache.lucene.queryParser.precedence.processors |
This package contains the processors used by Precedence Query Parser
Lucene Precedence Query Parser Processors
This package contains the 2
QueryNodeProcessors used by
PrecedenceQueryParser. |
org.apache.lucene.queryParser.standard |
Contains the implementation of the Lucene query parser using the flexible query parser framework
Lucene Flexible Query Parser Implementation
The old Lucene query parser used to have only one class that performed
all the parsing operations.
| |
Standard Lucene Query Node Builders
The package contains all the builders needed
to build a Lucene Query object from a query node tree.
org.apache.lucene.queryParser.standard.config |
Standard Lucene Query Configuration
The package org.apache.lucene.queryParser.standard.config contains the Lucene
query configuration handler (StandardQueryConfigHandler).
org.apache.lucene.queryParser.standard.nodes |
Standard Lucene Query Nodes
The package org.apache.lucene.queryParser.standard.nodes contains QueryNode classes
that are used specifically for Lucene query node tree.
org.apache.lucene.queryParser.standard.parser |
Lucene Query Parser
The package org.apache.lucene.queryParser.standard.parser contains the query parser.
org.apache.lucene.queryParser.standard.processors |
Lucene Query Node Processors
The package org.apache.lucene.queryParser.standard.processors contains every processor needed to assemble a pipeline
that modifies the query node tree according to the actual Lucene queries.
org.apache.lucene.queryParser.surround.parser |
This package contains the QueryParser.jj source file for the Surround parser.
org.apache.lucene.queryParser.surround.query |
This package contains SrndQuery and its subclasses.
Package | Description |
org.apache.lucene.spatial |
Support for geospatial search.
org.apache.lucene.spatial.geohash |
Support for Geohash encoding, decoding, and filtering.
org.apache.lucene.spatial.geometry |
Coordinate and distance representations.
org.apache.lucene.spatial.geometry.shape |
Shape representations.
org.apache.lucene.spatial.tier |
Support for filtering based upon geographic location.
org.apache.lucene.spatial.tier.projections |
Spatial projections.
Package | Description |
---|---| |
Suggest alternate spellings for words.
| |
Support for Autocomplete/Autosuggest
| |
Finite-state based autosuggest.
| |
JaSpell-based autosuggest.
| |
Ternary Search Tree based autosuggest.
Package | Description |
org.apache.lucene.xmlparser |
Parser that produces Lucene Query objects from XML streams.
| |
Builders to support various Lucene queries.