Lucene™ Features

Lucene offers powerful features through a simple API:

Scalable, High-Performance Indexing

  • over 800GB/hour on modern hardware
  • small RAM requirements -- only 1MB heap
  • incremental indexing as fast as batch indexing
  • index size roughly 20-30% the size of text indexed

Powerful, Accurate and Efficient Search Algorithms

  • ranked searching -- best results returned first
  • many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • fielded searching (e.g. title, author, contents)
  • nearest-neighbor search for high-dimensionality vectors
  • sorting by any field
  • multiple-index searching with merged results
  • simultaneous updating and searching
  • flexible faceting, highlighting, joins and result grouping
  • fast, memory-efficient and typo-tolerant suggesters
  • pluggable ranking models, including the Vector Space Model and Okapi BM25
  • configurable storage engine (codecs)
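
To make "ranked searching" and "Okapi BM25" more concrete, here is a minimal, self-contained sketch of the textbook BM25 formula in plain Java. It does not use the Lucene API and is not Lucene's implementation. The class name Bm25Sketch, the whitespace/punctuation tokenizer, and the parameter values k1 = 1.2 and b = 0.75 are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a textbook BM25 scorer, not Lucene's own code.
public class Bm25Sketch {
    // Assumed free parameters of the BM25 formula.
    static final double K1 = 1.2;
    static final double B = 0.75;

    // Hypothetical tokenizer: lowercase, split on non-word characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one BM25 score per document for the given query.
    static double[] score(List<String> docs, String query) {
        int n = docs.size();
        List<Map<String, Integer>> termFreqs = new ArrayList<>();
        Map<String, Integer> docFreqs = new HashMap<>();
        int[] lengths = new int[n];
        double totalLength = 0;

        // Collect per-document term frequencies and corpus-wide document frequencies.
        for (int i = 0; i < n; i++) {
            List<String> tokens = tokenize(docs.get(i));
            Map<String, Integer> tf = new HashMap<>();
            for (String t : tokens) {
                tf.merge(t, 1, Integer::sum);
            }
            for (String t : tf.keySet()) {
                docFreqs.merge(t, 1, Integer::sum);
            }
            termFreqs.add(tf);
            lengths[i] = tokens.size();
            totalLength += tokens.size();
        }
        double avgLength = n == 0 ? 0 : totalLength / n;

        double[] scores = new double[n];
        for (String term : tokenize(query)) {
            int df = docFreqs.getOrDefault(term, 0);
            if (df == 0) {
                continue; // term occurs in no document
            }
            // Smoothed inverse document frequency; always positive.
            double idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
            for (int i = 0; i < n; i++) {
                int f = termFreqs.get(i).getOrDefault(term, 0);
                if (f == 0) {
                    continue;
                }
                // Length normalization: longer-than-average documents are penalized.
                double norm = K1 * (1 - B + B * lengths[i] / avgLength);
                scores[i] += idf * f * (K1 + 1) / (f + norm);
            }
        }
        return scores;
    }

    // Document indices ordered best-first, i.e. "best results returned first".
    static List<Integer> rank(double[] scores) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            order.add(i);
        }
        order.sort((a, b) -> Double.compare(scores[b], scores[a]));
        return order;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
            "the quick brown fox",
            "lucene is a search library",
            "search search search engine");
        double[] scores = score(docs, "search");
        for (int i : rank(scores)) {
            System.out.println(i + "\t" + scores[i]);
        }
    }
}
```

A document that repeats a query term more often and is shorter than average scores higher, and a document that contains none of the query terms scores zero. A pluggable ranking model amounts to swapping the scoring function in the inner loop for a different one.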

Search performance of Apache Lucene is tracked in multiple places.

Cross-Platform Solution

Utility tools

  • integrated desktop GUI tool (Luke): a utility for browsing, searching and maintaining indexes and documents. It can be started with "bin/luke.{sh|cmd}".