LuceneTM Core News

31 October 2014 - Lucene Core 4.10.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.2 includes 2 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details, and Happy Halloween!

29 September 2014 - Lucene Core 4.10.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.1 includes 7 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

22 September 2014 - Lucene Core 4.9.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.9.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.9.1 includes 7 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

03 September 2014 - Lucene Core 4.10.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.10.0 Release Highlights:

  • New TermAutomatonQuery using an automaton for proximity queries. http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html

  • New OrdsBlockTree terms dictionary supporting ord lookup.

  • Simplified matchVersion handling for Analyzers with new setVersion method, as well as Analyzer constructors not requiring Version.

  • Fixed possible corruption when opening a 3.x index with NRT reader.

  • Fixed edge case in StandardTokenizer that caused extremely slow parsing times with long text which partially matched grammar rules.

25 June 2014 - Lucene Core 4.9.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.9.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.9.0 Release Highlights:

  • New Terms.getMin/Max methods to retrieve the lowest and highest terms per field.

  • New IDVersionPostingsFormat, optimized for ID lookups that associate a monotonically increasing version per ID.

  • Atomic update of a set of doc values fields.

  • Numerous optimizations for doc values search-time performance.

  • New (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

  • New SORTED_NUMERIC docvalues type for efficient processing of multi-valued numeric fields.

  • Indexer passes previous token stream for easier reuse.

  • MoreLikeThis accepts multiple values per field.

  • All classes that estimate their RAM usage now implement a new Accountable interface.

  • Lucene files are now written by (File)OutputStream on all platforms, completely disallowing seeking with simplified IO APIs.

  • Improve the confusing error message when MMapDirectory cannot create a new map.

20 May 2014 - Lucene Core 4.8.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.8.1 includes 15 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

28 April 2014 - Apache Lucene 4.8.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.8.0 Release Highlights:

  • Apache Lucene now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Lucene).

  • Apache Lucene is fully compatible with Java 8.

  • All index files now store end-to-end checksums, which are now validated during merging and reading. This ensures that corruptions caused by any bit-flipping hardware problems or bugs in the JVM can be detected earlier. For full detection be sure to enable all checksums during merging (it's disabled by default).

  • Lucene has a new Rescorer/QueryRescorer API to perform second-pass rescoring or reranking of search results using more expensive scoring functions after first-pass hit collection.

  • AnalyzingInfixSuggester now supports near-real-time autosuggest.

  • Simplified impact-sorted postings (using SortingMergePolicy and EarlyTerminatingCollector) to use Lucene's Sort class to express the sort order.

  • Bulk scoring and normal iterator-based scoring were separated, so some queries can do bulk scoring more effectively.

  • Switched to MurmurHash3 to hash terms during indexing.

  • IndexWriter now supports updating of binary doc value fields.

  • HunspellStemFilter now uses 10 to 100x less RAM. It also loads all known OpenOffice dictionaries without error.

  • Lucene now also fsyncs the directory metadata on commits, if the operating system and file system allow it (Linux, MacOSX are known to work).

  • Lucene now uses Java 7 file system functions under the hood, so index files can be deleted on Windows, even when readers are still open.

  • A serious bug in NativeFSLockFactory was fixed, which could allow multiple IndexWriters to acquire the same lock. The lock file is no longer deleted from the index directory even when the lock is not held.

  • Various bugfixes and optimizations since the 4.7.2 release.

15 April 2014 - Lucene Core 4.7.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.7.2 includes 2 bug fixes, including a possible index corruption with near-realtime search.

See the CHANGES.txt file included with the release for a full list of changes and further details.

02 April 2014 - Lucene Core 4.7.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.7.1 includes 14 bug fixes; one build improvement; and one change in runtime behavior: AutomatonQuery.equals is no longer implemented as "accepts same language".

See the CHANGES.txt file included with the release for a full list of changes and further details.

12 March 2014 - Apache Lucene 4.8 will require Java 7

The Apache Lucene committers decided with a large majority on the vote to require Java 7 for the next minor release of Apache Lucene (version 4.8)!

The next release will also contain some improvements for Java 7:

  • Better file handling (especially on Windows) in the directory implementations. Files can now be deleted on windows, although the index is still open - like it was always possible on Unix environments (delete on last close semantics).

  • Speed improvements in sorting comparators: Sorting now uses Java 7's own comparators for integer and long sorts, which are highly optimized by the Hotspot VM.

If you want to stay up-to-date with Lucene and Solr, you should upgrade your infrastructure to Java 7. Please be aware that you must use at least use Java 7u1. The recommended version at the moment is Java 7u25. Later versions like 7u40, 7u45,... have a bug causing index corrumption. Ideally use the Java 7u60 prerelease, which has fixed this bug. Once 7u60 is out, this will be the recommended version. In addition, there is no more Oracle/BEA JRockit available for Java 7, use the official Oracle Java 7. JRockit was never working correctly with Lucene/Solr (causing index corrumption), so this should not be an issue. Please also review our list of JVM bugs: http://wiki.apache.org/lucene-java/JavaBugs

EDIT (as of 15 April 2014): The recently released Java 7u55 fixes the above bug causing index corrumption. This version is now the recommended version for running Apache Lucene.

26 February 2014 - Lucene Core 4.7 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.7 Release Highlights:

  • When sorting by String (SortField.STRING), you can now specify whether missing values should be sorted first (the default), or last.

  • Add two memory resident dictionaries (FST terms dictionary and FSTOrd terms dictionary) to improve primary key lookups. The PostingsBaseFormat API is also changed so that term dictionaries get the ability to block encode term metadata, and all dictionary implementations can now plug in any PostingsBaseFormat.

  • NRT support for file systems that do not have delete on last close or cannot delete while referenced semantics.

  • Add LongBitSet for managing more than 2.1B bits (otherwise use FixedBitSet).

  • Speed up Lucene range faceting from O(N) per hit to O(log(N)) per hit using segment trees.

  • Add SearcherTaxonomyManager over search and taxonomy index directories (i.e. not only NRT).

  • Drilling down or sideways on a Lucene facet range (using Range.getFilter()) is now faster for costly filters (uses random access, not iteration); range facet counts now accept a fast-match filter to avoid computing the value for documents that are out of bounds, e.g. using a bounding box filter with distance range faceting.

  • Add Analyzer for Kurdish.

  • Add Payload support to FileDictionary (Suggest) and make it more configurable.

  • Add a new BlendedInfixSuggester, which is like AnalyzingInfixSuggester but boosts suggestions that matched tokens with lower positions.

  • Add SimpleQueryParser: parser for human-entered queries.

  • Add multitermquery (wildcards,prefix,etc) to PostingsHighlighter.

  • Upgrade to Spatial4j 0.4.1: Parses WKT (including ENVELOPE) with extension BUFFER; buffering a point results in a Circle. JTS isn't needed for WKT any more but remains required for Polygons. New Shapes: ShapeCollection and BufferedLineString.

  • Add spatial SerializedDVStrategy that serializes a binary representation of a shape into BinaryDocValues. It supports exact geometry relationship calculations.

  • Various bugfixes and optimizations since the 4.6.1 release.

28 January 2014 - Lucene Core 4.6.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.6.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains a handful of bug fixes. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

24 November 2013 - Lucene Core 4.6 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.6

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.6 Release Highlights:

  • Added support for NumericDocValues field updates (without re-indexing the document) through IndexWriter.updateNumericDocValue(Term, String, Long).

  • New FreeTextSuggester can predict the next word using a simple ngram language model useful for "long tail" suggestions.

  • A new expression module allows for customized ranking with script-like syntax.

  • A new DirectDocValuesFormat can hold all doc values in heap as uncompressed java native arrays.

  • Term.hasFreqs can now determine if a given field indexed per-doc term frequencies.

  • Various bugfixes and optimizations since the 4.5.1 release.

24 October 2013 - Lucene Core 4.5.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.5.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains a handful of bug fixes. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.5.1 Release Highlights:

  • Lucene 4.5.1 includes 8 bug fixes.

5 October 2013 - Lucene Core 4.5 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.5

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.5 Release Highlights:

  • Added support for missing values to DocValues fields through AtomicReader.getDocsWithField.

  • Lucene 4.5 has a new Lucene45Codec with Lucene45DocValues, supporting missing values and with most datastructures residing off-heap.

  • New in-memory DocIdSet implementations which are especially better than FixedBitSet on small sets: WAH8DocIdSet, PFORDeltaDocIdSet and EliasFanoDocIdSet.

  • CachingWrapperFilter now caches filters with WAH8DocIdSet by default, which has the same memory usage as FixedBitSet in the worst case but is smaller and faster on small sets.

  • TokenStreams now set the position increment in end(), so we can handle trailing holes.

  • IndexWriter no longer clones the given IndexWriterConfig.

  • Various bugfixes and optimizations since the 4.4 release.