Lucene™ Core News

20 September 2016, Apache Lucene™ 6.2.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.2.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Highlights of this Lucene release include:

  • LUCENE-7417: The standard Highlighter could throw an IllegalArgumentException when trying to highlight a query containing a degenerate case of a MultiPhraseQuery with one term.

  • LUCENE-7440: Document id skipping (PostingsEnum.advance) could throw an ArrayIndexOutOfBoundsException on large index segments (>1.8B docs) with large skips.

  • LUCENE-7318: Fix backwards compatibility issues around StandardAnalyzer and its components, introduced with Lucene 6.2.0. The moved classes were restored in their original packages: LowerCaseFilter and StopFilter, as well as several utility classes.

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/java/6.2.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

09 September 2016 - Apache Lucene 5.5.3 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.3

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/java/5.5.3

See the CHANGES.txt file included with the release for a full list of changes and further details.

25 August 2016, Apache Lucene 6.2.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.2.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/core/6_2_0/changes/Changes.html

Highlights of this Lucene release include:

  • The CREATE_NEW flag is passed when creating a file to ensure Lucene is really write-once

  • Index numeric ranges (min and max value in a single field) and search by overlapping range

  • IndexWriter methods return a sequence number indicating effective order of operations across threads

  • UkrainianMorfologikAnalyzer is a new dictionary based analyzer for the Ukrainian language

  • The Polygon class can now be created from a GeoJSON string

  • Compound file creation now verifies checksum of its component files

  • Index time sorting is now a core feature, and supports dimensional points

  • StandardAnalyzer is moved to core and is the default analyzer

  • MatchNoDocsQuery now includes the reason it was created

  • QueryParser can now be told to not pre-split on whitespace

  • MMapDirectory tries harder to prevent SIGSEGV if buggy code tries to execute searches after the index was closed, but it's still best effort

  • MMapDirectory no longer allocates weak references to ease garbage collection

  • Conjunction (MUST, FILTER) queries are faster

  • Dimensional points have much faster (~40%) flush time and use less space in the index
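
The write-once behaviour in the first highlight rests on the JDK's StandardOpenOption.CREATE_NEW, which makes file creation fail when the file already exists. The following is a minimal sketch using only java.nio; the class name, method names and file name are illustrative assumptions and are not taken from Lucene's own code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class WriteOnceDemo {
    /** Creates the file and writes data; fails if the file already exists. */
    static void writeOnce(Path file, byte[] data) throws IOException {
        try (OutputStream out = Files.newOutputStream(file,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(data);
        }
    }

    /** Returns true if a second write to the same path is rejected. */
    static boolean secondWriteRejected() {
        try {
            Path dir = Files.createTempDirectory("write-once");
            Path file = dir.resolve("segment_1.bin"); // hypothetical file name
            try {
                writeOnce(file, new byte[] {1, 2, 3});
                try {
                    writeOnce(file, new byte[] {4, 5, 6});
                    return false; // the file was silently overwritten
                } catch (FileAlreadyExistsException expected) {
                    return true;  // CREATE_NEW refused to reuse the file
                }
            } finally {
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second write rejected: " + secondWriteRejected());
    }
}
```

With CREATE_NEW the existence check and the creation happen as one atomic step, so an accidental overwrite surfaces as an exception rather than as silent data loss.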

25 June 2016, Apache Lucene 5.5.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 11 bug fixes since the 5.5.1 release.

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/java/5.5.2

See the CHANGES.txt file included with the release for a full list of changes and further details.

17 June 2016, Apache Lucene 6.1.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.1.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/core/6_1_0/changes/Changes.html

Lucene 6.1.0 Release Highlights:

New features

  • Numerous improvements to LatLonPoint, for indexing a latitude/longitude point and searching by polygon, distance or box, or finding nearest neighbors

  • Geo3D now has simple APIs for creating common shape queries, matching LatLonPoint

Optimizations

  • Faster indexing and searching of points.

  • Faster geo-spatial indexing and searching for LatLonPoint, Geo3D and GeoPoint (see http://home.apache.org/~mikemccand/geobench.html )

  • HardlinkCopyDirectoryWrapper optimizes file copies using hard links

  • In case of contention, the query cache now prefers returning an uncached Scorer rather than waiting on a lock.

Bug fixes

  • BooleanQuery could sometimes assign too low scores to ranges of documents that matched a single clause.

  • Doc values updates could sometimes be applied in the wrong order.

28 May 2016, Apache Lucene 6.0.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.0.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 10 bug fixes since the 6.0.0 release, and one new feature:

  • Spatial-extras DateRangePrefixTree's Calendar is now configurable, to e.g. clear the Gregorian Change Date. Also, toString(cal) is now identical to DateTimeFormatter.ISO_INSTANT.
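
Clearing the Gregorian Change Date can be done on a plain JDK GregorianCalendar by moving the cutover to the earliest representable date. The sketch below shows that step and the ISO_INSTANT rendering using only the JDK. How such a calendar is handed to DateRangePrefixTree is deliberately not shown, and the class and method names are illustrative assumptions.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

final class ProlepticCalendarDemo {
    /** Builds a UTC calendar with the Julian-to-Gregorian cutover removed. */
    static GregorianCalendar prolepticUtcCalendar() {
        GregorianCalendar cal =
                new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        // Date(Long.MIN_VALUE) yields a pure (proleptic) Gregorian calendar.
        cal.setGregorianChange(new Date(Long.MIN_VALUE));
        cal.clear();
        return cal;
    }

    /** Formats the calendar's instant with DateTimeFormatter.ISO_INSTANT. */
    static String toIsoInstant(GregorianCalendar cal) {
        return DateTimeFormatter.ISO_INSTANT.format(
                Instant.ofEpochMilli(cal.getTimeInMillis()));
    }

    public static void main(String[] args) {
        GregorianCalendar cal = prolepticUtcCalendar();
        cal.setTimeInMillis(0L);
        System.out.println(toIsoInstant(cal));
    }
}
```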

The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/java/6.0.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

5 May 2016 - Apache Lucene 5.5.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix since the 5.5.0 release. The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/core/5.5.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

8 April 2016 - Apache Lucene 6.0.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.0.0.

The release can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html

Release Highlights:

  • Java 8 is the minimum Java version required.

  • Dimensional points, replacing legacy numeric fields, provide fast and space-efficient support for both single- and multi-dimension range and shape filtering. This includes numeric (int, float, long, double), InetAddress, BigInteger and binary range filtering, as well as geo-spatial shape search over indexed 2D LatLonPoints. See this blog post for details. Dependent classes and modules (e.g., MemoryIndex, Spatial Strategies, Join module) have been refactored to use new point types.

  • Lucene classification module now works on Lucene Documents using a KNearestNeighborClassifier or SimpleNaiveBayesClassifier.

  • The spatial module no longer depends on third-party libraries. Previous spatial classes have been moved to a new spatial-extras module.

  • Spatial4j has been updated to a new 0.6 version hosted by locationtech.

  • TermsQuery performance is boosted by a more aggressive default query caching policy.

  • IndexSearcher's default Similarity is now changed to BM25Similarity.

  • Easier method of defining custom CharTokenizer instances.
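
For readers unfamiliar with BM25, the scoring model behind the new default Similarity, here is a minimal sketch of the textbook formula with k1 = 1.2 and b = 0.75. It is an illustration under stated assumptions, not Lucene's implementation: BM25Similarity additionally stores document lengths in a lossy compressed form, and the class and method names below are hypothetical.

```java
final class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation parameter
    static final double B = 0.75;  // strength of document-length normalization

    /** Inverse document frequency: log(1 + (N - n + 0.5) / (n + 0.5)). */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 contribution of a single term to a single document's score. */
    static double score(double freq, double docLength, double avgDocLength,
                        long docCount, long docFreq) {
        double lengthNorm = 1.0 - B + B * (docLength / avgDocLength);
        double tf = (freq * (K1 + 1.0)) / (freq + K1 * lengthNorm);
        return idf(docCount, docFreq) * tf;
    }

    public static void main(String[] args) {
        // The same term frequency scores lower in a longer document.
        System.out.println(score(3, 100, 100, 1000, 10));
        System.out.println(score(3, 400, 100, 1000, 10));
    }
}
```

Unlike classic TF-IDF, the term-frequency factor saturates: repeating a term many times approaches, but never exceeds, idf * (k1 + 1).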

22 February 2016 - Apache Lucene 5.5.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.0

The release can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html

Release highlights:

  • JoinUtil.createJoinQuery can now join on numeric doc values fields

  • BlendedInfixSuggester now has an exponential reciprocal scoring model, to more strongly favor suggestions with matches closer to the beginning

  • CustomAnalyzer has improved (compile time) type safety

  • DFISimilarity implements the divergence from independence scoring model

  • Fully wrap any other merge policy using MergePolicyWrapper

  • Sandbox geo point queries have graduated into the spatial module, and now use a more efficient binary term encoding for smaller index size, faster indexing, and decreased search-time heap usage

  • BooleanQuery performs some new query optimizations

  • TermsQuery constructors are more GC efficient
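
To illustrate why an exponential reciprocal model favors early matches more strongly, the sketch below compares it with a plain reciprocal weighting. It assumes the weight has the form 1 / (1 + position)^exponent with an exponent of 2; the exact formula and default used by BlendedInfixSuggester should be checked against its source, and the class and method names here are hypothetical.

```java
final class PositionWeightSketch {
    /** Weight for a match starting at the given 0-based token position. */
    static double exponentialReciprocal(int position, double exponent) {
        return 1.0 / Math.pow(position + 1.0, exponent);
    }

    /** Plain reciprocal weighting, shown for comparison. */
    static double reciprocal(int position) {
        return 1.0 / (position + 1.0);
    }

    public static void main(String[] args) {
        for (int position = 0; position < 4; position++) {
            System.out.printf("position=%d reciprocal=%.4f exponential=%.4f%n",
                    position, reciprocal(position),
                    exponentialReciprocal(position, 2.0));
        }
    }
}
```

Both weightings agree at position 0, but the exponential form drops off faster, so a suggestion whose match starts later in the text is penalized more heavily.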

23 January 2016 - Apache Lucene 5.3.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.3.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix since the 5.3.1 release. The release is available for immediate download at:

http://www.apache.org/dyn/closer.lua/lucene/core/5.3.2

See the CHANGES.txt file included with the release for a full list of changes and further details.

23 January 2016 - Apache Lucene 5.4.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.4.1

The release can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html

This release contains an important fix for a corruption bug that was introduced in version 5.4.0. If you are on 5.4.0 and using BINARY, SORTED_NUMERIC or SORTED_SET doc values, upgrading to 5.4.1 is strongly recommended.

See the CHANGES.txt file included with the release for a full list of changes and further details.

14 December 2015 - Apache Lucene 5.4.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.4.0

The release can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html

Highlights of this Lucene release include:

API Changes

  • Query.getBoost and Query.setBoost are deprecated in favour of the new BoostQuery
  • The Filter class is deprecated in favour of FILTER clauses in a BooleanQuery
  • DefaultSimilarity has been renamed to ClassicSimilarity to prepare for the move to BM25 in Lucene 6

New features

  • New Serbian token filter
  • New DecimalDigitFilter, to fold Unicode digits to Latin digits
  • New UnicodeWhitespaceTokenizer, that uses Unicode's whitespace definition and splits on NBSP
  • New GeoPointDistanceRangeQuery to search for geo-points within a ring
  • Query caching is now enabled by default in IndexSearcher, use IndexSearcher.setQueryCache(null) to disable

Optimizations

  • MatchAllDocsQuery got faster
  • Doc values now use less memory for multi-valued fields and less disk in case of sparse fields
  • Two-phase iterators got a match cost API so that the costly bits can be checked last

Bug fixes

  • PatternTokenizer no longer hangs onto heap sized to the maximum input string it's ever seen.

24 September 2015 - Apache Lucene 5.3.1 and Apache Solr 5.3.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.3.1

The release can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html

Highlights of this Lucene release include:

Bug Fixes

  • Remove classloader hack in MorfologikFilter
  • UsageTrackingQueryCachingPolicy no longer caches trivial queries like MatchAllDocsQuery
  • Fixed BoostingQuery to rewrite wrapped queries

24 August 2015, Apache Lucene™ 5.3.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.3.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://www.apache.org/dyn/closer.lua/lucene/java/5.3.0

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. It can also be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 5.3.0 Release Highlights:

API Changes

  • PhraseQuery and BooleanQuery are now immutable

New features

  • Added a new org.apache.lucene.search.join.CheckJoinIndex class that can be used to validate that an index has an appropriate structure to run join queries
  • Added a new BlendedTermQuery to blend statistics across several terms
  • New common suggest API that mirrors Lucene's Query/IndexSearcher APIs for Document based suggester.
  • IndexWriter can now be initialized from an already open near-real-time or non-NRT reader
  • Add experimental range tree doc values format and queries, based on a 1D version of the spatial BKD tree, for a faster and smaller alternative to postings-based numeric and binary term filtering. Range trees can also handle values larger than 64 bits.
  • Added GeoPointField, GeoPointInBBoxQuery, GeoPointInPolygonQuery for simple "indexed lat/lon point in bbox/shape" searching
  • Added experimental BKD geospatial tree doc values format and queries, for fast "bbox/polygon contains lat/lon points"
  • Use doc values to post-filter GeoPointField hits that fall in boundary cells, resulting in smaller index, faster searches and less heap used for each query

Optimizations

  • Reduce RAM usage of FieldInfos, and speed up lookup by number, by using an array instead of TreeMap except in very sparse cases
  • Faster intersection of the terms dictionary with very finite automata, which can be generated, e.g., by simple regexp queries
  • Various bugfixes and optimizations since the 5.2.0 release.

See the CHANGES.txt file included with the release for a full list of changes and further details.

15 June 2015, Apache Lucene™ 5.2.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.2.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains various bug fixes and optimizations since the 5.2.0 release.

The release is available for immediate download at: http://www.apache.org/dyn/closer.lua/lucene/java/5.2.1

Lucene 5.2.1 includes 3 bug fixes:

  • Fix class loading deadlock relating to Codec initialization, default codec and SPI discovery.
  • NRT readers now reflect a new commit even if there is no change to the commit user data
  • Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms

See the CHANGES.txt file included with the release for a full list of changes and further details.

7 June 2015 - Lucene Core 5.2.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.2.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://www.apache.org/dyn/closer.lua/lucene/java/5.2.0

Lucene 5.2.0 release highlights:

  • Span queries now share document conjunction/intersection code with boolean queries, and use two-phased iterators for faster intersection by avoiding loading positions in certain cases.

  • Added two-phase support to SpanNotQuery, and SpanPositionCheckQuery and its subclasses: SpanPositionRangeQuery, SpanPayloadCheckQuery, SpanNearPayloadCheckQuery, SpanFirstQuery.

  • Added a new query time join to the join module that uses global ordinals, which is faster for subsequent joins between reopens.

  • New CompositeSpatialStrategy combines speed of RPT with accuracy of SDV. Includes optimized Intersect predicate to avoid many geometry checks. Uses TwoPhaseIterator.

  • New LimitTokenOffsetFilter that limits tokens to those before a configured maximum start offset.

  • New spatial PackedQuadPrefixTree, a generally more efficient choice than QuadPrefixTree, especially for high precision shapes. When used, you should typically disable RPT's pruneLeafyBranches option.

  • Expressions now support bindings keys that look like zero arg functions

  • Added SpanWithinQuery and SpanContainingQuery, which return spans inside of or containing another span.

  • New Spatial "Geo3d" API with partial Spatial4j integration. It is a set of shapes implemented using 3D planar geometry for calculating spatial relations on the surface of a sphere. Shapes include Point, BBox, Circle, Path (buffered line string), and Polygon.

  • Various bugfixes and optimizations since the 5.1.0 release.

See the CHANGES.txt file included with the release for a full list of changes and further details.

14 April 2015 - Lucene Core 5.1.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.1.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://www.apache.org/dyn/closer.lua/lucene/java/5.1.0

Lucene 5.1.0 includes 9 new features, 10 bug fixes, and 24 optimizations / other changes from 18 unique contributors.

See the CHANGES.txt file included with the release for a full list of changes and further details.

5 March 2015 - Lucene Core 4.10.4 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.4

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://www.apache.org/dyn/closer.lua/lucene/java/4.10.4

Lucene 4.10.4 includes 13 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

20 February 2015 - Lucene™ 5.0.0 core available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 5.0 Release Highlights:

Stronger index safety

  • All file access now uses Java’s NIO.2 APIs which give Lucene stronger index safety in terms of better error handling and safer commits.

  • Every Lucene segment now stores a unique id per-segment and per-commit to aid in accurate replication of index files.

  • During merging, IndexWriter now always checks the incoming segments for corruption before merging. This can mean, on upgrading to 5.0.0, that merging may uncover long-standing latent corruption in an older 4.x index.

Reduced heap usage

  • Lucene now supports random-writable and advance-able sparse bitsets (RoaringDocIdSet and SparseFixedBitSet), so the heap required is in proportion to how many bits are set, not how many total documents exist in the index.

  • Heap usage during IndexWriter merging is also much lower with the new Lucene50Codec, since doc values and norms for the segments being merged are no longer fully loaded into heap for all fields; now they are loaded for the one field currently being merged, and then dropped.

  • The default norms format now uses sparse encoding when appropriate, so indices that enable norms for many sparse fields will see a large reduction in required heap at search time.

  • 5.0 has a new API to print a tree structure showing a recursive breakdown of which parts are using how much heap.
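
The idea behind the sparse bitsets in the first item above, heap proportional to the number of set bits rather than to the number of documents, can be sketched as follows. This is a deliberately simplified illustration and not how RoaringDocIdSet or SparseFixedBitSet are implemented: it keeps only the 64-bit words that contain a set bit in a hash map, and it does not implement advance-style iteration. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class SparseBitSetSketch {
    // Only 64-bit words that contain at least one set bit are stored.
    private final Map<Integer, Long> words = new HashMap<>();

    /** Marks the given non-negative document id as set. */
    void set(int docId) {
        words.merge(docId >>> 6, 1L << (docId & 63), (a, b) -> a | b);
    }

    /** Returns true if the given document id was set. */
    boolean get(int docId) {
        Long word = words.get(docId >>> 6);
        return word != null && (word & (1L << (docId & 63))) != 0;
    }

    /** Number of set bits. */
    int cardinality() {
        int count = 0;
        for (long word : words.values()) {
            count += Long.bitCount(word);
        }
        return count;
    }

    /** Number of words actually held in memory. */
    int allocatedWords() {
        return words.size();
    }

    public static void main(String[] args) {
        SparseBitSetSketch bits = new SparseBitSetSketch();
        bits.set(5);
        bits.set(1_000_000_000);
        System.out.println(bits.cardinality() + " bits in "
                + bits.allocatedWords() + " words");
    }
}
```

A dense bitset covering a billion documents needs roughly 125 MB regardless of content, whereas this structure stores two words for the two documents set in the example.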

Other features

  • FieldCache is gone (moved to a dedicated UninvertingReader in the misc module). This means when you intend to sort on a field, you should index that field using doc values, which is much faster and less heap consuming than FieldCache.

  • Tokenizers and Analyzers no longer require Reader on init.

  • NormsFormat now gets its own dedicated NormsConsumer/Producer

  • SortedSetSortField, used to sort on a multi-valued field, is promoted from sandbox to Lucene's core.

  • PostingsFormat now uses a "pull" API when writing postings, just like doc values. This is powerful because you can do things in your postings format that require making more than one pass through the postings such as iterating over all postings for each term to decide which compression format it should use.

  • New DateRangeField type enables indexing and searching of date ranges, particularly multi-valued ones.

  • A new ExitableDirectoryReader extends FilterDirectoryReader and enables exiting requests that take too long to enumerate over terms.

  • Suggesters can now be built from multi-valued fields, as DocumentDictionary now enumerates each value of a multi-valued field separately.

  • ConcurrentMergeScheduler detects whether the index is on SSD or not and does a better job defaulting its settings. This only works on Linux for now; other operating systems will continue to use the previous defaults (tuned for spinning disks).

  • Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate limit IO writes for each merge depending on incoming merge rate.

  • CustomAnalyzer has been added, which allows you to configure analyzers as you would in Solr's index schema. This class has a builder API to configure Tokenizers, TokenFilters, and CharFilters based on their SPI names and parameters as documented by the corresponding factories.

  • Memory index now supports payloads.

  • Added a filter cache with a usage tracking policy that caches filters based on frequency of use.

  • The default codec has an option to control BEST_SPEED or BEST_COMPRESSION for stored fields.

  • Stored fields are merged more efficiently, especially when upgrading from previous versions or using SortingMergePolicy

NOTE: Lucene 5 no longer supports the Lucene 3.x index format. Opening such indexes will result in an IndexFormatTooOldException. It is recommended to either reindex all your data, or upgrade the old indexes with the IndexUpgrader tool of the latest Lucene 4 version (4.10.x). Those indexes can then be read (see next section) with Lucene 5.

To read more about the changes, also see: http://blog.mikemccandless.com/2014/11/apache-lucene-500-is-coming.html

Please read CHANGES.txt and MIGRATE.txt for a full list of new features and notes on upgrading.

29 December 2014 - Lucene Core 4.10.3 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.3

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.3 includes 12 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details, and Happy Holidays!

31 October 2014 - Lucene Core 4.10.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.2 includes 2 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details, and Happy Halloween!

29 September 2014 - Lucene Core 4.10.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.1 includes 7 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

22 September 2014 - Lucene Core 4.9.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.9.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.9.1 includes 7 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

03 September 2014 - Lucene Core 4.10.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.10.0 Release Highlights:

  • New TermAutomatonQuery using an automaton for proximity queries. http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html

  • New OrdsBlockTree terms dictionary supporting ord lookup.

  • Simplified matchVersion handling for Analyzers with new setVersion method, as well as Analyzer constructors not requiring Version.

  • Fixed possible corruption when opening a 3.x index with NRT reader.

  • Fixed edge case in StandardTokenizer that caused extremely slow parsing times with long text which partially matched grammar rules.

25 June 2014 - Lucene Core 4.9.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.9.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.9.0 Release Highlights:

  • New Terms.getMin/Max methods to retrieve the lowest and highest terms per field.

  • New IDVersionPostingsFormat, optimized for ID lookups that associate a monotonically increasing version per ID.

  • Atomic update of a set of doc values fields.

  • Numerous optimizations for doc values search-time performance.

  • New (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

  • New SORTED_NUMERIC docvalues type for efficient processing of multi-valued numeric fields.

  • Indexer passes previous token stream for easier reuse.

  • MoreLikeThis accepts multiple values per field.

  • All classes that estimate their RAM usage now implement a new Accountable interface.

  • Lucene files are now written by (File)OutputStream on all platforms, completely disallowing seeking with simplified IO APIs.

  • Improve the confusing error message when MMapDirectory cannot create a new map.

20 May 2014 - Lucene Core 4.8.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.8.1 includes 15 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

28 April 2014 - Apache Lucene 4.8.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.8.0 Release Highlights:

  • Apache Lucene now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Lucene).

  • Apache Lucene is fully compatible with Java 8.

  • All index files now store end-to-end checksums, which are now validated during merging and reading. This ensures that corruptions caused by any bit-flipping hardware problems or bugs in the JVM can be detected earlier. For full detection be sure to enable all checksums during merging (it's disabled by default).

  • Lucene has a new Rescorer/QueryRescorer API to perform second-pass rescoring or reranking of search results using more expensive scoring functions after first-pass hit collection.

  • AnalyzingInfixSuggester now supports near-real-time autosuggest.

  • Simplified impact-sorted postings (using SortingMergePolicy and EarlyTerminatingCollector) to use Lucene's Sort class to express the sort order.

  • Bulk scoring and normal iterator-based scoring were separated, so some queries can do bulk scoring more effectively.

  • Switched to MurmurHash3 to hash terms during indexing.

  • IndexWriter now supports updating of binary doc value fields.

  • HunspellStemFilter now uses 10 to 100x less RAM. It also loads all known OpenOffice dictionaries without error.

  • Lucene now also fsyncs the directory metadata on commits, if the operating system and file system allow it (Linux and Mac OS X are known to work).

  • Lucene now uses Java 7 file system functions under the hood, so index files can be deleted on Windows, even when readers are still open.

  • A serious bug in NativeFSLockFactory was fixed, which could allow multiple IndexWriters to acquire the same lock. The lock file is no longer deleted from the index directory even when the lock is not held.

  • Various bugfixes and optimizations since the 4.7.2 release.
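
To illustrate the end-to-end checksum highlight above, here is a minimal sketch of the general idea: a checksum is written as a footer after the data and recomputed on read, so a single flipped bit is detected. This is not Lucene's file format or API. The use of java.util.zip.CRC32, the 8-byte footer layout, and the names ChecksumFooterSketch, withFooter, and verify are assumptions made for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration; not Lucene's on-disk format.
class ChecksumFooterSketch {

    // Appends an 8-byte CRC32 footer to the payload (assumed layout).
    static byte[] withFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Long.BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // Recomputes the checksum over the payload and compares it to the footer.
    static boolean verify(byte[] file) {
        if (file.length < Long.BYTES) {
            return false; // too short to contain a footer
        }
        int payloadLength = file.length - Long.BYTES;
        CRC32 crc = new CRC32();
        crc.update(file, 0, payloadLength);
        long stored = ByteBuffer.wrap(file, payloadLength, Long.BYTES).getLong();
        return stored == crc.getValue();
    }

    public static void main(String[] args) {
        byte[] file = withFooter("some index bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact file verifies: " + verify(file));

        file[3] ^= 0x01; // simulate a single bit flip on disk
        System.out.println("corrupted file verifies: " + verify(file));
    }
}
```

Verifying only when a file is read leaves rarely read files unchecked for a long time, which is why the highlight above recommends also enabling verification during merging.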

15 April 2014 - Lucene Core 4.7.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.7.2 includes 2 bug fixes, including a possible index corruption with near-realtime search.

See the CHANGES.txt file included with the release for a full list of changes and further details.

02 April 2014 - Lucene Core 4.7.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.7.1 includes 14 bug fixes; one build improvement; and one change in runtime behavior: AutomatonQuery.equals is no longer implemented as "accepts same language".

See the CHANGES.txt file included with the release for a full list of changes and further details.

12 March 2014 - Apache Lucene 4.8 will require Java 7

The Apache Lucene committers voted, with a large majority, to require Java 7 for the next minor release of Apache Lucene (version 4.8)!

The next release will also contain some improvements for Java 7:

  • Better file handling (especially on Windows) in the directory implementations. Files can now be deleted on Windows while the index is still open, as has always been possible on Unix environments (delete on last close semantics).

  • Speed improvements in sorting comparators: Sorting now uses Java 7's own comparators for integer and long sorts, which are highly optimized by the Hotspot VM.
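
As a small illustration of the comparator point above, and assuming it refers to Integer.compare and Long.compare (both added in Java 7), the sketch below contrasts them with the older subtraction idiom, which overflows for operands that are far apart. The class and method names are hypothetical, and this is not Lucene's comparator code.

```java
// Hypothetical illustration; not taken from Lucene's sorting code.
class SafeCompareSketch {

    // Pre-Java 7 idiom: the subtraction can overflow and flip the sign.
    static int subtractCompare(int a, int b) {
        return a - b;
    }

    // Java 7 comparison methods: overflow-safe for all inputs.
    static int compareInts(int a, int b) {
        return Integer.compare(a, b);
    }

    static int compareLongs(long a, long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE;
        int b = 1;
        // a is smaller than b, so a correct comparison must be negative.
        System.out.println("subtraction says a > b: " + (subtractCompare(a, b) > 0));
        System.out.println("Integer.compare says a < b: " + (compareInts(a, b) < 0));
        System.out.println("Long.compare says MIN < 1: " + (compareLongs(Long.MIN_VALUE, 1L) < 0));
    }
}
```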

If you want to stay up-to-date with Lucene and Solr, you should upgrade your infrastructure to Java 7. Please be aware that you must use at least Java 7u1. The recommended version at the moment is Java 7u25. Later versions such as 7u40 and 7u45 have a bug causing index corruption. Ideally, use the Java 7u60 prerelease, which has fixed this bug. Once 7u60 is out, it will be the recommended version. In addition, Oracle/BEA JRockit is no longer available for Java 7; use the official Oracle Java 7 instead. JRockit never worked correctly with Lucene/Solr (it caused index corruption), so this should not be an issue. Please also review our list of JVM bugs: http://wiki.apache.org/lucene-java/JavaBugs

EDIT (as of 15 April 2014): The recently released Java 7u55 fixes the above bug causing index corruption. This version is now the recommended version for running Apache Lucene.

26 February 2014 - Lucene Core 4.7 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.7 Release Highlights:

  • When sorting by String (SortField.STRING), you can now specify whether missing values should be sorted first (the default), or last.

  • Add two memory resident dictionaries (FST terms dictionary and FSTOrd terms dictionary) to improve primary key lookups. The PostingsBaseFormat API is also changed so that term dictionaries get the ability to block encode term metadata, and all dictionary implementations can now plug in any PostingsBaseFormat.

  • Add NRT support for file systems that lack delete-on-last-close or cannot-delete-while-referenced semantics.

  • Add LongBitSet for managing more than 2.1B bits (otherwise use FixedBitSet).

  • Speed up Lucene range faceting from O(N) per hit to O(log(N)) per hit using segment trees.

  • Add SearcherTaxonomyManager over search and taxonomy index directories (i.e. not only NRT).

  • Drilling down or sideways on a Lucene facet range (using Range.getFilter()) is now faster for costly filters (uses random access, not iteration); range facet counts now accept a fast-match filter to avoid computing the value for documents that are out of bounds, e.g. using a bounding box filter with distance range faceting.

  • Add Analyzer for Kurdish.

  • Add Payload support to FileDictionary (Suggest) and make it more configurable.

  • Add a new BlendedInfixSuggester, which is like AnalyzingInfixSuggester but boosts suggestions that matched tokens with lower positions.

  • Add SimpleQueryParser: parser for human-entered queries.

  • Add support for multi-term queries (wildcards, prefix, etc.) to PostingsHighlighter.

  • Upgrade to Spatial4j 0.4.1: Parses WKT (including ENVELOPE) with extension BUFFER; buffering a point results in a Circle. JTS isn't needed for WKT any more but remains required for Polygons. New Shapes: ShapeCollection and BufferedLineString.

  • Add spatial SerializedDVStrategy that serializes a binary representation of a shape into BinaryDocValues. It supports exact geometry relationship calculations.

  • Various bugfixes and optimizations since the 4.6.1 release.
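
The range faceting speed-up listed above replaces a per-hit scan over all N ranges with an O(log(N)) lookup. The sketch below is not Lucene's implementation. It illustrates the underlying idea with sorted elementary intervals and a binary search rather than a full segment tree, and the names RangeCountSketch, add, and counts are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical illustration of O(log N) per-hit range counting.
class RangeCountSketch {
    private final long[][] ranges;   // each entry is an inclusive {min, max}
    private final long[] starts;     // sorted start of each elementary interval
    private final int[] leafCounts;  // hits per elementary interval

    RangeCountSketch(long[][] ranges) {
        this.ranges = ranges;
        // Cut the number line at every range start and just past every range end,
        // so each elementary interval lies entirely inside or outside each range.
        TreeSet<Long> cuts = new TreeSet<>();
        cuts.add(Long.MIN_VALUE);
        for (long[] r : ranges) {
            cuts.add(r[0]);
            if (r[1] != Long.MAX_VALUE) {
                cuts.add(r[1] + 1);
            }
        }
        starts = cuts.stream().mapToLong(Long::longValue).toArray();
        leafCounts = new int[starts.length];
    }

    // Per hit: one binary search, independent of how many ranges overlap.
    void add(long value) {
        int idx = Arrays.binarySearch(starts, value);
        if (idx < 0) {
            idx = -idx - 2; // interval whose start precedes the value
        }
        leafCounts[idx]++;
    }

    // Once, after all hits: roll elementary counts up into the requested ranges.
    int[] counts() {
        int[] out = new int[ranges.length];
        for (int i = 0; i < ranges.length; i++) {
            for (int j = 0; j < starts.length; j++) {
                if (starts[j] >= ranges[i][0] && starts[j] <= ranges[i][1]) {
                    out[i] += leafCounts[j];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] ranges = {{0, 10}, {5, 20}, {15, Long.MAX_VALUE}};
        RangeCountSketch counter = new RangeCountSketch(ranges);
        for (long v : new long[] {3, 7, 10, 15, 100, -1}) {
            counter.add(v);
        }
        System.out.println(Arrays.toString(counter.counts()));
    }
}
```

The roll-up in counts() is a plain nested loop to keep the sketch short; a segment tree would make that final step cheaper, but the per-hit cost is already logarithmic in the number of range boundaries.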