Solr™ News

18 August 2014 - Recommendation to update Apache POI in Apache Solr 4.8.0, 4.8.1, and 4.9.0 installations

Apache Solr versions 4.8.0, 4.8.1, and 4.9.0 bundle Apache POI 3.10-beta2 in their binary release tarballs. This version of Apache POI (and all previous ones) is vulnerable to the following issues:

CVE-2014-3529: XML External Entity (XXE) problem in Apache POI's OpenXML parser

Information disclosure: Apache POI uses Java's XML components to parse OpenXML files produced by Microsoft Office products (DOCX, XLSX, PPTX, ...). Applications that accept such files from end users are vulnerable to XML External Entity (XXE) attacks, which allow remote attackers to bypass security restrictions and read arbitrary files via a crafted OpenXML document that provides an XML external entity declaration in conjunction with an entity reference.

CVE-2014-3574: XML Entity Expansion (XEE) problem in Apache POI's OpenXML parser

Denial of service: Apache POI uses Java's XML components and Apache XMLBeans to parse OpenXML files produced by Microsoft Office products (DOCX, XLSX, PPTX, ...). Applications that accept such files from end users are vulnerable to XML Entity Expansion (XEE) attacks ("XML bombs"), which allow remote attackers to consume large amounts of CPU resources.

The Apache POI PMC released a bugfix version (3.10.1) today.

Solr users are affected by these issues if they enable the "Apache Solr Content Extraction Library (Solr Cell)" contrib module from the folder "contrib/extraction" of the release tarball.

Users of Apache Solr are strongly advised to keep the module disabled if they don't use it. Alternatively, users of Apache Solr 4.8.0, 4.8.1, or 4.9.0 can update the affected libraries by replacing the vulnerable JAR files in the distribution folder. Users of previous versions have to update their Solr release first; patching older versions is impossible.

To replace the vulnerable JAR files, follow these steps:

  • Download the Apache POI 3.10.1 binary release.

  • Unzip the archive.

  • Delete the following files in your "solr-4.X.X/contrib/extraction/lib" folder:

    • poi-3.10-beta2.jar
    • poi-ooxml-3.10-beta2.jar
    • poi-ooxml-schemas-3.10-beta2.jar
    • poi-scratchpad-3.10-beta2.jar
    • xmlbeans-2.3.0.jar
  • Copy the following files from the base folder of the Apache POI distribution to the "solr-4.X.X/contrib/extraction/lib" folder:

    • poi-3.10.1-20140818.jar
    • poi-ooxml-3.10.1-20140818.jar
    • poi-ooxml-schemas-3.10.1-20140818.jar
    • poi-scratchpad-3.10.1-20140818.jar
  • Copy "xmlbeans-2.6.0.jar" from POI's "ooxml-lib/" folder to the "solr-4.X.X/contrib/extraction/lib" folder.

  • Verify that the "solr-4.X.X/contrib/extraction/lib" folder no longer contains any files with version number "3.10-beta2".

  • Verify that the folder contains one xmlbeans JAR file with version 2.6.0.
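The steps above can also be scripted. The following is a minimal sketch in Python, not an official tool: the function name replace_poi_jars and its two path arguments are illustrative assumptions, while the JAR file names and the "ooxml-lib" subfolder are taken from the list above. It checks that all replacement files exist before deleting anything, then performs the two verification steps.

```python
import shutil
from pathlib import Path

# Vulnerable files to delete (names from the advisory above).
OLD_JARS = [
    "poi-3.10-beta2.jar",
    "poi-ooxml-3.10-beta2.jar",
    "poi-ooxml-schemas-3.10-beta2.jar",
    "poi-scratchpad-3.10-beta2.jar",
    "xmlbeans-2.3.0.jar",
]
# Replacement files from the base folder of the POI 3.10.1 distribution.
NEW_POI_JARS = [
    "poi-3.10.1-20140818.jar",
    "poi-ooxml-3.10.1-20140818.jar",
    "poi-ooxml-schemas-3.10.1-20140818.jar",
    "poi-scratchpad-3.10.1-20140818.jar",
]
# Replacement file from the "ooxml-lib" subfolder of the POI distribution.
NEW_XMLBEANS_JAR = "xmlbeans-2.6.0.jar"


def replace_poi_jars(solr_lib, poi_dir):
    """Swap the vulnerable JARs in solr_lib for those found in poi_dir.

    solr_lib: path to "solr-4.X.X/contrib/extraction/lib"
    poi_dir:  path to the unzipped Apache POI 3.10.1 binary release
    Returns the sorted file names left in solr_lib.
    """
    solr_lib = Path(solr_lib)
    poi_dir = Path(poi_dir)

    # Locate every replacement first, so nothing is deleted
    # if the download turns out to be incomplete.
    sources = [poi_dir / name for name in NEW_POI_JARS]
    sources.append(poi_dir / "ooxml-lib" / NEW_XMLBEANS_JAR)
    missing = [str(path) for path in sources if not path.is_file()]
    if missing:
        raise FileNotFoundError("missing replacement JARs: " + ", ".join(missing))

    # Delete the vulnerable files that are present.
    for name in OLD_JARS:
        old = solr_lib / name
        if old.exists():
            old.unlink()

    # Copy the fixed files into the Solr Cell lib folder.
    for src in sources:
        shutil.copy2(src, solr_lib / src.name)

    # Verification: no "3.10-beta2" files, exactly one xmlbeans JAR (2.6.0).
    leftovers = [p.name for p in solr_lib.iterdir() if "3.10-beta2" in p.name]
    xmlbeans = sorted(p.name for p in solr_lib.glob("xmlbeans-*.jar"))
    if leftovers or xmlbeans != [NEW_XMLBEANS_JAR]:
        raise RuntimeError(
            "verification failed: leftovers=%s xmlbeans=%s" % (leftovers, xmlbeans)
        )
    return sorted(p.name for p in solr_lib.iterdir())
```

A call might look like replace_poi_jars("solr-4.9.0/contrib/extraction/lib", "poi-3.10.1"), where both paths depend on where the archives were unpacked. Stop Solr before running it and take a backup of the lib folder first.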

If you just want to disable extraction of Microsoft Office documents, delete the files above and don't replace them. "Solr Cell" will automatically detect this and disable Microsoft Office document extraction.

Coming versions of Apache Solr will have the updated libraries bundled.

30 June 2014 - Apache Solr Ref Guide for 4.9 Available

The Lucene PMC is pleased to announce that there is a new version of the Solr Reference Guide for Solr 4.9.

The 408 page PDF serves as the definitive user's manual for Solr 4.9. It can be downloaded from the Apache mirror network: https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/.

25 June 2014 - Apache Solr 4.9.0 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.9.0

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.9.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.9.0 Release Highlights:

  • Numerous optimizations for doc values search-time performance

  • Allow a client application to request the minimum achieved replication factor for an update request (single or batch) by sending an optional parameter "min_rf".

  • Query re-ranking support with the new ReRankingQParserPlugin.

  • A new [child ...] DocTransformer for optionally including Block-Join descendant documents inline in the results of a search.

  • A new (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

20 May 2014 - Apache Solr 4.8.1 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.8.1

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.8.1 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Solr 4.8.1 includes 10 bug fixes, as well as Lucene 4.8.1 and its bug fixes.

See the CHANGES.txt file included with the release for a full list of details.

2 May 2014 - Apache Solr Ref Guide for 4.8 Available

The Lucene PMC is pleased to announce that there is a new version of the Solr Reference Guide available for Solr 4.8.

The 396 page PDF serves as the definitive user's manual for Solr 4.8. It can be downloaded from the Apache mirror network: https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/

28 April 2014 - Apache Solr 4.8.0 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.8.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.8.0 Release Highlights:

  • Apache Solr now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Solr).

  • Apache Solr is fully compatible with Java 8.

  • <fields> and <types> tags have been deprecated from schema.xml. There is no longer any reason to keep them in the schema file, and they may be safely removed. This allows intermixing of <fieldType>, <field> and <copyField> definitions if desired.

  • The new {!complexphrase} query parser supports wildcards, ORs etc. inside Phrase Queries.

  • New Collections API CLUSTERSTATUS action reports the status of collections, shards, and replicas, and also lists collection aliases and cluster properties.

  • Added managed synonym and stopword filter factories, which enable synonym and stopword lists to be dynamically managed via REST API.

  • JSON updates now support nested child documents, enabling {!child} and {!parent} block join queries.

  • Added ExpandComponent to expand results collapsed by the CollapsingQParserPlugin, as well as the parent/child relationship of nested child documents.

  • Long-running Collections API tasks can now be executed asynchronously; the new REQUESTSTATUS action provides status.

  • Added a hl.qparser parameter to allow you to define a query parser for hl.q highlight queries.

  • In Solr single-node mode, cores can now be created using named configsets.

  • New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the "TTL" expression, as well as automatically deleting expired documents on a periodic basis.

Solr 4.8.0 also includes many other new features as well as numerous optimizations and bugfixes of the corresponding Apache Lucene release.

15 April 2014 - Apache Solr 4.7.2 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.7.2

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.7.2 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Solr 4.7.2 includes 2 bug fixes, as well as Lucene 4.7.2 and its bug fixes.

See the CHANGES.txt file included with the release for a full list of details.

02 April 2014 - Apache Solr 4.7.1 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.7.1

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.7.1 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Solr 4.7.1 includes 28 bug fixes and one new configuration setting, as well as Lucene 4.7.1 and its bug fixes.

See the CHANGES.txt file included with the release for a full list of details.

12 March 2014 - Apache Solr 4.8 will require Java 7

The Apache Solr committers voted by a large majority to require Java 7 for the next minor release of Apache Solr (version 4.8)!

The next release will also contain some improvements for Java 7:

  • Better file handling (especially on Windows) in the directory implementations. Files can now be deleted on Windows even while the index is still open, as has always been possible in Unix environments (delete-on-last-close semantics).

  • Speed improvements in sorting comparators: Sorting now uses Java 7's own comparators for integer and long sorts, which are highly optimized by the Hotspot VM.

If you want to stay up-to-date with Lucene and Solr, you should upgrade your infrastructure to Java 7. Please be aware that you must use at least Java 7u1. The recommended version at the moment is Java 7u25. Later versions like 7u40, 7u45, ... have a bug causing index corruption. Ideally use the Java 7u60 prerelease, which has fixed this bug. Once 7u60 is out, this will be the recommended version. In addition, Oracle/BEA JRockit is no longer available for Java 7; use the official Oracle Java 7 instead. JRockit never worked correctly with Lucene/Solr (it caused index corruption), so this should not be an issue. Please also review our list of JVM bugs: http://wiki.apache.org/lucene-java/JavaBugs

EDIT (as of 15 April 2014): The recently released Java 7u55 fixes the above bug causing index corruption. This version is now the recommended version for running Apache Solr.

5 March 2014 - Apache Solr Ref Guide for 4.7 Available

The Lucene PMC is pleased to announce that there is a new version of the Solr Reference Guide available for Solr 4.7.

The 395 page PDF serves as the definitive user's manual for Solr 4.7. It can be downloaded from the Apache mirror network: https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/

26 February 2014 - Apache Solr 4.7.0 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.7

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.7 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.7 Release Highlights:

  • A new migrate collection API to split all documents with a route key into another collection.

  • Added support for tri-level compositeId routing.

  • Admin UI - Added a new Files conf directory browser/file viewer.

  • Add a QParserPlugin for Lucene's SimpleQueryParser.

  • Suggest improvements: a new SuggestComponent that fully utilizes the Lucene suggester module; queries can now use multiple suggesters; Lucene's FreeTextSuggester and BlendedInfixSuggester are now supported.

  • New cursorMark request param for efficient deep paging of sorted result sets. See http://s.apache.org/cursorpagination

  • Add a Solr contrib that allows for building Solr indexes via Hadoop's MapReduce.

  • Upgrade to Spatial4j 0.4. Various new options are now exposed automatically for an RPT field type. See Spatial4j CHANGES & javadocs. https://github.com/spatial4j/spatial4j/blob/master/CHANGES.md

  • SSL support for SolrCloud.

Solr 4.7 also includes many other new features as well as numerous optimizations and bugfixes.
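The cursorMark parameter listed in the highlights above can be illustrated with a short paging loop. This is only a sketch under stated assumptions: fetch_page is a hypothetical callable (not part of Solr or any client library) that sends the given request parameters to a Solr select handler and returns the parsed JSON response as a dict. The loop starts with cursorMark=*, passes each response's nextCursorMark into the next request, and stops when the cursor no longer changes. The sort is assumed to include the uniqueKey field (here assumed to be named "id"), which cursor paging requires.

```python
def iter_all_docs(fetch_page, query="*:*", sort="id asc", rows=100):
    """Yield every matching document by following Solr's cursorMark.

    fetch_page is a hypothetical callable: it takes a dict of request
    parameters and returns the parsed JSON response as a dict.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        response = fetch_page(
            {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        )
        for doc in response["response"]["docs"]:
            yield doc
        next_cursor = response["nextCursorMark"]
        # An unchanged cursor means the result set is exhausted.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

Because each request carries only the cursor value, the client never needs a growing start offset, which is what makes deep paging over large sorted result sets efficient.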

28 January 2014 - Apache Solr 4.6.1 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.6.1

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.6.1 contains nearly 30 bug fixes. The release is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

2 December 2013 - Apache Solr Reference Guide 4.6

The Lucene PMC is pleased to announce the release of the Apache Solr Reference Guide for Solr 4.6.

This 347 page PDF serves as the definitive user's manual for Solr 4.6.

The Solr Reference Guide is available for download from the Apache mirror network.

24 November 2013 - Apache Solr 4.6 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.6

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.6 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.6 Release Highlights:

  • Many improvements and enhancements for shard splitting options
  • New AnalyzingInfixLookupFactory to leverage the AnalyzingInfixSuggester
  • New CollapsingQParserPlugin for high performance field collapsing on high cardinality fields
  • New SolrJ APIs for collection management
  • New DocBasedVersionConstraintsProcessorFactory providing support for user configured doc-centric versioning rules
  • New default index format: Lucene46Codec
  • New EnumField type

Solr 4.6 also includes many other new features as well as numerous optimizations and bugfixes.

24 October 2013 - Apache Solr 4.5.1 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.5.1

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.5.1 contains a handful of bug fixes, including 2 that are considered quite severe. The release is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

5 October 2013 - Apache Solr 4.5 and Apache Solr Reference Guide 4.5 Available

The Lucene PMC is pleased to announce the release of Apache Solr 4.5 and the Apache Solr Reference Guide 4.5

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.5 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

The Solr Reference Guide, a 338 page PDF that serves as the definitive user's manual for Solr 4.5, is available for download from the Apache mirror network:

https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/

Solr 4.5 Release Highlights:

  • Custom sharding support, including the ability to shard by field.
  • DocValue improvements: single valued fields no longer require a default value, allowing dynamicFields to contain doc values, as well as sortMissingFirst and sortMissingLast on docValue fields.
  • Ability to store solr.xml in ZooKeeper.
  • Multithreaded faceting.
  • CloudSolrServer can now route updates directly to the appropriate shard leader.

Solr 4.5 also includes many other new features as well as numerous optimizations and bugfixes.