Apache Mahout - Overview
Apache Lucene Mahout
Mahout's goal is to build scalable, Apache-licensed machine learning libraries. Initially, we are interested in implementing the ten machine learning algorithms detailed in http://www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf using Hadoop. While these algorithms are our initial focus, we welcome contributions of other machine learning approaches.
Interested in helping? See the Wiki or send us an email. Please note that we are just getting off the ground, so be patient as we get the various infrastructure pieces in place.
Mahout News
07 April 2009 - Apache Mahout 0.1 released
The Apache Lucene project is pleased to announce the release of Apache Mahout 0.1. Apache Mahout is a subproject of Apache Lucene with the goal of delivering scalable machine learning algorithm implementations under the Apache license. The first public release includes implementations for clustering, classification, collaborative filtering and evolutionary programming.
Highlights include:
- Taste Collaborative Filtering
- Several distributed clustering implementations: k-Means, Fuzzy k-Means, Dirichlet, Mean-Shift and Canopy
- Distributed Naive Bayes and Complementary Naive Bayes classification implementations
- Distributed fitness function implementation for the Watchmaker evolutionary programming library
- Most implementations are built on top of Apache Hadoop (http://hadoop.apache.org) for scalability
Details on what's included can be found in the release notes.
Downloads are available from the Apache Mirrors.
09 February 2009 - Lucene at ApacheCon Europe 2009 in Amsterdam
Lucene will be extremely well represented at ApacheCon Europe 2009 in Amsterdam, Netherlands this March 23-27, 2009:
- Lucene Boot Camp - A two-day training session, March 23rd & 24th
- Solr Boot Camp - A one-day training session, March 24th
- Introducing Apache Mahout - Grant Ingersoll. March 25th @ 10:30
- Lucene/Solr Case Studies - Erik Hatcher. March 25th @ 11:30
- Advanced Indexing Techniques with Apache Lucene - Michael Busch. March 25th @ 14:00
- Apache Solr - A Case Study - Uri Boness. March 26th @ 17:30
- Best of breed - httpd, forrest, solr and droids - Thorsten Scherler. March 27th @ 17:30
- Apache Droids - an intelligent standalone robot framework - Thorsten Scherler. March 26th @ 15:00
22 July 2008 - Lucene at ApacheCon New Orleans
Lucene will be extremely well represented at ApacheCon US 2008 in New Orleans this November 3-7, 2008:
- Lucene Boot Camp - A two-day training session, November 3rd & 4th
- Solr Boot Camp - A one-day training session, November 4th
- An entire day of Lucene sessions, including a talk on Mahout by Mahout committer Grant Ingersoll, on November 5th
4 April 2008 - Mahout - Now with more Taste!
We are pleased to announce that the Taste Collaborative Filtering project (Taste on SourceForge) has donated its codebase to the Mahout project. In the coming weeks and months we will work to bring it into Mahout and then make it run on Hadoop, bringing truly large-scale collaborative filtering capabilities to our users.
16 March 2008 - Google Summer Of Code Projects
The ASF is in the process of creating projects for Google's annual Summer of Code program. Mahout has a number of people willing to be mentors, so if you are a student interested in working on machine learning algorithms using Hadoop, please check out the ASF Summer of Code wiki page.
22 January 2008 - Mahout launches
The Lucene PMC announces the creation of the Mahout subproject.



