org.apache.lucene.classification
Class SimpleNaiveBayesClassifier

java.lang.Object
  extended by org.apache.lucene.classification.SimpleNaiveBayesClassifier
All Implemented Interfaces:
Classifier<BytesRef>

public class SimpleNaiveBayesClassifier
extends Object
implements Classifier<BytesRef>

A simplistic Lucene-based naive Bayes classifier. See http://en.wikipedia.org/wiki/Naive_Bayes_classifier

WARNING: This API is experimental and might change in incompatible ways in the next release.
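
For readers unfamiliar with the technique, the following is a minimal, self-contained sketch of multinomial naive Bayes with add-one smoothing. It is an illustration only and does not reproduce this class's implementation: the class NaiveBayesSketch, its add and classify methods, the regex tokenizer standing in for an Analyzer, and the sample sentences are all hypothetical. It does mirror the contract described on this page: the model must be trained before it can classify, and classification picks the class with the highest score.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of naive Bayes; not part of Lucene.
public class NaiveBayesSketch {
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Integer> tokensPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Stand-in for an Analyzer: lowercase, then split on non-letters.
    public static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z]+");
    }

    // "Training": record one labeled document in the per-class counts.
    public void add(String text, String label) {
        totalDocs++;
        docsPerClass.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
            termCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            counts.merge(token, 1, Integer::sum);
            tokensPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Returns the label maximizing log P(class) + sum of log P(token | class),
    // or null if nothing has been trained yet.
    public String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            double score = Math.log((double) docsPerClass.get(label) / totalDocs);
            Map<String, Integer> counts = termCounts.get(label);
            int classTokens = tokensPerClass.getOrDefault(label, 0);
            for (String token : tokenize(text)) {
                if (token.isEmpty()) {
                    continue;
                }
                int count = counts.getOrDefault(token, 0);
                // Add-one (Laplace) smoothing avoids log(0) for unseen terms.
                score += Math.log((count + 1.0) / (classTokens + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.add("the match ended with a late goal", "sports");
        nb.add("the team won the cup final", "sports");
        nb.add("parliament passed the budget vote", "politics");
        nb.add("the minister lost the election", "politics");
        System.out.println(nb.classify("a goal in the final")); // prints "sports"
    }
}
```

Working in log space keeps the product of many small probabilities from underflowing. In the Lucene class, the counts come from the index supplied to train() rather than from in-memory maps.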

Constructor Summary
SimpleNaiveBayesClassifier()
          Creates a new NaiveBayes classifier.
 
Method Summary
 ClassificationResult<BytesRef> assignClass(String inputDocument)
          Assigns a class (with a score) to the given text String.
 void train(AtomicReader atomicReader, String textFieldName, String classFieldName, Analyzer analyzer)
          Trains the classifier using the underlying Lucene index.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleNaiveBayesClassifier

public SimpleNaiveBayesClassifier()
Creates a new NaiveBayes classifier. Note that you must call train() before you can classify any documents.

Method Detail

train

public void train(AtomicReader atomicReader,
                  String textFieldName,
                  String classFieldName,
                  Analyzer analyzer)
           throws IOException
Trains the classifier using the underlying Lucene index.

Specified by:
train in interface Classifier<BytesRef>
Parameters:
atomicReader - the reader to use to access the Lucene index
textFieldName - the name of the field whose text is used to compare documents
classFieldName - the name of the field containing the class assigned to documents
analyzer - the analyzer used to tokenize and filter the unseen text
Throws:
IOException - If there is a low-level I/O error.

assignClass

public ClassificationResult<BytesRef> assignClass(String inputDocument)
                                           throws IOException
Assigns a class (with a score) to the given text String.

Specified by:
assignClass in interface Classifier<BytesRef>
Parameters:
inputDocument - a String containing text to be classified
Returns:
a ClassificationResult holding the assigned class (of type BytesRef for this classifier) and its score
Throws:
IOException - If there is a low-level I/O error.


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.