org.apache.lucene.classification
Class KNearestNeighborClassifier

java.lang.Object
  extended by org.apache.lucene.classification.KNearestNeighborClassifier
All Implemented Interfaces:
Classifier<BytesRef>

public class KNearestNeighborClassifier
extends Object
implements Classifier<BytesRef>

A k-nearest neighbor (kNN) classifier (see http://en.wikipedia.org/wiki/K-nearest_neighbors) based on MoreLikeThis.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Constructor Summary
KNearestNeighborClassifier(int k)
          Creates a Classifier using the kNN algorithm.
 
Method Summary
 ClassificationResult<BytesRef> assignClass(String text)
          Assign a class (with score) to the given text String
 void train(AtomicReader atomicReader, String textFieldName, String classFieldName, Analyzer analyzer)
          Train the classifier using the underlying Lucene index
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KNearestNeighborClassifier

public KNearestNeighborClassifier(int k)
Creates a Classifier using the kNN algorithm.

Parameters:
k - the number of nearest neighbors to consider when assigning a class
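
To illustrate what the k parameter controls, here is a minimal, self-contained sketch of k-nearest-neighbor voting in plain Java. It is not the code of this class and does not use MoreLikeThis: the KnnVoteSketch and Neighbor names, the similarity values, and the class labels are all hypothetical, and combining neighbors by a plain majority vote is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of k-nearest-neighbor voting; not Lucene code. */
final class KnnVoteSketch {

  /** An already-labeled document: its class label and its similarity to the query text. */
  static final class Neighbor {
    final String label;
    final double similarity;

    Neighbor(String label, double similarity) {
      this.label = label;
      this.similarity = similarity;
    }
  }

  /**
   * Returns the label that occurs most often among the k most similar candidates,
   * or null if there are no candidates.
   */
  static String vote(List<Neighbor> candidates, int k) {
    if (k <= 0) {
      throw new IllegalArgumentException("k must be positive");
    }
    // Copy and sort so that the most similar candidates come first.
    List<Neighbor> sorted = new ArrayList<>(candidates);
    sorted.sort(Comparator.comparingDouble((Neighbor n) -> n.similarity).reversed());

    Map<String, Integer> counts = new HashMap<>();
    String best = null;
    int bestCount = 0;
    // Only the top k candidates (or fewer, if the list is shorter) get a vote.
    for (Neighbor n : sorted.subList(0, Math.min(k, sorted.size()))) {
      int count = counts.merge(n.label, 1, Integer::sum);
      // Strictly greater: on a tie, the label that reached the count first wins.
      if (count > bestCount) {
        bestCount = count;
        best = n.label;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    List<Neighbor> candidates = Arrays.asList(
        new Neighbor("sports", 0.9),
        new Neighbor("politics", 0.8),
        new Neighbor("politics", 0.7),
        new Neighbor("sports", 0.2),
        new Neighbor("sports", 0.1));
    System.out.println(vote(candidates, 1)); // only the single closest neighbor votes
    System.out.println(vote(candidates, 3)); // the three closest neighbors vote
  }
}
```

In this sketch a small k follows the single closest match, while a larger k lets more, and more distant, neighbors take part in the vote, so the assigned class can change with k.
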
Method Detail

assignClass

public ClassificationResult<BytesRef> assignClass(String text)
                                           throws IOException
Assign a class (with score) to the given text String

Specified by:
assignClass in interface Classifier<BytesRef>
Parameters:
text - a String containing text to be classified
Returns:
a ClassificationResult holding the assigned class (of type BytesRef) and its score
Throws:
IOException - If there is a low-level I/O error.

train

public void train(AtomicReader atomicReader,
                  String textFieldName,
                  String classFieldName,
                  Analyzer analyzer)
           throws IOException
Train the classifier using the underlying Lucene index

Specified by:
train in interface Classifier<BytesRef>
Parameters:
atomicReader - the reader to use to access the Lucene index
textFieldName - the name of the field used to compare documents
classFieldName - the name of the field containing the class assigned to documents
analyzer - the analyzer used to tokenize / filter the unseen text
Throws:
IOException - If there is a low-level I/O error.


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.