java.lang.Object
  org.apache.lucene.search.similar.SimilarityQueries
public final class SimilarityQueries
Simple similarity measures.
See Also: MoreLikeThis

| Method Summary | |
|---|---|
| static Query | formSimilarQuery(String body, Analyzer a, String field, Set stop) Simple similarity query generators. |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Method Detail | 
|---|
public static Query formSimilarQuery(String body,
                                     Analyzer a,
                                     String field,
                                     Set stop)
                              throws IOException
Simple similarity query generators. Use the returned query with your IndexSearcher to search for similar docs.
 The only caveat is that the first hit returned should be your source document, which
 you will then need to ignore.
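 As a minimal sketch of that last step, the plain-Java helper below drops the source document from a ranked result list. It does not use the Lucene API: the class name, the method name, and the use of integer document ids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a stand-in for post-processing search hits, not Lucene code.
final class SkipSourceSketch {

    // Returns the ranked ids with the source document removed, order preserved.
    static List<Integer> withoutSource(List<Integer> rankedDocIds, int sourceDocId) {
        List<Integer> similar = new ArrayList<>();
        for (int id : rankedDocIds) {
            if (id != sourceDocId) {
                similar.add(id); // keep every hit except the source itself
            }
        }
        return similar;
    }
}
```

 Filtering by id rather than always discarding the first hit is a cautious choice here, since the text only says the source document "should be" the first hit.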
 
 So, if you have a code fragment like this:
 
 
 Query q = formSimilarQuery( "I use Lucene to search fast. Fast searchers are good", new StandardAnalyzer(), "contents", null);
 
 
 The query returned, in string form, will be '(i use lucene to search fast searchers are good)'.
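 To make the unique-word step concrete, here is a rough plain-Java sketch that reproduces the string form shown above. It is not the Lucene implementation: the class and method names are hypothetical, and lower-casing plus splitting on non-alphanumeric characters is only a crude stand-in for whatever the supplied Analyzer actually does.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a plain-Java stand-in for the analysis step, not Lucene code.
final class UniqueTermsSketch {

    // Lower-cases the text, splits on non-alphanumeric characters (an assumed,
    // simplified tokenizer), and keeps each term once in first-seen order.
    static Set<String> uniqueTerms(String body, Set<String> stop) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : body.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split can yield a leading empty string
            }
            if (stop != null && stop.contains(token)) {
                continue; // the stop set is optional and may be null
            }
            terms.add(token); // the set silently ignores repeated terms
        }
        return terms;
    }

    // Renders the terms in the parenthesised, space-separated form used above.
    static String asQueryString(Set<String> terms) {
        return "(" + String.join(" ", terms) + ")";
    }

    public static void main(String[] args) {
        String body = "I use Lucene to search fast. Fast searchers are good";
        // prints (i use lucene to search fast searchers are good)
        System.out.println(asQueryString(uniqueTerms(body, null)));
    }
}
```

 The second "Fast" is dropped because the term has already been seen, which is why "fast" appears only once in the resulting query.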
 
The philosophy behind this method is "two documents are similar if they share lots of words". Note that behind the scenes, Lucene's scoring algorithm will tend to give two documents a higher similarity score if they share more uncommon words.
 This method is fail-safe in that if a long 'body' is passed in and BooleanQuery.add() (used internally) throws BooleanQuery.TooManyClauses, the query as it is will be returned.
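 The fail-safe behaviour can be pictured with the following sketch, which models a clause limit in plain Java. The BoundedQuery class and its TooManyClauses exception are hypothetical stand-ins written for this example, not the Lucene classes of similar name, and the limit passed in is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models "return the query as it is" when a clause limit is hit.
final class FailSafeSketch {

    // Hypothetical stand-in for BooleanQuery.TooManyClauses.
    static final class TooManyClauses extends RuntimeException {}

    // Hypothetical stand-in for a query that refuses clauses beyond a limit.
    static final class BoundedQuery {
        private final int maxClauses;
        private final List<String> clauses = new ArrayList<>();

        BoundedQuery(int maxClauses) {
            this.maxClauses = maxClauses;
        }

        void add(String term) {
            if (clauses.size() >= maxClauses) {
                throw new TooManyClauses();
            }
            clauses.add(term);
        }

        List<String> clauses() {
            return clauses;
        }
    }

    // Adds terms until the limit is reached, then returns what was built so far.
    static BoundedQuery build(Iterable<String> terms, int maxClauses) {
        BoundedQuery query = new BoundedQuery(maxClauses);
        try {
            for (String term : terms) {
                query.add(term);
            }
        } catch (TooManyClauses e) {
            // Fail-safe: stop adding and keep the partial query instead of propagating.
        }
        return query;
    }
}
```

 With a limit of two and four input terms, the returned query holds only the first two terms and no exception reaches the caller.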
Parameters:
body - the body of the document you want to find similar documents to
a - the analyzer to use to parse the body
field - the field you want to search on, probably something like "contents" or "body"
stop - optional set of stop words to ignore
Throws:
IOException - this can't happen...