org.apache.lucene.search.suggest.analyzing
Class AnalyzingInfixSuggester

java.lang.Object
  extended by org.apache.lucene.search.suggest.Lookup
      extended by org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester
All Implemented Interfaces:
Closeable

public class AnalyzingInfixSuggester
extends Lookup
implements Closeable

Analyzes the input text and then suggests matches based on prefix matches against any token in the indexed text. The matching tokens are also highlighted.

This suggester uses an ordinary Lucene index. It supports payloads and records them in a BinaryDocValues field. Matches are sorted only by the suggest weight; a blended score + weight sort is not yet supported. As a result, this suggester works best when there is a strong a priori ranking of all the suggestions.
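
The matching and ranking behavior described above can be illustrated with a minimal plain-Java sketch. It is not Lucene's implementation: the `InfixMatchSketch` class, its `Entry` type, and the whitespace/lowercase "analysis" are all assumptions made for illustration. It only shows the idea that an entry matches when any of its tokens starts with the typed key, and that hits are ordered purely by weight.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Lucene.
public class InfixMatchSketch {

    /** A suggestion with a precomputed weight (the "a priori" ranking). */
    public static final class Entry {
        public final String text;
        public final long weight;

        public Entry(String text, long weight) {
            this.text = text;
            this.weight = weight;
        }
    }

    /**
     * Returns up to num entry texts in which at least one token starts with
     * the key, ordered by descending weight only (no relevance score).
     */
    public static List<String> suggest(List<Entry> entries, String key, int num) {
        String prefix = key.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : entries) {
            // Stand-in for analysis: lowercase and split on whitespace.
            for (String token : e.text.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (token.startsWith(prefix)) {
                    hits.add(e);
                    break; // one matching token is enough
                }
            }
        }
        hits.sort(Comparator.comparingLong((Entry e) -> e.weight).reversed());
        List<String> out = new ArrayList<>();
        for (Entry e : hits.subList(0, Math.min(num, hits.size()))) {
            out.add(e.text);
        }
        return out;
    }
}
```

With entries "lend me your ear" (weight 8) and "a penny saved is a penny earned" (weight 10), the key "ear" matches both, and the heavier entry is returned first even though the other contains the exact word.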


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.search.suggest.Lookup
Lookup.LookupPriorityQueue, Lookup.LookupResult
 
Field Summary
static int DEFAULT_MIN_PREFIX_CHARS
          Default minimum number of leading characters before PrefixQuery is used (4).
protected  IndexSearcher searcher
          IndexSearcher used for lookups.
protected static String TEXT_FIELD_NAME
          Field name used for the indexed text.
 
Fields inherited from class org.apache.lucene.search.suggest.Lookup
CHARSEQUENCE_COMPARATOR
 
Constructor Summary
AnalyzingInfixSuggester(Version matchVersion, File indexPath, Analyzer analyzer)
          Create a new instance, loading from a previously built directory, if it exists.
AnalyzingInfixSuggester(Version matchVersion, File indexPath, Analyzer indexAnalyzer, Analyzer queryAnalyzer, int minPrefixChars)
          Create a new instance, loading from a previously built directory, if it exists.
 
Method Summary
protected  void addPrefixMatch(StringBuilder sb, String surface, String analyzed, String prefixToken)
          Appends a matched prefix token to the provided StringBuilder.
protected  void addWholeMatch(StringBuilder sb, String surface, String analyzed)
          Appends the whole matched token to the provided StringBuilder.
 void build(TermFreqIterator iter)
          Builds up a new internal Lookup representation based on the given TermFreqIterator.
 void close()
           
protected  Query finishQuery(BooleanQuery in, boolean allTermsRequired)
          Subclass can override this to tweak the Query before searching.
protected  Directory getDirectory(File path)
          Subclass can override to choose a specific Directory implementation.
protected  IndexWriterConfig getIndexWriterConfig(Version matchVersion, Analyzer indexAnalyzer)
          Override this to customize index settings, e.g. which codec to use.
protected  Query getLastTokenQuery(String token)
          This is called if the last token isn't ended (e.g. user did not type a space after it).
 boolean load(InputStream out)
          Discard current lookup data and load it from a previously saved copy.
 List<Lookup.LookupResult> lookup(CharSequence key, boolean onlyMorePopular, int num)
          Look up a key and return possible completions for this key.
 List<Lookup.LookupResult> lookup(CharSequence key, int num, boolean allTermsRequired, boolean doHighlight)
          Retrieve suggestions, specifying whether all terms must match (allTermsRequired) and whether the hits should be highlighted (doHighlight).
 boolean store(OutputStream out)
          Persist the constructed lookup data to a directory.
 
Methods inherited from class org.apache.lucene.search.suggest.Lookup
build
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TEXT_FIELD_NAME

protected static final String TEXT_FIELD_NAME
Field name used for the indexed text.

See Also:
Constant Field Values

searcher

protected IndexSearcher searcher
IndexSearcher used for lookups.


DEFAULT_MIN_PREFIX_CHARS

public static final int DEFAULT_MIN_PREFIX_CHARS
Default minimum number of leading characters before PrefixQuery is used (4).

See Also:
Constant Field Values

Constructor Detail

AnalyzingInfixSuggester

public AnalyzingInfixSuggester(Version matchVersion,
                               File indexPath,
                               Analyzer analyzer)
                        throws IOException
Create a new instance, loading from a previously built directory, if it exists.

Throws:
IOException

AnalyzingInfixSuggester

public AnalyzingInfixSuggester(Version matchVersion,
                               File indexPath,
                               Analyzer indexAnalyzer,
                               Analyzer queryAnalyzer,
                               int minPrefixChars)
                        throws IOException
Create a new instance, loading from a previously built directory, if it exists.

Parameters:
minPrefixChars - Minimum number of leading characters before PrefixQuery is used (default 4). Prefixes shorter than this are indexed as character ngrams (increasing index size but making lookups faster).
Throws:
IOException
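
The trade-off behind minPrefixChars can be sketched in plain Java. This is an illustration under stated assumptions, not Lucene's code: the class `MinPrefixCharsSketch` and both method names are hypothetical, and the exact ngram lengths Lucene indexes may differ from what is shown. The sketch assumes that leading substrings shorter than minPrefixChars are stored at index time, so a short typed prefix can be answered by an exact lookup instead of a prefix scan.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
public class MinPrefixCharsSketch {

    /**
     * Leading substrings of the token with lengths 1 .. minPrefixChars - 1,
     * capped at the token length. Storing these costs index space.
     */
    public static List<String> leadingNGrams(String token, int minPrefixChars) {
        List<String> grams = new ArrayList<>();
        int max = Math.min(token.length(), minPrefixChars - 1);
        for (int len = 1; len <= max; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    /**
     * True when the typed prefix is short enough to be looked up directly
     * among the stored ngrams; false when a prefix scan would be needed.
     */
    public static boolean answeredFromNGrams(String typedPrefix, int minPrefixChars) {
        return typedPrefix.length() < minPrefixChars;
    }
}
```

With the default of 4, the token "penny" would contribute the ngrams "p", "pe", and "pen" in this sketch, while a typed prefix of four or more characters would fall through to a prefix scan.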

Method Detail

getIndexWriterConfig

protected IndexWriterConfig getIndexWriterConfig(Version matchVersion,
                                                 Analyzer indexAnalyzer)
Override this to customize index settings, e.g. which codec to use.


getDirectory

protected Directory getDirectory(File path)
                          throws IOException
Subclass can override to choose a specific Directory implementation.

Throws:
IOException

build

public void build(TermFreqIterator iter)
           throws IOException
Description copied from class: Lookup
Builds up a new internal Lookup representation based on the given TermFreqIterator. The implementation might re-sort the data internally.

Specified by:
build in class Lookup
Throws:
IOException

lookup

public List<Lookup.LookupResult> lookup(CharSequence key,
                                        boolean onlyMorePopular,
                                        int num)
Description copied from class: Lookup
Look up a key and return possible completions for this key.

Specified by:
lookup in class Lookup
Parameters:
key - lookup key. Depending on the implementation this may be a prefix, misspelling, or even infix.
onlyMorePopular - return only more popular results
num - maximum number of results to return
Returns:
a list of possible completions, with their relative weight (e.g. popularity)

getLastTokenQuery

protected Query getLastTokenQuery(String token)
                           throws IOException
This is called if the last token isn't ended (e.g. user did not type a space after it). Return an appropriate Query clause to add to the BooleanQuery.

Throws:
IOException
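
The "last token isn't ended" condition can be made concrete with a small plain-Java sketch. It is not Lucene's parsing logic: `LastTokenSketch` and `ParsedKey` are hypothetical names, and splitting on whitespace stands in for real analysis. It assumes that a key ending in whitespace contains only finished tokens, while any other key ends with a token the user may still be typing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
public class LastTokenSketch {

    /** Finished tokens plus the token still being typed, if any. */
    public static final class ParsedKey {
        public final List<String> completeTokens;
        public final String unfinishedToken; // null when the key ends with whitespace

        public ParsedKey(List<String> completeTokens, String unfinishedToken) {
            this.completeTokens = completeTokens;
            this.unfinishedToken = unfinishedToken;
        }
    }

    public static ParsedKey parse(String key) {
        String trimmed = key.trim();
        List<String> tokens = new ArrayList<>();
        if (!trimmed.isEmpty()) {
            tokens.addAll(Arrays.asList(trimmed.split("\\s+")));
        }
        // The last token counts as "ended" only if whitespace follows it.
        boolean lastEnded = key.isEmpty()
                || Character.isWhitespace(key.charAt(key.length() - 1));
        String unfinished = null;
        if (!lastEnded && !tokens.isEmpty()) {
            unfinished = tokens.remove(tokens.size() - 1);
        }
        return new ParsedKey(tokens, unfinished);
    }
}
```

In this sketch, "penny sa" yields the finished token "penny" and the unfinished token "sa", which is the case where a prefix-style clause would be needed; "penny saved " yields two finished tokens and no unfinished one.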

lookup

public List<Lookup.LookupResult> lookup(CharSequence key,
                                        int num,
                                        boolean allTermsRequired,
                                        boolean doHighlight)
Retrieve suggestions, specifying whether all terms must match (allTermsRequired) and whether the hits should be highlighted (doHighlight).
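
The effect of allTermsRequired can be sketched as the difference between requiring every query token and accepting any one of them. The following plain-Java example is only an illustration: `AllTermsSketch` is a hypothetical class, it matches whole tokens only, and it does not reproduce how Lucene builds or scores the underlying query.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
public class AllTermsSketch {

    /**
     * With allTermsRequired, every query token must occur in the text;
     * otherwise a single occurring token is enough.
     */
    public static boolean matches(String text, List<String> queryTokens,
                                  boolean allTermsRequired) {
        // Stand-in for analysis: lowercase and split on whitespace.
        Set<String> indexed = new HashSet<>(
                Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
        int found = 0;
        for (String token : queryTokens) {
            if (indexed.contains(token.toLowerCase(Locale.ROOT))) {
                found++;
            }
        }
        return allTermsRequired ? found == queryTokens.size() : found > 0;
    }
}
```

For the text "a penny saved" and the query tokens "penny" and "earned", the sketch reports a match only when allTermsRequired is false.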


finishQuery

protected Query finishQuery(BooleanQuery in,
                            boolean allTermsRequired)
Subclass can override this to tweak the Query before searching.


addWholeMatch

protected void addWholeMatch(StringBuilder sb,
                             String surface,
                             String analyzed)
Appends the whole matched token to the provided StringBuilder.


addPrefixMatch

protected void addPrefixMatch(StringBuilder sb,
                              String surface,
                              String analyzed,
                              String prefixToken)
Appends a matched prefix token to the provided StringBuilder.

Parameters:
sb - StringBuilder to append to
surface - The fragment of the surface form (indexed during build(org.apache.lucene.search.spell.TermFreqIterator)) corresponding to this match
analyzed - The analyzed token that matched
prefixToken - The prefix of the token that matched
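
How the two highlighting hooks relate to their parameters can be shown with a stand-alone sketch. The class `HighlightSketch` is hypothetical, and the `<b>` markup is an assumption chosen for illustration; a subclass overriding addWholeMatch and addPrefixMatch could emit any markup. The methods are static here so the example runs without Lucene.

```java
// Hypothetical illustration only; not part of Lucene.
public class HighlightSketch {

    /** Wraps the entire surface fragment in the (assumed) highlight markup. */
    public static void addWholeMatch(StringBuilder sb, String surface, String analyzed) {
        sb.append("<b>").append(surface).append("</b>");
    }

    /**
     * Highlights only the leading part of the surface fragment that
     * corresponds to the typed prefix, then appends the remainder as-is.
     * The analyzed form is unused in this sketch.
     */
    public static void addPrefixMatch(StringBuilder sb, String surface,
                                      String analyzed, String prefixToken) {
        // Guard against a prefix longer than the surface fragment.
        int n = Math.min(prefixToken.length(), surface.length());
        sb.append("<b>").append(surface, 0, n).append("</b>");
        sb.append(surface, n, surface.length());
    }
}
```

Note that the sketch highlights by length only, so it relies on the surface fragment and the analyzed prefix lining up character for character; analysis that changes token length (stemming, folding) would need a smarter mapping.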

store

public boolean store(OutputStream out)
Description copied from class: Lookup
Persist the constructed lookup data to a directory. Optional operation.

Specified by:
store in class Lookup
Parameters:
out - OutputStream to write the data to.
Returns:
true if successful, false if unsuccessful or not supported.

load

public boolean load(InputStream out)
Description copied from class: Lookup
Discard current lookup data and load it from a previously saved copy. Optional operation.

Specified by:
load in class Lookup
Parameters:
out - the InputStream to load the lookup data from.
Returns:
true if completed successfully, false if unsuccessful or not supported.

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Throws:
IOException


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.