org.apache.lucene.search.suggest.fst
Class FSTLookup

java.lang.Object
  extended by org.apache.lucene.search.suggest.Lookup
      extended by org.apache.lucene.search.suggest.fst.FSTLookup

public class FSTLookup
extends Lookup

Finite-state-automaton-based implementation of the Lookup query suggestion/autocomplete interface.

Implementation details

The construction step in build(TermFreqIterator) works as follows:

At runtime, in lookup(String, boolean, int), the automaton is utilized as follows:

Runtime behavior and performance characteristics

The algorithm described above is optimized for finding suggestions for short prefixes in top-weights-first order. This is probably the most common use case: it allows suggestions to be presented early and sorts them by global frequency (and then alphabetically).

If there is an exact match in the automaton, it is returned first in the results list (even with by-weight sorting).

Note that the maximum lookup time for any prefix is the time needed to descend to the subtree, plus the time to traverse the subtree up to the number of requested suggestions (because they are already presorted by weight at the root level and alphabetically at any node level).
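
The bound above can be illustrated with a small stand-in for the automaton. The sketch below is not the FSTLookup implementation: it assumes weights have already been reduced to a small number of integer buckets, and it uses one sorted set per bucket in place of a root arc and its subtree. The class BucketedPrefixSketch and its methods are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for the automaton: one sorted term set per weight bucket. */
final class BucketedPrefixSketch {
    // buckets.get(i) holds the terms whose discretized weight is i (higher = more popular).
    private final List<TreeSet<String>> buckets = new ArrayList<TreeSet<String>>();

    BucketedPrefixSketch(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new TreeSet<String>()); // natural String order = UTF-16 char order
        }
    }

    void add(String term, int bucket) {
        buckets.get(bucket).add(term);
    }

    /** Collects at most num terms starting with prefix, highest bucket first. */
    List<String> lookup(String prefix, int num) {
        List<String> out = new ArrayList<String>();
        for (int b = buckets.size() - 1; b >= 0 && out.size() < num; b--) {
            // "Descend" to the first term >= prefix, then walk in sorted order.
            for (String term : buckets.get(b).tailSet(prefix, true)) {
                if (!term.startsWith(prefix) || out.size() >= num) {
                    break; // left the prefix's range, or collected enough
                }
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BucketedPrefixSketch s = new BucketedPrefixSketch(3);
        s.add("car", 2);
        s.add("cat", 0);
        s.add("card", 1);
        s.add("care", 2);
        s.add("dog", 2);
        System.out.println(s.lookup("ca", 3)); // prints [car, care, card]
    }
}
```

In this model the work per lookup is one positioning step per bucket plus at most num visited terms, which mirrors the "descend, then traverse up to the requested count" bound stated above.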

To order alphabetically only (no ordering by priorities), use identical term weights for all terms. Alphabetical suggestions are returned even if non-constant weights are used, but the algorithm for doing this is suboptimal.

"Alphabetically" anywhere in the documentation above means UTF-16 codepoint order, nothing else.
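
As a concrete illustration, and assuming that this order means comparison of raw UTF-16 char values (which is what String.compareTo does in Java), the result differs both from dictionary order and from true Unicode code point order. The class name Utf16OrderSketch is hypothetical.

```java
import java.util.Arrays;

final class Utf16OrderSketch {
    /** Sorts a copy of the input using String's natural (UTF-16 char value) order. */
    static String[] sorted(String... terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Uppercase letters sort before lowercase ones: 'Z' (0x5A) < 'a' (0x61).
        System.out.println(Arrays.toString(sorted("apple", "Zebra", "banana")));
        // prints [Zebra, apple, banana]

        // U+1F600 is stored as the surrogate pair D83D DE00, so it sorts before
        // U+FB01 even though its code point is larger.
        String emoji = new String(Character.toChars(0x1F600));
        String ligature = "\uFB01";
        System.out.println(emoji.compareTo(ligature) < 0); // prints true
    }
}
```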


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.search.suggest.Lookup
Lookup.LookupPriorityQueue, Lookup.LookupResult
 
Field Summary
static String FILENAME
          Serialized automaton file name (storage).
 
Constructor Summary
FSTLookup()
           
FSTLookup(int buckets, boolean exactMatchFirst)
           
 
Method Summary
 boolean add(String key, Object value)
          Not implemented.
 void build(TermFreqIterator tfit)
           
 Float get(String key)
          Get the (approximated) weight of a single key (if there is a perfect match for it in the automaton).
 boolean load(File storeDir)
          Deserialization from disk.
 List<Lookup.LookupResult> lookup(String key, boolean onlyMorePopular, int num)
          Looks up autocomplete suggestions for key.
 boolean store(File storeDir)
          Serialization to disk.
 
Methods inherited from class org.apache.lucene.search.suggest.Lookup
build
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FILENAME

public static final String FILENAME
Serialized automaton file name (storage).

See Also:
Constant Field Values

Constructor Detail

FSTLookup

public FSTLookup()

FSTLookup

public FSTLookup(int buckets,
                 boolean exactMatchFirst)

Method Detail

build

public void build(TermFreqIterator tfit)
           throws IOException
Builds the automaton from the given terms and their weights.

Specified by:
build in class Lookup
Throws:
IOException

add

public boolean add(String key,
                   Object value)
Not implemented.

Specified by:
add in class Lookup
Parameters:
key - new lookup key
value - value to associate with this key
Returns:
true if new key is added, false if it already exists or operation is not supported.

get

public Float get(String key)
Get the (approximated) weight of a single key (if there is a perfect match for it in the automaton).

Specified by:
get in class Lookup
Parameters:
key - lookup key
Returns:
The approximated weight of the input key, or null if not found.
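
"Approximated" means the returned value may differ from the weight originally supplied. One way such a loss can arise, which the buckets constructor parameter hints at, is discretizing weights into a fixed number of buckets. The sketch below assumes a simple linear scheme. The class WeightBucketSketch, the method bucketOf, and the formula itself are illustrative assumptions, not FSTLookup's actual code.

```java
final class WeightBucketSketch {
    /** Maps weight in [min, max] to a bucket index in [0, buckets - 1] (linear scheme). */
    static int bucketOf(float weight, float min, float max, int buckets) {
        if (max <= min) {
            return 0; // all weights identical: a single bucket suffices
        }
        int b = (int) ((weight - min) / (max - min) * buckets);
        return Math.min(Math.max(b, 0), buckets - 1); // clamp, so weight == max fits
    }

    public static void main(String[] args) {
        // With 10 buckets over [0, 100], weights 41 and 49 become indistinguishable.
        System.out.println(bucketOf(41f, 0f, 100f, 10));  // prints 4
        System.out.println(bucketOf(49f, 0f, 100f, 10));  // prints 4
        System.out.println(bucketOf(100f, 0f, 100f, 10)); // prints 9
    }
}
```

Under such a scheme, only the bucket survives, so two keys with nearby weights would report the same approximated weight.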

lookup

public List<Lookup.LookupResult> lookup(String key,
                                        boolean onlyMorePopular,
                                        int num)
Looks up autocomplete suggestions for key.

Specified by:
lookup in class Lookup
Parameters:
key - The prefix for which suggestions are sought.
onlyMorePopular - Return most popular suggestions first. This is the default behavior for this implementation. Setting it to false has no effect (use constant term weights to sort alphabetically only).
num - At most this number of suggestions will be returned.
Returns:
The suggestions, sorted by their approximated weight first (decreasing) and then alphabetically (UTF-16 codepoint order).
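
The documented result order can be written as a comparator. The sketch below is a standalone model, not code from this class: it uses its own Suggestion type rather than Lookup.LookupResult, and it moves an exact match to the front to mirror the behavior described in the class overview. All names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SuggestionOrderSketch {
    /** Hypothetical stand-in for a lookup result: a key and its approximated weight. */
    static final class Suggestion {
        final String key;
        final float weight;
        Suggestion(String key, float weight) { this.key = key; this.weight = weight; }
    }

    /** Weight descending, then key in String's natural (UTF-16 char value) order. */
    static final Comparator<Suggestion> ORDER = new Comparator<Suggestion>() {
        @Override public int compare(Suggestion a, Suggestion b) {
            int byWeight = Float.compare(b.weight, a.weight);
            return byWeight != 0 ? byWeight : a.key.compareTo(b.key);
        }
    };

    /** Sorts candidates, moves an exact match for key (if any) to the front, keeps num. */
    static List<String> top(List<Suggestion> candidates, String key, int num) {
        List<Suggestion> sorted = new ArrayList<Suggestion>(candidates);
        Collections.sort(sorted, ORDER);
        List<String> keys = new ArrayList<String>();
        for (Suggestion s : sorted) {
            if (s.key.equals(key)) keys.add(0, s.key); else keys.add(s.key);
        }
        return new ArrayList<String>(keys.subList(0, Math.min(num, keys.size())));
    }

    public static void main(String[] args) {
        List<Suggestion> c = new ArrayList<Suggestion>();
        c.add(new Suggestion("foobar", 9f));
        c.add(new Suggestion("foo", 1f));
        c.add(new Suggestion("food", 9f));
        c.add(new Suggestion("fool", 5f));
        System.out.println(top(c, "foo", 3)); // prints [foo, foobar, food]
    }
}
```

With identical weights for every candidate, the comparator falls through to the key comparison, which yields the alphabetical-only ordering obtained by using constant term weights.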

load

public boolean load(File storeDir)
             throws IOException
Deserialization from disk.

Specified by:
load in class Lookup
Parameters:
storeDir - directory where lookup data was stored.
Returns:
true if completed successfully, false if unsuccessful or not supported.
Throws:
IOException - when a fatal I/O error occurs.

store

public boolean store(File storeDir)
              throws IOException
Serialization to disk.

Specified by:
store in class Lookup
Parameters:
storeDir - directory where data can be stored.
Returns:
true if successful, false if unsuccessful or not supported.
Throws:
IOException - when a fatal I/O error occurs.


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.