Deprecated. Use FSTCompletionLookup instead.
@Deprecated public class FSTLookup extends Lookup
Finite state automata based implementation of the Lookup query suggestion/autocomplete interface.
The construction step in build(TermFreqIterator) works as follows:
The weights of the input terms are discretized into a fixed set of values (buckets). Note that this means that minor changes in weights may be lost during automaton construction. In general, this is not a big problem because the "priorities" of completions can be split into a fixed set of classes (even as rough as: very frequent, frequent, baseline, marginal). If you need exact, fine-grained weights, use TSTLookup instead.
Each term is prefixed with its discretized weight. For example, a term abc with a discretized weight equal to '1' would become 1abc.
A finite state automaton (FST) is constructed from the input. The root node has arcs labeled with all possible weights. We cache all these arcs, highest-weight first.
At runtime, in lookup(CharSequence, boolean, int), the automaton is utilized as follows:
The algorithm described above is optimized for finding suggestions to short prefixes in a top-weights-first order. This is probably the most common use case: it allows presenting suggestions early and sorts them by the global frequency (and then alphabetically).
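The top-weights-first behavior can be illustrated without an automaton. The sketch below is not the FSTLookup implementation: it swaps the FST for a sorted set of weight-prefixed strings, and the class name, the single-digit bucket prefix, the limit of 10 buckets, and the linear bucketing formula are all assumptions made for illustration. It also ignores exact-match handling.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical stand-in for the automaton: a sorted set of weight-prefixed terms. */
final class BucketedPrefixSketch {
    private final int buckets;
    // TreeSet<String> orders entries by String.compareTo (UTF-16 code units).
    private final TreeSet<String> entries = new TreeSet<>();

    BucketedPrefixSketch(int buckets, Map<String, Float> termWeights) {
        if (buckets < 1 || buckets > 10) {
            // Assumption of this sketch: one digit character encodes the bucket.
            throw new IllegalArgumentException("this sketch supports 1..10 buckets");
        }
        this.buckets = buckets;

        // Determine the range of weights.
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float w : termWeights.values()) {
            min = Math.min(min, w);
            max = Math.max(max, w);
        }
        float range = max - min;

        // Discretize each weight into a bucket and prepend the bucket to the term.
        for (Map.Entry<String, Float> e : termWeights.entrySet()) {
            int bucket = range == 0f
                ? 0
                : Math.min(buckets - 1, (int) ((e.getValue() - min) / range * buckets));
            entries.add(prefix(bucket) + e.getKey());
        }
    }

    private static String prefix(int bucket) {
        return String.valueOf((char) ('0' + bucket));
    }

    /** Returns up to num completions of key, highest bucket first, sorted within a bucket. */
    List<String> lookup(String key, int num) {
        List<String> results = new ArrayList<>();
        for (int bucket = buckets - 1; bucket >= 0 && results.size() < num; bucket--) {
            String wanted = prefix(bucket) + key;
            // Jump to the first entry >= wanted, then scan while the prefix still matches.
            for (String entry : entries.tailSet(wanted, true)) {
                if (!entry.startsWith(wanted) || results.size() >= num) {
                    break;
                }
                results.add(entry.substring(1)); // strip the synthetic bucket character
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("apple", 10f);
        weights.put("apricot", 1f);
        weights.put("april", 10f);
        weights.put("banana", 5f);
        System.out.println(new BucketedPrefixSketch(4, weights).lookup("ap", 10));
    }
}
```

With the sample data in main, the lookup for "ap" returns apple and april (both in the top bucket, in sorted order) before apricot (lowest bucket). Giving every term the same weight puts all terms in one bucket, so the results come back in purely sorted order.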
If there is an exact match in the automaton, it is returned first on the results list (even with by-weight sorting).
Note that the maximum lookup time for any prefix is the time of descending to the subtree, plus traversal of the subtree up to the number of requested suggestions (because they are already pre-sorted by weight on the root level and alphabetically at any node level).
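For Java strings, "alphabetically" usually means the natural order of String.compareTo, which compares UTF-16 code units rather than applying locale-aware collation. The following minimal sketch (class and method names are hypothetical, and it assumes this natural order is the one meant here) shows the practical consequence: uppercase ASCII letters sort before lowercase ones, and accented letters sort after both.

```java
import java.util.Arrays;

/** Hypothetical demo of String natural ordering (UTF-16 code unit comparison). */
final class Utf16OrderSketch {
    /** Returns a sorted copy of the given terms using String.compareTo. */
    static String[] sorted(String... terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy); // natural order: compares UTF-16 code units, no collation
        return copy;
    }

    public static void main(String[] args) {
        // "\u00e9clair" is "eclair" with an acute accent on the first letter.
        System.out.println(Arrays.toString(sorted("apple", "Zebra", "\u00e9clair", "banana")));
    }
}
```

Here "Zebra" sorts first because 'Z' (U+005A) is smaller than 'a' (U+0061), and the accented term sorts last because U+00E9 is larger than any ASCII letter.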
To order alphabetically only (no ordering by priorities), use identical term weights for all terms. Alphabetical suggestions are returned even if non-constant weights are used, but the algorithm for doing this is suboptimal.
"Alphabetically" in any of the documentation above indicates UTF-16 codepoint order, nothing else.
Nested classes/interfaces inherited from class Lookup: Lookup.LookupPriorityQueue, Lookup.LookupResult
Modifier and Type  Field and Description
static String  FILENAME  Deprecated. Serialized automaton file name (storage).
Fields inherited from class Lookup: CHARSEQUENCE_COMPARATOR
Constructor and Description
FSTLookup()  Deprecated.
FSTLookup(int buckets, boolean exactMatchFirst)  Deprecated.
Modifier and Type  Method and Description
void  build(TermFreqIterator tfit)  Deprecated. Builds up a new internal Lookup representation based on the given TermFreqIterator.
Float  get(CharSequence key)  Deprecated. Get the (approximated) weight of a single key (if there is a perfect match for it in the automaton).
boolean  load(InputStream input)  Deprecated. Discard current lookup data and load it from a previously saved copy.
List<Lookup.LookupResult>  lookup(CharSequence key, boolean onlyMorePopular, int num)  Deprecated. Look up autocomplete suggestions for key.
boolean  store(OutputStream output)  Deprecated. Persist the constructed lookup data to the given OutputStream.
public static final String FILENAME
public FSTLookup()
public FSTLookup(int buckets, boolean exactMatchFirst)
public void build(TermFreqIterator tfit) throws IOException
Description copied from class: Lookup
Builds up a new internal Lookup representation based on the given TermFreqIterator. The implementation might re-sort the data internally.
Specified by: build in class Lookup
Throws:
IOException
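public Float get(CharSequence key)
Get the (approximated) weight of a single key (if there is a perfect match for it in the automaton).
Returns: the approximated weight of the input key, or null if not found.
public List<Lookup.LookupResult> lookup(CharSequence key, boolean onlyMorePopular, int num)
Look up autocomplete suggestions for key.
Specified by: lookup in class Lookup
Parameters:
key - The prefix to which suggestions should be sought.
onlyMorePopular - Return most popular suggestions first. This is the default behavior for this implementation. Setting it to false has no effect (use constant term weights to sort alphabetically only).
num - At most this number of suggestions will be returned.
public boolean store(OutputStream output) throws IOException
Description copied from class: Lookup
Persist the constructed lookup data to the given OutputStream.
Specified by: store in class Lookup
Parameters:
output - the OutputStream to write the data to.
Throws:
IOException - when fatal IO error occurs.
public boolean load(InputStream input) throws IOException
Description copied from class: Lookup
Discard current lookup data and load it from a previously saved copy.
Specified by: load in class Lookup
Parameters:
input - the InputStream to load the lookup data from.
Throws:
IOException - when fatal IO error occurs.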