Deprecated. Use FSTCompletionLookup instead.

@Deprecated
public class FSTLookup extends Lookup

Finite state automaton based implementation of the Lookup query suggestion/autocomplete interface.
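The construction step in build(TermFreqIterator) works as follows:

- The input weights are discretized into a fixed set of values (buckets). Note that this means that minor changes in weights may be lost during automaton construction. In general, this is not a big problem because the "priorities" of completions can be split into a fixed set of classes (even as rough as: very frequent, frequent, baseline, marginal). If you need exact, fine-grained weights, use TSTLookup instead.
- Each term is prefixed with its discretized weight. For example, a term abc with a discretized weight equal to '1' would become 1abc.
- A finite state automaton (FST) is constructed from the input. The root node has arcs labeled with all possible weights. We cache all these arcs, highest-weight first.

At runtime, in lookup(CharSequence, boolean, int), the automaton is traversed from the cached root arcs, highest weight first: for each weight, the lookup descends along the key and then traverses the subtree below it, collecting completions until the requested number of suggestions has been reached.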
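The discretization and prefixing steps of the construction phase can be sketched in plain Java. This is only an illustration under stated assumptions, not the Lucene implementation: the class `WeightBucketSketch`, its methods, the linear bucketing formula, and the use of a single digit as the weight prefix are all hypothetical choices made for readability.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the Lucene implementation. */
final class WeightBucketSketch {

    /** Maps a raw weight to a bucket in the range 0 (lowest) .. buckets - 1 (highest). */
    static int discretize(float weight, float min, float max, int buckets) {
        if (max <= min) {
            return 0; // all weights are identical: a single class suffices
        }
        // Assumption: linear scaling of the weight range onto the buckets.
        int bucket = (int) ((weight - min) / (max - min) * buckets);
        return Math.min(bucket, buckets - 1); // the maximum weight lands in the top bucket
    }

    /** Prefixes every term with its bucket digit, then sorts by char values. */
    static List<String> encode(Map<String, Float> termWeights, int buckets) {
        if (buckets < 1 || buckets > 10) {
            throw new IllegalArgumentException("this sketch uses one digit per bucket: 1..10");
        }
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float w : termWeights.values()) {
            min = Math.min(min, w);
            max = Math.max(max, w);
        }
        List<String> encoded = new ArrayList<>();
        for (Map.Entry<String, Float> e : termWeights.entrySet()) {
            int bucket = discretize(e.getValue(), min, max, buckets);
            encoded.add((char) ('0' + bucket) + e.getKey()); // e.g. "abc" becomes "1abc"
        }
        Collections.sort(encoded); // natural String order: UTF-16 char values
        return encoded;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("xyz", 95f);
        weights.put("abc", 12f);
        weights.put("abd", 13f);
        // 12 and 13 fall into the same bucket, so their difference is lost.
        System.out.println(encode(weights, 4)); // prints [0abc, 0abd, 3xyz]
    }
}
```

In the example, the weights 12 and 13 end up in the same bucket, which is the loss of "minor changes in weights" mentioned above.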
The algorithm described above is optimized for finding suggestions for short prefixes in a top-weights-first order. This is probably the most common use case: it allows presenting suggestions early and sorts them by the global frequency (and then alphabetically).
If there is an exact match in the automaton, it is returned first on the results list (even with by-weight sorting).
Note that the maximum lookup time for any prefix is the time of descending to the subtree, plus traversal of the subtree up to the number of requested suggestions (because they are already presorted by weight on the root level and alphabetically at any node level).
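The lookup bound described above can be illustrated with a small stand-in. The sketch below deliberately swaps the automaton for one sorted set per weight bucket (a `TreeSet`), so it is not how FSTLookup stores its data. It only shows the control flow: buckets are visited highest first, terms within a bucket come out in sorted order, and the scan stops as soon as the requested number of suggestions is collected. The class `BucketedPrefixLookupSketch` and its methods are hypothetical, and exact-match promotion is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative stand-in for the automaton: one sorted set of terms per weight bucket. */
final class BucketedPrefixLookupSketch {
    private final List<TreeSet<String>> byBucket = new ArrayList<>();

    BucketedPrefixLookupSketch(int buckets) {
        for (int i = 0; i < buckets; i++) {
            byBucket.add(new TreeSet<>());
        }
    }

    void add(String term, int bucket) {
        byBucket.get(bucket).add(term);
    }

    /** Returns at most num completions of prefix, highest bucket first, then sorted. */
    List<String> lookup(String prefix, int num) {
        List<String> results = new ArrayList<>();
        // Visit the weight classes from highest to lowest; stop once enough were found.
        for (int b = byBucket.size() - 1; b >= 0 && results.size() < num; b--) {
            // All terms sharing the prefix form one contiguous run starting at tailSet(prefix).
            for (String term : byBucket.get(b).tailSet(prefix)) {
                if (!term.startsWith(prefix) || results.size() == num) {
                    break; // left the run of completions, or collected enough
                }
                results.add(term);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        BucketedPrefixLookupSketch index = new BucketedPrefixLookupSketch(4);
        index.add("apple", 3);
        index.add("apply", 3);
        index.add("apex", 1);
        index.add("april", 1);
        index.add("ape", 0);
        System.out.println(index.lookup("ap", 3)); // prints [apple, apply, apex]
    }
}
```

Because each bucket is already sorted and buckets are visited in descending order, no sorting of results is needed at query time; the cost is the descent to the prefix plus at most `num` collected terms per call.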
To order alphabetically only (no ordering by priorities), use identical term weights for all terms. Alphabetical suggestions are returned even if non-constant weights are used, but the algorithm for doing this is suboptimal.
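As a minimal illustration of what purely alphabetical output looks like when all weights are identical, the sketch below sorts a few terms with Java's natural `String` ordering. It assumes that this ordering (comparison of UTF-16 char values) corresponds to the alphabetical order used here; the class `Utf16OrderSketch` is hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch: ordering of terms that all share one weight class. */
final class Utf16OrderSketch {

    /** Returns a sorted copy; String.compareTo compares UTF-16 char values. */
    static List<String> sortByCharValues(List<String> terms) {
        List<String> sorted = new ArrayList<>(terms);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("apple", "Zebra", "banana", "Apple");
        // Uppercase letters have smaller char values than lowercase ones.
        System.out.println(sortByCharValues(terms)); // prints [Apple, Zebra, apple, banana]
    }
}
```

Note that this order is not locale-aware: "Zebra" sorts before "apple" because 'Z' has a smaller char value than 'a'.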
"alphabetically" in any of the documentation above indicates utf16 codepoint order, nothing else.
Nested classes/interfaces inherited from class Lookup: Lookup.LookupPriorityQueue, Lookup.LookupResult

| Modifier and Type | Field and Description |
|---|---|
| static String | FILENAME: Deprecated. Serialized automaton file name (storage). |

Fields inherited from class Lookup: CHARSEQUENCE_COMPARATOR

| Constructor and Description |
|---|
| FSTLookup(): Deprecated. |
| FSTLookup(int buckets, boolean exactMatchFirst): Deprecated. |

| Modifier and Type | Method and Description |
|---|---|
| void | build(TermFreqIterator tfit): Deprecated. Builds up a new internal Lookup representation based on the given TermFreqIterator. |
| Float | get(CharSequence key): Deprecated. Get the (approximated) weight of a single key (if there is a perfect match for it in the automaton). |
| boolean | load(InputStream input): Deprecated. Discard current lookup data and load it from a previously saved copy. |
| List<Lookup.LookupResult> | lookup(CharSequence key, boolean onlyMorePopular, int num): Deprecated. Look up autocomplete suggestions for key. |
| boolean | store(OutputStream output): Deprecated. Persist the constructed lookup data to the given output stream. |

public static final String FILENAME
public FSTLookup()
public FSTLookup(int buckets,
boolean exactMatchFirst)
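public void build(TermFreqIterator tfit) throws IOException
Description copied from class: Lookup
Builds up a new internal Lookup representation based on the given TermFreqIterator. The implementation might re-sort the data internally.
Specified by: build in class Lookup
Throws: IOException

public Float get(CharSequence key)
Get the (approximated) weight of a single key (if there is a perfect match for it in the automaton).
Returns: the approximated weight of the key, or null if not found.

public List<Lookup.LookupResult> lookup(CharSequence key, boolean onlyMorePopular, int num)
Look up autocomplete suggestions for key.
Specified by: lookup in class Lookup
Parameters:
key - The prefix for which suggestions should be sought.
onlyMorePopular - Return most popular suggestions first. This is the default behavior for this implementation. Setting it to false has no effect (use constant term weights to sort alphabetically only).
num - At most this number of suggestions will be returned.

public boolean store(OutputStream output) throws IOException
Description copied from class: Lookup
Persist the constructed lookup data to the given output stream.
Specified by: store in class Lookup
Parameters:
output - OutputStream to write the data to.
Throws: IOException - when fatal IO error occurs.

public boolean load(InputStream input) throws IOException
Description copied from class: Lookup
Discard current lookup data and load it from a previously saved copy.
Specified by: load in class Lookup
Parameters:
input - the InputStream to load the lookup data.
Throws: IOException - when fatal IO error occurs.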