public class FSTCompletionLookup extends Lookup implements Accountable
An adapter from the Lookup API to FSTCompletion.
This adapter differs from FSTCompletion in that it attempts to discretize any "weights" passed in from InputIterator.weight() to match the number of buckets. For the rationale for bucketing, see FSTCompletion.
Note: Discretization requires an additional sorting pass.
The range of weights for bucketing/discretization is determined by sorting the input by weight and then dividing it into equal ranges. Scores within each range are then assigned to that bucket.
Note that this means that even large differences in weights may be lost during automaton construction, but the overall distinction between "classes" of weights will be preserved regardless of the distribution of weights.
For fine-grained control over which weights are assigned to which buckets, use FSTCompletion directly or TSTLookup, for example.
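The bucketing described above can be sketched in plain Java. This is a minimal illustration, not the library's code: the class `BucketSketch` and the method `discretize` are hypothetical names, and the sketch assumes that "equal ranges" means equal-sized slices of the weight-sorted input and that equal weights share a bucket.

```java
import java.util.Arrays;

class BucketSketch {
    /**
     * Returns, for each input weight, a bucket number in [0, buckets).
     * Assumption: buckets are equal-sized slices of the input sorted by weight.
     */
    static int[] discretize(long[] weights, int buckets) {
        int n = weights.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // The additional sorting pass: order input positions by ascending weight.
        Arrays.sort(order, (a, b) -> Long.compare(weights[a], weights[b]));

        int[] result = new int[n];
        for (int rank = 0; rank < n; rank++) {
            int idx = order[rank];
            if (rank > 0 && weights[idx] == weights[order[rank - 1]]) {
                // Assumption: equal weights stay in the same bucket.
                result[idx] = result[order[rank - 1]];
            } else {
                result[idx] = (int) ((long) rank * buckets / n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[] weights = {10L, 1000000L, 20L, 30L}; // made-up weights
        System.out.println(Arrays.toString(discretize(weights, 2))); // prints [0, 1, 0, 1]
    }
}
```

With two buckets, the weights 30 and 1000000 land in the same bucket, which shows how a large difference in weight can be lost while the split between "low" and "high" classes is kept.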
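See Also: FSTCompletion
Nested classes/interfaces inherited from class Lookup: Lookup.LookupPriorityQueue, Lookup.LookupResult
Fields inherited from class Lookup: CHARSEQUENCE_COMPARATOR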
Constructor and Description |
---|
FSTCompletionLookup(): This constructor should only be used to read a previously saved suggester. |
FSTCompletionLookup(Directory tempDir, String tempFileNamePrefix): This constructor prepares for creating a suggester FST using the build(InputIterator) method. |
FSTCompletionLookup(Directory tempDir, String tempFileNamePrefix, FSTCompletion completion, boolean exactMatchFirst): This constructor takes a pre-built automaton. |
FSTCompletionLookup(Directory tempDir, String tempFileNamePrefix, int buckets, boolean exactMatchFirst): This constructor prepares for creating a suggester FST using the build(InputIterator) method. |
Modifier and Type | Method and Description |
---|---|
void | build(InputIterator iterator): Builds up a new internal Lookup representation based on the given InputIterator. |
Object | get(CharSequence key): Returns the bucket (weight) as a Long for the provided key if it exists, otherwise null. |
Collection<Accountable> | getChildResources() |
long | getCount(): Gets the number of entries the lookup was built with. |
boolean | load(DataInput input): Discards the current lookup data and loads it from a previously saved copy. |
List<Lookup.LookupResult> | lookup(CharSequence key, Set<BytesRef> contexts, boolean higherWeightsFirst, int num): Looks up a key and returns possible completions for this key. |
long | ramBytesUsed() |
boolean | store(DataOutput output): Persists the constructed lookup data to a directory. |
public FSTCompletionLookup()
This constructor should only be used to read a previously saved suggester.

public FSTCompletionLookup(Directory tempDir, String tempFileNamePrefix)
This constructor prepares for creating a suggester FST using the build(InputIterator) method. The number of weight discretization buckets is set to FSTCompletion.DEFAULT_BUCKETS and exact matches are promoted to the top of the suggestions list.

public FSTCompletionLookup(Directory tempDir, String tempFileNamePrefix, int buckets, boolean exactMatchFirst)
This constructor prepares for creating a suggester FST using the build(InputIterator) method.
Parameters:
buckets - The number of weight discretization buckets (see FSTCompletion for details).
exactMatchFirst - If true, exact matches are promoted to the top of the suggestions list. Otherwise they appear in the order of discretized weight and alphabetically within the bucket.

public FSTCompletionLookup(Directory tempDir, String tempFileNamePrefix, FSTCompletion completion, boolean exactMatchFirst)
This constructor takes a pre-built automaton.
Parameters:
completion - An instance of FSTCompletion.
exactMatchFirst - If true, exact matches are promoted to the top of the suggestions list. Otherwise they appear in the order of discretized weight and alphabetically within the bucket.

public void build(InputIterator iterator) throws IOException
Description copied from class: Lookup
Builds up a new internal Lookup representation based on the given InputIterator. The implementation might re-sort the data internally.
Specified by: build in class Lookup
Throws: IOException
public List<Lookup.LookupResult> lookup(CharSequence key, Set<BytesRef> contexts, boolean higherWeightsFirst, int num)
Description copied from class: Lookup
Look up a key and return possible completions for this key.
Specified by: lookup in class Lookup
Parameters:
key - lookup key. Depending on the implementation this may be a prefix, misspelling, or even infix.
contexts - contexts to filter the lookup by, or null if all contexts are allowed; if the suggestion contains any of the contexts, it's a match
higherWeightsFirst - return only more popular results
num - maximum number of results to return

public Object get(CharSequence key)
Returns the bucket (weight) as a Long for the provided key if it exists, otherwise null if it does not.
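To make the documented ordering concrete, the sketch below imitates the effect of the exactMatchFirst constructor flag and the num parameter on a list of prefix matches. It does not use the Lucene classes: `OrderingSketch`, `Entry`, and `suggest` are hypothetical names, the keys and bucket values are made up, and the ordering rule (higher bucket first, alphabetical within a bucket, optional promotion of the exact match) is taken from the parameter descriptions above, not from the library's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OrderingSketch {
    /** Hypothetical stand-in for a suggestion: a key and its discretized weight. */
    static final class Entry {
        final String key;
        final int bucket;

        Entry(String key, int bucket) {
            this.key = key;
            this.bucket = bucket;
        }
    }

    static List<String> suggest(List<Entry> entries, String prefix, int num, boolean exactMatchFirst) {
        // Keep only the entries that complete the given prefix.
        List<Entry> matches = new ArrayList<>();
        for (Entry e : entries) {
            if (e.key.startsWith(prefix)) {
                matches.add(e);
            }
        }
        // Higher buckets first, alphabetical within a bucket.
        matches.sort((a, b) -> {
            if (a.bucket != b.bucket) {
                return Integer.compare(b.bucket, a.bucket);
            }
            return a.key.compareTo(b.key);
        });
        // Optionally promote an exact match to the top of the list.
        if (exactMatchFirst) {
            for (int i = 0; i < matches.size(); i++) {
                if (matches.get(i).key.equals(prefix)) {
                    matches.add(0, matches.remove(i));
                    break;
                }
            }
        }
        // Return at most num keys.
        List<String> out = new ArrayList<>();
        for (int i = 0; i < Math.min(num, matches.size()); i++) {
            out.add(matches.get(i).key);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("four", 0), new Entry("fourier", 2), new Entry("fourty", 1),
                new Entry("fourteen", 1), new Entry("five", 2));
        System.out.println(suggest(entries, "four", 3, false)); // prints [fourier, fourteen, fourty]
        System.out.println(suggest(entries, "four", 3, true));  // prints [four, fourier, fourteen]
    }
}
```

Without promotion, the low-weight exact match "four" falls outside the top three; with promotion, it takes the first slot and pushes "fourty" out.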
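public boolean store(DataOutput output) throws IOException
Description copied from class: Lookup
Persist the constructed lookup data to a directory.
Specified by: store in class Lookup
Parameters:
output - DataOutput to write the data to.
Throws: IOException - when a fatal IO error occurs.

public boolean load(DataInput input) throws IOException
Description copied from class: Lookup
Discard current lookup data and load it from a previously saved copy.
Specified by: load in class Lookup
Parameters:
input - the DataInput to load the lookup data from.
Throws: IOException - when a fatal IO error occurs.

public long ramBytesUsed()
Specified by: ramBytesUsed in interface Accountable

public Collection<Accountable> getChildResources()
Specified by: getChildResources in interface Accountable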
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.