public final class Util extends Object
Modifier and Type | Class and Description |
---|---|
static class | Util.MinResult<T>: Holds a single input (IntsRef) + output, returned by shortestPaths(org.apache.lucene.util.fst.FST<T>, org.apache.lucene.util.fst.FST.Arc<T>, java.util.Comparator<T>, int). |
Modifier and Type | Method and Description |
---|---|
static <T> T | get(FST<T> fst, BytesRef input): Looks up the output for this input, or null if the input is not accepted. |
static <T> T | get(FST<T> fst, IntsRef input): Looks up the output for this input, or null if the input is not accepted. |
static IntsRef | getByOutput(FST<Long> fst, long targetOutput): Reverse lookup (lookup by output instead of by input), in the special case when your FST's outputs are strictly ascending. |
static <T> Util.MinResult<T>[] | shortestPaths(FST<T> fst, FST.Arc<T> fromNode, Comparator<T> comparator, int topN): Starting from node, finds the top N min-cost completions to a final node. |
static BytesRef | toBytesRef(IntsRef input, BytesRef scratch): Just converts IntsRef to BytesRef; you must ensure the int values fit into a byte. |
static <T> void | toDot(FST<T> fst, Writer out, boolean sameRank, boolean labelStates): Dumps an FST to a GraphViz dot language description for visualization. |
static IntsRef | toIntsRef(BytesRef input, IntsRef scratch): Just takes unsigned byte values from the BytesRef and converts them into an IntsRef. |
static IntsRef | toUTF32(char[] s, int offset, int length, IntsRef scratch): Decodes the Unicode codepoints from the provided char[] and places them in the provided scratch IntsRef, which must not be null, returning it. |
static IntsRef | toUTF32(CharSequence s, IntsRef scratch): Decodes the Unicode codepoints from the provided CharSequence and places them in the provided scratch IntsRef, which must not be null, returning it. |
public static <T> T get(FST<T> fst, IntsRef input) throws IOException
Throws: IOException
public static <T> T get(FST<T> fst, BytesRef input) throws IOException
Throws: IOException
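To illustrate what "looks up the output for this input, or null if the input is not accepted" means, here is a minimal, Lucene-free sketch. It assumes a tiny hand-built transducer whose outputs are long values summed along the path; `LookupSketch`, `Arc`, `ARCS`, and `FINAL` are hypothetical names for this example only and are not part of the Lucene API.

```java
// Hypothetical, Lucene-free illustration of a transducer lookup.
class LookupSketch {
    static final class Arc {
        final char label;   // input symbol consumed by this arc
        final long output;  // partial output contributed by this arc
        final int target;   // id of the node this arc leads to
        Arc(char label, long output, int target) {
            this.label = label;
            this.output = output;
            this.target = target;
        }
    }

    // Accepts "cat" -> 5, "cow" -> 7, "dog" -> 2. Node 0 is the start node.
    static final Arc[][] ARCS = {
        { new Arc('c', 5, 1), new Arc('d', 2, 4) },
        { new Arc('a', 0, 2), new Arc('o', 2, 3) },
        { new Arc('t', 0, 6) },
        { new Arc('w', 0, 6) },
        { new Arc('o', 0, 5) },
        { new Arc('g', 0, 6) },
        { }
    };
    static final boolean[] FINAL = { false, false, false, false, false, false, true };

    // Follows one arc per input symbol, summing outputs; null means "not accepted".
    static Long get(String input) {
        long sum = 0;
        int node = 0;
        for (int i = 0; i < input.length(); i++) {
            Arc match = null;
            for (Arc a : ARCS[node]) {
                if (a.label == input.charAt(i)) {
                    match = a;
                    break;
                }
            }
            if (match == null) {
                return null; // no arc for this symbol
            }
            sum += match.output;
            node = match.target;
        }
        // The input is accepted only if it ends on a final node.
        return FINAL[node] ? Long.valueOf(sum) : null;
    }

    public static void main(String[] args) {
        System.out.println(get("cow"));
        System.out.println(get("ca"));
    }
}
```

A prefix such as "ca" ends on a non-final node, so the sketch returns null for it, just as an unaccepted input would.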
public static IntsRef getByOutput(FST<Long> fst, long targetOutput) throws IOException
NOTE: this only works with FST<Long> and PositiveIntOutputs.getSingleton(boolean).
For example, simple ordinals (0, 1, 2, ...) or file offsets (when appending to a file) fit this.
Throws: IOException
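The reason strictly ascending outputs make a reverse lookup possible can be sketched without Lucene. The example below assumes a small hand-built transducer in which arcs are sorted by label and outputs are summed along the path; at each node it takes the last arc whose accumulated output does not exceed the target. `ReverseLookupSketch`, `Arc`, `ARCS`, and `FINAL` are hypothetical names, and this is not Lucene's implementation.

```java
// Hypothetical, Lucene-free illustration of lookup by output.
class ReverseLookupSketch {
    static final class Arc {
        final char label;
        final long output;
        final int target;
        Arc(char label, long output, int target) {
            this.label = label;
            this.output = output;
            this.target = target;
        }
    }

    // Accepts "cat" -> 0, "cow" -> 1, "dog" -> 2 (simple ordinals).
    // Arcs leaving each node are sorted by label, so their outputs ascend too.
    static final Arc[][] ARCS = {
        { new Arc('c', 0, 1), new Arc('d', 2, 4) },
        { new Arc('a', 0, 2), new Arc('o', 1, 3) },
        { new Arc('t', 0, 6) },
        { new Arc('w', 0, 6) },
        { new Arc('o', 0, 5) },
        { new Arc('g', 0, 6) },
        { }
    };
    static final boolean[] FINAL = { false, false, false, false, false, false, true };

    // Returns the input whose output equals targetOutput, or null if none exists.
    static String getByOutput(long targetOutput) {
        StringBuilder input = new StringBuilder();
        long sum = 0;
        int node = 0;
        while (true) {
            if (FINAL[node] && sum == targetOutput) {
                return input.toString();
            }
            // Pick the last arc that does not overshoot the target.
            Arc chosen = null;
            for (Arc a : ARCS[node]) {
                if (sum + a.output <= targetOutput) {
                    chosen = a;
                } else {
                    break;
                }
            }
            if (chosen == null) {
                return null; // dead end: no input has this output
            }
            input.append(chosen.label);
            sum += chosen.output;
            node = chosen.target;
        }
    }

    public static void main(String[] args) {
        System.out.println(getByOutput(1));
        System.out.println(getByOutput(3));
    }
}
```

If outputs were not ascending with the inputs, the "last arc that does not overshoot" rule could descend into the wrong subtree, which is why the restriction matters.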
public static <T> Util.MinResult<T>[] shortestPaths(FST<T> fst, FST.Arc<T> fromNode, Comparator<T> comparator, int topN) throws IOException
NOTE: you must share the outputs when you build the FST (pass doShare=true to PositiveIntOutputs.getSingleton(boolean)).
Throws: IOException
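As a rough picture of what "top N min cost completions to a final node" involves, the following sketch runs a best-first search with a priority queue over a small graph. It assumes an acyclic graph with non-negative long costs rather than a generic output type with a Comparator; `TopNPathsSketch`, `Arc`, `Path`, and `demo` are hypothetical names, and the code is not Lucene's algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical, Lucene-free illustration of a top-N cheapest-path search.
class TopNPathsSketch {
    static final class Arc {
        final char label;
        final long cost;   // assumed non-negative
        final int target;
        Arc(char label, long cost, int target) {
            this.label = label;
            this.cost = cost;
            this.target = target;
        }
    }

    static final class Path {
        final String input; // labels consumed so far
        final long cost;    // total cost so far
        final int node;     // node reached
        Path(String input, long cost, int node) {
            this.input = input;
            this.cost = cost;
            this.node = node;
        }
    }

    // Expands the cheapest partial path first; because costs are non-negative,
    // paths reach final nodes in non-decreasing order of total cost.
    static List<Path> topN(Arc[][] arcs, boolean[] isFinal, int start, int topN) {
        PriorityQueue<Path> queue =
            new PriorityQueue<>((a, b) -> Long.compare(a.cost, b.cost));
        queue.add(new Path("", 0, start));
        List<Path> results = new ArrayList<>();
        while (!queue.isEmpty() && results.size() < topN) {
            Path p = queue.poll();
            if (isFinal[p.node]) {
                results.add(p);
            }
            for (Arc a : arcs[p.node]) {
                queue.add(new Path(p.input + a.label, p.cost + a.cost, a.target));
            }
        }
        return results;
    }

    // Example graph with three complete paths: "ay" (3), "bz" (4), "ax" (6).
    static List<Path> demo(int topN) {
        Arc[][] arcs = {
            { new Arc('a', 1, 1), new Arc('b', 4, 2) },
            { new Arc('x', 5, 3), new Arc('y', 2, 3) },
            { new Arc('z', 0, 3) },
            { }
        };
        boolean[] isFinal = { false, false, false, true };
        return topN(arcs, isFinal, 0, topN);
    }

    public static void main(String[] args) {
        for (Path p : demo(2)) {
            System.out.println(p.input + " " + p.cost);
        }
    }
}
```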
public static <T> void toDot(FST<T> fst, Writer out, boolean sameRank, boolean labelStates) throws IOException
Dumps an FST to a GraphViz dot language description for visualization. Example of use:
PrintWriter pw = new PrintWriter("out.dot");
Util.toDot(fst, pw, true, true);
pw.close();
and then, from the command line:
dot -Tpng -o out.png out.dot
Note: larger FSTs (a few thousand nodes) won't even render, so don't bother.
Parameters:
sameRank - If true, the resulting dot file will try to order states in layers of breadth-first traversal. This may mess up arcs, but makes the output FST's structure a bit clearer.
labelStates - If true, states will have labels equal to their offsets in their binary format. Expands the graph considerably.
Throws: IOException
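To make "order states in layers of breadth-first traversal" concrete, the sketch below emits dot text in which all nodes of one breadth-first layer share a `rank=same` group. It works on a plain adjacency array and does not reproduce Lucene's actual output; `DotLayersSketch` and its `toDot` method are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free illustration of rank=same layering in dot output.
class DotLayersSketch {
    // edges[n] holds the target node ids of node n.
    static String toDot(int[][] edges, int start) {
        StringBuilder sb = new StringBuilder("digraph G {\n  rankdir = LR;\n");
        boolean[] seen = new boolean[edges.length];
        List<Integer> layer = new ArrayList<>();
        layer.add(start);
        seen[start] = true;
        while (!layer.isEmpty()) {
            // All nodes discovered at the same BFS depth get the same rank.
            sb.append("  { rank=same;");
            for (int n : layer) {
                sb.append(' ').append(n).append(';');
            }
            sb.append(" }\n");
            List<Integer> next = new ArrayList<>();
            for (int n : layer) {
                for (int t : edges[n]) {
                    sb.append("  ").append(n).append(" -> ").append(t).append(";\n");
                    if (!seen[t]) {
                        seen[t] = true;
                        next.add(t);
                    }
                }
            }
            layer = next;
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        int[][] edges = { { 1, 2 }, { 3 }, { 3 }, { } };
        System.out.print(toDot(edges, 0));
    }
}
```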
public static IntsRef toUTF32(CharSequence s, IntsRef scratch)
public static IntsRef toUTF32(char[] s, int offset, int length, IntsRef scratch)
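The decoding that both toUTF32 overloads describe can be sketched with JDK calls alone. The example below returns a plain int[] instead of filling an IntsRef; `CodePointSketch` and `toCodePoints` are hypothetical names, and this is an illustration rather than Lucene's implementation.

```java
import java.util.Arrays;

// Hypothetical, Lucene-free illustration of UTF-16 to code point decoding.
class CodePointSketch {
    // Decodes the UTF-16 content of s into one int per Unicode code point.
    static int[] toCodePoints(CharSequence s) {
        int[] out = new int[s.length()]; // upper bound: one code point per char
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = Character.codePointAt(s, i);
            out[count++] = cp;
            i += Character.charCount(cp); // 2 for a surrogate pair, otherwise 1
        }
        return Arrays.copyOf(out, count);
    }

    public static void main(String[] args) {
        // 'a', then U+1D11E (a surrogate pair in UTF-16), then 'b'
        System.out.println(Arrays.toString(toCodePoints("a\uD834\uDD1Eb")));
    }
}
```

The four chars in the example decode to three code points, which is why the result can be shorter than the input.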
public static IntsRef toIntsRef(BytesRef input, IntsRef scratch)
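Because Java bytes are signed, "takes unsigned byte values" means masking each byte before widening it. The sketch below shows that step on plain arrays, along with the reverse direction that toBytesRef covers; `UnsignedByteSketch` and its methods are hypothetical names, and the range check in `toBytes` is an addition for this example (the summary above says the caller must ensure the values fit).

```java
// Hypothetical, Lucene-free illustration of byte <-> int conversion.
class UnsignedByteSketch {
    // Widens each byte to an int in the range 0..255.
    static int[] toUnsignedInts(byte[] bytes, int offset, int length) {
        int[] out = new int[length];
        for (int i = 0; i < length; i++) {
            out[i] = bytes[offset + i] & 0xFF; // mask removes sign extension
        }
        return out;
    }

    // Narrows ints back to bytes; rejects values that do not fit into a byte.
    static byte[] toBytes(int[] ints) {
        byte[] out = new byte[ints.length];
        for (int i = 0; i < ints.length; i++) {
            if (ints[i] < 0 || ints[i] > 255) {
                throw new IllegalArgumentException("value does not fit into a byte: " + ints[i]);
            }
            out[i] = (byte) ints[i];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xC3, (byte) 0xA9, 0x41 };
        int[] ints = toUnsignedInts(raw, 0, raw.length);
        System.out.println(ints[0] + " " + ints[1] + " " + ints[2]);
    }
}
```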