|
|||||||||
| PREV PACKAGE NEXT PACKAGE | FRAMES NO FRAMES | ||||||||
See:
Description
| Class Summary | |
|---|---|
| Builder<T> | Builds a minimal FST (maps an IntsRef term to an arbitrary output) from pre-sorted terms with outputs. |
| Builder.Arc<T> | Expert: holds a pending (seen but not yet serialized) arc. |
| Builder.FreezeTail<T> | Expert: this is invoked by Builder whenever a suffix is serialized. |
| Builder.UnCompiledNode<T> | Expert: holds a pending (seen but not yet serialized) Node. |
| ByteSequenceOutputs | An FST Outputs implementation where each output
is a sequence of bytes. |
| BytesRefFSTEnum<T> | Enumerates all input (BytesRef) + output pairs in an FST. |
| BytesRefFSTEnum.InputOutput<T> | Holds a single input (BytesRef) + output pair. |
| CharSequenceOutputs | An FST Outputs implementation where each output
is a sequence of characters. |
| FST<T> | Represents an finite state machine (FST), using a compact byte[] format. |
| FST.Arc<T> | Represents a single arc. |
| FST.BytesReader | Reads bytes stored in an FST. |
| IntSequenceOutputs | An FST Outputs implementation where each output
is a sequence of ints. |
| IntsRefFSTEnum<T> | Enumerates all input (IntsRef) + output pairs in an FST. |
| IntsRefFSTEnum.InputOutput<T> | Holds a single input (IntsRef) + output pair. |
| NoOutputs | A null FST Outputs implementation; use this if
you just want to build an FSA. |
| Outputs<T> | Represents the outputs for an FST, providing the basic algebra required for building and traversing the FST. |
| PairOutputs<A,B> | An FST Outputs implementation, holding two other outputs. |
| PairOutputs.Pair<A,B> | Holds a single pair of two outputs. |
| PositiveIntOutputs | An FST Outputs implementation where each output
is a non-negative long value. |
| Util | Static helper methods. |
| Util.FSTPath<T> | Represents a path in TopNSearcher. |
| Util.MinResult<T> | Holds a single input (IntsRef) + output, returned by
shortestPaths(). |
| Util.TopNSearcher<T> | Utility class to find top N shortest paths from start point(s). |
| Enum Summary | |
|---|---|
| FST.INPUT_TYPE | Specifies allowed range of each int input label for this FST. |
Finite state transducers
This package implements Finite State Transducers with the following characteristics:
FST.pack()Lookup-by-output when the
outputs are in sorted order (e.g., ordinals or file pointers)Outputs representationN-shortest-paths search by
weightIntsRef and BytesRef) that behave like SortedMap iterators
FST Construction example:
// Input values (keys). These must be provided to Builder in Unicode sorted order!
String inputValues[] = {"cat", "dog", "dogs"};
long outputValues[] = {5, 7, 12};
PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
Builder<Long> builder = new Builder<Long>(INPUT_TYPE.BYTE1, outputs);
BytesRef scratchBytes = new BytesRef();
IntsRef scratchInts = new IntsRef();
for (int i = 0; i < inputValues.length; i++) {
scratchBytes.copyChars(inputValues[i]);
builder.add(Util.toIntsRef(scratchBytes, scratchInts), outputValues[i]);
}
FST<Long> fst = builder.finish();
Retrieval by key:
Long value = Util.get(fst, new BytesRef("dog"));
System.out.println(value); // 7
Retrieval by value:
// Only works because outputs are also in sorted order
IntsRef key = Util.getByOutput(fst, 12);
System.out.println(Util.toBytesRef(key, scratchBytes).utf8ToString()); // dogs
Iterate over key-value pairs in sorted order:
// Like TermsEnum, this also supports seeking (advance)
BytesRefFSTEnum<Long> iterator = new BytesRefFSTEnum<Long>(fst);
while (iterator.next() != null) {
InputOutput<Long> mapEntry = iterator.current();
System.out.println(mapEntry.input.utf8ToString());
System.out.println(mapEntry.output);
}
N-shortest paths by weight:
Comparator<Long> comparator = new Comparator<Long>() {
public int compare(Long left, Long right) {
return left.compareTo(right);
}
};
Arc<Long> firstArc = fst.getFirstArc(new Arc<Long>());
MinResult<Long> paths[] = Util.shortestPaths(fst, firstArc, comparator, 2);
System.out.println(Util.toBytesRef(paths[0].input, scratchBytes).utf8ToString()); // cat
System.out.println(paths[0].output); // 5
System.out.println(Util.toBytesRef(paths[1].input, scratchBytes).utf8ToString()); // dog
System.out.println(paths[1].output); // 7
|
|||||||||
| PREV PACKAGE NEXT PACKAGE | FRAMES NO FRAMES | ||||||||