Package org.apache.lucene.util.fst
Class FST<T>
java.lang.Object
org.apache.lucene.util.fst.FST<T>
- All Implemented Interfaces:
Accountable
Represents a finite state machine (FST), using a compact byte[] format.
The format is similar to what's used by Morfologik (https://github.com/morfologik/morfologik-stemming).
See the package documentation for some simple examples.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
Nested Class Summary
Modifier and Type | Class | Description
static final class | FST.Arc<T> | Represents a single arc.
static class | FST.BytesReader | Reads bytes stored in an FST.
static enum | FST.INPUT_TYPE | Specifies allowed range of each int input label for this FST.
-
Field Summary
Modifier and Type | Field | Description
static final byte | ARCS_FOR_BINARY_SEARCH | Value of the arc flags to declare a node with fixed length arcs designed for binary search.
static final int | BIT_ARC_HAS_OUTPUT | This flag is set if the arc has an output.
static final int | END_LABEL | If arc has this label then that arc is final/accepted.
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Constructor | Description
FST | Load a previously saved FST.
FST(DataInput metaIn, DataInput in, Outputs<T> outputs, FSTStore fstStore) | Load a previously saved FST; maxBlockBits allows you to control the size of the byte[] pages used to hold the FST bytes.
-
Method Summary
Modifier and Type | Method | Description
FST.Arc<T> | findTargetArc(int labelToMatch, FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in) | Finds an arc leaving the incoming arc, replacing the arc in place.
FST.BytesReader | getBytesReader | Returns a FST.BytesReader for this FST, positioned at position 0.
FST.Arc<T> | getFirstArc(FST.Arc<T> arc) | Fills virtual 'start' arc, i.e., an empty incoming arc to the FST's start node.
long | ramBytesUsed() | Return the memory usage of this object in bytes.
static <T> FST<T> | read | Reads an automaton from a file.
FST.Arc<T> | readArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in, int rangeIndex) | Reads a present direct addressing node arc, with the provided index in the label range.
FST.Arc<T> | readArcByIndex(FST.Arc<T> arc, FST.BytesReader in, int idx) |
FST.Arc<T> | readFirstRealTargetArc(long nodeAddress, FST.Arc<T> arc, FST.BytesReader in) |
FST.Arc<T> | readFirstTargetArc(FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in) | Follow the follow arc and read the first arc of its target; this changes the provided arc (2nd arg) in-place and returns it.
int | readLabel | Reads one BYTE1/2/4 label from the provided DataInput.
FST.Arc<T> | readLastArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in) | Reads the last arc of a direct addressing node.
FST.Arc<T> | readNextArc(FST.Arc<T> arc, FST.BytesReader in) | In-place read; returns the arc.
FST.Arc<T> | readNextRealArc(FST.Arc<T> arc, FST.BytesReader in) | Never returns null, but you should never call this if arc.isLast() is true.
void | save | Writes an automaton to a file.
void | save(DataOutput metaOut, DataOutput out) |
static <T> boolean | targetHasArcs(FST.Arc<T> arc) | Returns true if the node at this address has any outgoing arcs.
String | toString() |
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
BIT_ARC_HAS_OUTPUT
public static final int BIT_ARC_HAS_OUTPUT
This flag is set if the arc has an output.
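As a rough illustration of how a flag bit like this is typically tested and set, here is a minimal, self-contained sketch. The class name ArcFlagsSketch, the helper methods, and both bit values are assumptions made for this example; they are not taken from this class and should not be read as Lucene's actual constants.

```java
// Toy illustration only: the bit positions below are assumed, not Lucene's values.
final class ArcFlagsSketch {
  static final int OTHER_FLAG = 1 << 0;          // hypothetical unrelated flag
  static final int BIT_ARC_HAS_OUTPUT = 1 << 4;  // hypothetical position of the "has output" bit

  /** Returns true if the "has output" bit is set in the given arc flags. */
  static boolean hasOutput(int flags) {
    return (flags & BIT_ARC_HAS_OUTPUT) != 0;
  }

  /** Returns the flags with the "has output" bit set, leaving other bits untouched. */
  static int withOutput(int flags) {
    return flags | BIT_ARC_HAS_OUTPUT;
  }
}
```

Keeping each property in its own bit lets one flags value describe several independent properties of an arc at once.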
-
ARCS_FOR_BINARY_SEARCH
public static final byte ARCS_FOR_BINARY_SEARCH
Value of the arc flags to declare a node with fixed length arcs designed for binary search.
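To show why fixed-length arcs make binary search possible: when every arc of a node occupies the same number of bytes, the i-th arc starts at a computable offset, so labels can be probed without scanning. The sketch below is a toy, not Lucene's encoding. The class FixedArcSearchSketch, the method findArc, and the record layout (one unsigned label byte first, records sorted by label) are assumptions for illustration.

```java
// Toy illustration only: the record layout is assumed, not Lucene's on-disk format.
final class FixedArcSearchSketch {

  /**
   * Binary-searches numArcs fixed-size records starting at nodeStart.
   * Assumes the first byte of each record is the unsigned label and that
   * records are sorted by label. Returns the arc index, or -1 if absent.
   */
  static int findArc(byte[] bytes, int nodeStart, int numArcs, int bytesPerArc, int label) {
    int low = 0;
    int high = numArcs - 1;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      // Fixed record size lets us jump straight to the mid-th arc.
      int midLabel = bytes[nodeStart + mid * bytesPerArc] & 0xFF;
      if (midLabel < label) {
        low = mid + 1;
      } else if (midLabel > label) {
        high = mid - 1;
      } else {
        return mid;
      }
    }
    return -1;
  }
}
```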
-
END_LABEL
public static final int END_LABEL
If arc has this label then that arc is final/accepted.
-
outputs
-
-
Constructor Details
-
FST
Load a previously saved FST.
- Throws:
IOException
-
FST
public FST(DataInput metaIn, DataInput in, Outputs<T> outputs, FSTStore fstStore) throws IOException
Load a previously saved FST; maxBlockBits allows you to control the size of the byte[] pages used to hold the FST bytes.
- Throws:
IOException
-
-
Method Details
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
- Specified by:
ramBytesUsed in interface Accountable
-
toString
-
getEmptyOutput
-
save
- Throws:
IOException
-
save
Writes an automaton to a file.
- Throws:
IOException
-
read
Reads an automaton from a file.
- Throws:
IOException
-
readLabel
Reads one BYTE1/2/4 label from the provided DataInput.
- Throws:
IOException
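BYTE1/2/4 refers to the width of the input labels. The following is a minimal sketch of width-dependent label decoding using java.io.DataInput. The enum LabelWidth, the class LabelReadSketch, and its methods are hypothetical, and the fixed-width big-endian encoding is an assumption: the encoding actually used by this class may differ (for example, it may use variable-length integers).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy illustration only: the encoding here is assumed, not Lucene's format.
final class LabelReadSketch {
  /** Hypothetical label widths, named after the BYTE1/2/4 wording above. */
  enum LabelWidth { BYTE1, BYTE2, BYTE4 }

  /** Reads one label whose size depends on the configured width. */
  static int readLabel(LabelWidth width, DataInput in) throws IOException {
    switch (width) {
      case BYTE1:
        return in.readUnsignedByte();   // 0..255
      case BYTE2:
        return in.readUnsignedShort();  // 0..65535, big-endian
      default:
        return in.readInt();            // 4 bytes, big-endian
    }
  }

  /** Convenience wrapper that decodes a label from a byte array. */
  static int decode(LabelWidth width, byte[] bytes) {
    try {
      return readLabel(width, new DataInputStream(new ByteArrayInputStream(bytes)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```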
-
targetHasArcs
Returns true if the node at this address has any outgoing arcs.
-
getFirstArc
Fills the virtual 'start' arc, i.e., an empty incoming arc to the FST's start node.
-
readFirstTargetArc
public FST.Arc<T> readFirstTargetArc(FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in) throws IOException
Follows the arc passed as follow and reads the first arc of its target; this changes the provided arc (2nd arg) in-place and returns it.
- Returns:
The second argument (arc).
- Throws:
IOException
-
readFirstRealTargetArc
public FST.Arc<T> readFirstRealTargetArc(long nodeAddress, FST.Arc<T> arc, FST.BytesReader in) throws IOException
- Throws:
IOException
-
readNextArc
In-place read; returns the arc.
- Throws:
IOException
-
readArcByIndex
- Throws:
IOException
-
readArcByDirectAddressing
public FST.Arc<T> readArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in, int rangeIndex) throws IOException
Reads a present direct addressing node arc, with the provided index in the label range.
- Parameters:
rangeIndex - The index of the arc in the label range. It must be present. The real arc offset is computed based on the presence bits of the direct addressing node.
- Throws:
IOException
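The offset computation described for rangeIndex can be illustrated with a small, self-contained sketch. It is not Lucene's implementation: the class PresenceBitsSketch, its methods, and the bit layout (bit i, least-significant bit first within each byte, marks the i-th label of the range as present) are assumptions made for illustration. The idea is that the position of an arc among the arcs actually stored equals the number of presence bits set before rangeIndex.

```java
// Toy illustration only: names and bit layout are assumptions, not Lucene's code.
final class PresenceBitsSketch {

  /** Returns true if the bit at rangeIndex is set (least-significant bit first in each byte). */
  static boolean isPresent(byte[] presenceBits, int rangeIndex) {
    return (presenceBits[rangeIndex >> 3] & (1 << (rangeIndex & 7))) != 0;
  }

  /**
   * Returns the position of the arc among the stored arcs, i.e. the number
   * of presence bits set strictly before rangeIndex. The arc must be present.
   */
  static int arcIndex(byte[] presenceBits, int rangeIndex) {
    if (!isPresent(presenceBits, rangeIndex)) {
      throw new IllegalArgumentException("no arc at range index " + rangeIndex);
    }
    int count = 0;
    int fullBytes = rangeIndex >> 3;
    for (int i = 0; i < fullBytes; i++) {
      count += Integer.bitCount(presenceBits[i] & 0xFF);
    }
    int mask = (1 << (rangeIndex & 7)) - 1; // bits below rangeIndex in its own byte
    count += Integer.bitCount(presenceBits[fullBytes] & mask);
    return count;
  }

  /** Byte offset of the arc, assuming every stored arc occupies bytesPerArc bytes. */
  static long arcOffset(long firstArcOffset, int bytesPerArc, byte[] presenceBits, int rangeIndex) {
    return firstArcOffset + (long) bytesPerArc * arcIndex(presenceBits, rangeIndex);
  }
}
```

Because absent labels take no arc storage in this layout, only the presence bits grow with the width of the label range.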
-
readLastArcByDirectAddressing
public FST.Arc<T> readLastArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in) throws IOException
Reads the last arc of a direct addressing node. This method is equivalent to calling readArcByDirectAddressing(Arc, BytesReader, int) with rangeIndex equal to arc.numArcs() - 1, but it is faster.
- Throws:
IOException
-
readNextRealArc
Never returns null, but you should never call this if arc.isLast() is true.
- Throws:
IOException
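The contract above (check isLast() before asking for the next arc) suggests a particular loop shape for visiting all arcs of a node. The sketch below shows that shape on a toy model. ArcIterationSketch, ToyArc, and the readNextArc helper are hypothetical stand-ins and do not use or reproduce this class's API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: ToyArc is a hypothetical stand-in for a positioned arc.
final class ArcIterationSketch {

  static final class ToyArc {
    private final int[] labels;
    private int pos;

    ToyArc(int[] labels) {
      this.labels = labels;
    }

    int label() {
      return labels[pos];
    }

    boolean isLast() {
      return pos == labels.length - 1;
    }
  }

  /** In-place read: advances the given arc and returns the same object. */
  static ToyArc readNextArc(ToyArc arc) {
    if (arc.isLast()) {
      throw new IllegalStateException("already at the last arc");
    }
    arc.pos++;
    return arc;
  }

  /** Visits every arc, checking isLast() before each advance. */
  static List<Integer> labelsOf(int[] labels) {
    List<Integer> seen = new ArrayList<>();
    if (labels.length == 0) {
      return seen;
    }
    ToyArc arc = new ToyArc(labels);
    seen.add(arc.label());
    while (!arc.isLast()) {
      arc = readNextArc(arc);
      seen.add(arc.label());
    }
    return seen;
  }
}
```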
-
findTargetArc
public FST.Arc<T> findTargetArc(int labelToMatch, FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in) throws IOException
Finds an arc leaving the incoming arc, replacing the arc in place. This returns null if the arc was not found, else the incoming arc.
- Throws:
IOException
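To make the "follow an arc by label, or get null" pattern concrete, here is a self-contained toy model of a key lookup. It walks one label at a time, sums the outputs along the path, and gives up as soon as a label has no matching arc. Everything in it (ToyFstLookupSketch, Node, Arc, get, example, and the in-memory object graph) is an assumption for illustration. This class stores arcs in a compact byte[] and its method takes different arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: an object graph, not the compact byte[] format.
final class ToyFstLookupSketch {

  static final class Node {
    final List<Arc> arcs = new ArrayList<>();

    Node addArc(char label, long output, boolean isFinal, Node target) {
      arcs.add(new Arc(label, output, isFinal, target));
      return this;
    }
  }

  static final class Arc {
    final int label;
    final long output;
    final boolean isFinal;
    final Node target;

    Arc(int label, long output, boolean isFinal, Node target) {
      this.label = label;
      this.output = output;
      this.isFinal = isFinal;
      this.target = target;
    }
  }

  /** Returns the arc leaving node with the given label, or null if there is none. */
  static Arc findTargetArc(int labelToMatch, Node node) {
    for (Arc arc : node.arcs) {
      if (arc.label == labelToMatch) {
        return arc;
      }
    }
    return null;
  }

  /** Returns the summed output for key, or null if key is not accepted. */
  static Long get(Node start, String key) {
    Node node = start;
    Arc arc = null;
    long sum = 0;
    for (int i = 0; i < key.length(); i++) {
      arc = findTargetArc(key.charAt(i), node);
      if (arc == null) {
        return null; // no arc with this label: key is absent
      }
      sum += arc.output;
      node = arc.target;
    }
    return (arc != null && arc.isFinal) ? Long.valueOf(sum) : null;
  }

  /** Builds a toy machine accepting cat=5, dog=7, dogs=12. */
  static Node example() {
    Node end = new Node();
    Node afterDog = new Node().addArc('s', 5, true, end);
    Node afterDo = new Node().addArc('g', 0, true, afterDog);
    Node afterD = new Node().addArc('o', 0, false, afterDo);
    Node afterCa = new Node().addArc('t', 0, true, end);
    Node afterC = new Node().addArc('a', 0, false, afterCa);
    return new Node().addArc('c', 5, false, afterC).addArc('d', 7, false, afterD);
  }
}
```

In this toy, a key is accepted only if the last arc followed is marked final, which is why a prefix such as "do" yields null even though every label matched.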
-
getBytesReader
Returns a FST.BytesReader for this FST, positioned at position 0.
-