org.apache.lucene.util.fst
Class FST<T>

java.lang.Object
  extended by org.apache.lucene.util.fst.FST<T>

public final class FST<T>
extends Object

Represents a finite state transducer (FST), stored in a compact byte[] format.

The format is similar to what's used by Morfologik (http://sourceforge.net/projects/morfologik).

See the package documentation for some simple examples.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
static class FST.Arc<T>
          Represents a single arc.
static class FST.BytesReader
          Reads bytes stored in an FST.
static class FST.INPUT_TYPE
          Specifies allowed range of each int input label for this FST.
 
Field Summary
 long arcCount
           
 long arcWithOutputCount
           
static int DEFAULT_MAX_BLOCK_BITS
           
static int END_LABEL
          If an arc has this label, then that arc is final/accepted.
 FST.INPUT_TYPE inputType
           
 long nodeCount
           
 Outputs<T> outputs
           
 
Constructor Summary
FST(DataInput in, Outputs<T> outputs)
          Load a previously saved FST.
FST(DataInput in, Outputs<T> outputs, int maxBlockBits)
          Load a previously saved FST; maxBlockBits allows you to control the size of the byte[] pages used to hold the FST bytes.
 
Method Summary
 FST.Arc<T> findTargetArc(int labelToMatch, FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in)
          Finds the arc labeled labelToMatch that leaves the target node of follow, writing it into arc in place.
 long getArcCount()
           
 long getArcWithOutputCount()
           
 FST.BytesReader getBytesReader()
          Returns an FST.BytesReader for this FST, positioned at position 0.
 T getEmptyOutput()
           
 FST.Arc<T> getFirstArc(FST.Arc<T> arc)
          Fills the virtual 'start' arc, i.e., an empty incoming arc to the FST's start node.
 FST.INPUT_TYPE getInputType()
           
 long getNodeCount()
           
static <T> FST<T> read(File file, Outputs<T> outputs)
          Reads an automaton from a file.
 FST.Arc<T> readFirstRealTargetArc(long node, FST.Arc<T> arc, FST.BytesReader in)
           
 FST.Arc<T> readFirstTargetArc(FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in)
          Follows the follow arc and reads the first arc of its target; this changes the provided arc (2nd arg) in place and returns it.
 FST.Arc<T> readLastTargetArc(FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in)
          Follows the follow arc and reads the last arc of its target; this changes the provided arc (2nd arg) in-place and returns it.
 FST.Arc<T> readNextArc(FST.Arc<T> arc, FST.BytesReader in)
          In-place read; returns the arc.
 int readNextArcLabel(FST.Arc<T> arc, FST.BytesReader in)
          Peeks at next arc's label; does not alter arc.
 FST.Arc<T> readNextRealArc(FST.Arc<T> arc, FST.BytesReader in)
          Never returns null, but you should never call this if arc.isLast() is true.
 void readRootArcs(FST.Arc<T>[] arcs)
           
 void save(DataOutput out)
           
 void save(File file)
          Writes an automaton to a file.
 long sizeInBytes()
          Returns the number of bytes used to represent the FST.
static <T> boolean targetHasArcs(FST.Arc<T> arc)
          Returns true if the node that this arc points to has any outgoing arcs.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

inputType

public final FST.INPUT_TYPE inputType

outputs

public final Outputs<T> outputs

nodeCount

public long nodeCount

arcCount

public long arcCount

arcWithOutputCount

public long arcWithOutputCount

END_LABEL

public static final int END_LABEL
If an arc has this label, then that arc is final/accepted.

See Also:
Constant Field Values

DEFAULT_MAX_BLOCK_BITS

public static final int DEFAULT_MAX_BLOCK_BITS

Constructor Detail

FST

public FST(DataInput in,
           Outputs<T> outputs)
    throws IOException
Load a previously saved FST.

Throws:
IOException

FST

public FST(DataInput in,
           Outputs<T> outputs,
           int maxBlockBits)
    throws IOException
Load a previously saved FST; maxBlockBits allows you to control the size of the byte[] pages used to hold the FST bytes.
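
To make the effect of maxBlockBits concrete, here is a minimal sketch of the arithmetic. It assumes each byte[] page holds up to 2^maxBlockBits bytes and that valid values lie in 1..30; neither detail is stated on this page, so treat both as assumptions. FstPageMath is a hypothetical helper and is not part of Lucene.

```java
// Sketch only: FstPageMath is a hypothetical helper, NOT a Lucene class.
final class FstPageMath {

    /** Assumed page size in bytes: 2^maxBlockBits. The 1..30 range is an assumption. */
    static long pageSize(int maxBlockBits) {
        if (maxBlockBits < 1 || maxBlockBits > 30) {
            throw new IllegalArgumentException(
                "maxBlockBits must be in [1, 30], got " + maxBlockBits);
        }
        return 1L << maxBlockBits;
    }

    /** Number of pages needed to hold totalBytes, rounding up. */
    static long pageCount(long totalBytes, int maxBlockBits) {
        long size = pageSize(maxBlockBits);
        return (totalBytes + size - 1) / size;
    }

    public static void main(String[] args) {
        long fstBytes = 5L * 1024 * 1024; // a hypothetical 5 MiB FST
        for (int bits : new int[] {16, 20, 30}) {
            System.out.println("maxBlockBits=" + bits
                + " pageSize=" + pageSize(bits)
                + " pages=" + pageCount(fstBytes, bits));
        }
    }
}
```

Under these assumptions, a smaller maxBlockBits spreads the FST over many small arrays, while a larger value asks for fewer but larger contiguous arrays.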

Throws:
IOException

Method Detail

getInputType

public FST.INPUT_TYPE getInputType()

sizeInBytes

public long sizeInBytes()
Returns the number of bytes used to represent the FST.


readRootArcs

public void readRootArcs(FST.Arc<T>[] arcs)
                  throws IOException
Throws:
IOException

getEmptyOutput

public T getEmptyOutput()

save

public void save(DataOutput out)
          throws IOException
Throws:
IOException

save

public void save(File file)
          throws IOException
Writes an automaton to a file.

Throws:
IOException

read

public static <T> FST<T> read(File file,
                              Outputs<T> outputs)
                   throws IOException
Reads an automaton from a file.

Throws:
IOException

targetHasArcs

public static <T> boolean targetHasArcs(FST.Arc<T> arc)
Returns true if the node that this arc points to has any outgoing arcs.


getFirstArc

public FST.Arc<T> getFirstArc(FST.Arc<T> arc)
Fills the virtual 'start' arc, i.e., an empty incoming arc to the FST's start node.


readLastTargetArc

public FST.Arc<T> readLastTargetArc(FST.Arc<T> follow,
                                    FST.Arc<T> arc,
                                    FST.BytesReader in)
                             throws IOException
Follows the follow arc and reads the last arc of its target; this changes the provided arc (2nd arg) in-place and returns it.

Returns:
The second argument (arc).
Throws:
IOException

readFirstTargetArc

public FST.Arc<T> readFirstTargetArc(FST.Arc<T> follow,
                                     FST.Arc<T> arc,
                                     FST.BytesReader in)
                              throws IOException
Follows the follow arc and reads the first arc of its target; this changes the provided arc (2nd arg) in place and returns it.

Returns:
The second argument (arc).
Throws:
IOException

readFirstRealTargetArc

public FST.Arc<T> readFirstRealTargetArc(long node,
                                         FST.Arc<T> arc,
                                         FST.BytesReader in)
                                  throws IOException
Throws:
IOException

readNextArc

public FST.Arc<T> readNextArc(FST.Arc<T> arc,
                              FST.BytesReader in)
                       throws IOException
In-place read; returns the arc.
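
The read methods reuse one arc object, and the neighbouring methods warn against advancing past an arc for which isLast() is true. The toy model below sketches the resulting loop shape: read the first arc of a node, visit it, stop when it is the last one, otherwise advance in place. SketchArc and SketchFst are hypothetical stand-ins written for this illustration; they are not Lucene classes, they omit the FST.BytesReader argument, and they do not model END_LABEL handling.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for illustration only; these are NOT Lucene classes.
final class SketchArc {
    int label;    // input label of this arc
    int target;   // node this arc points to
    int node;     // node this arc leaves (toy bookkeeping)
    int index;    // position among that node's arcs (toy bookkeeping)
    boolean last; // true if no further sibling arc follows

    boolean isLast() { return last; }
}

final class SketchFst {
    // labels[n][i] and targets[n][i] describe the i-th outgoing arc of node n.
    private final int[][] labels;
    private final int[][] targets;

    SketchFst(int[][] labels, int[][] targets) {
        this.labels = labels;
        this.targets = targets;
    }

    // Overwrites arc with the index-th arc of node and returns the same instance.
    private SketchArc fill(SketchArc arc, int node, int index) {
        arc.node = node;
        arc.index = index;
        arc.label = labels[node][index];
        arc.target = targets[node][index];
        arc.last = index == labels[node].length - 1;
        return arc;
    }

    /** Reads the first arc of follow's target into arc (in place) and returns arc. */
    SketchArc readFirstTargetArc(SketchArc follow, SketchArc arc) {
        return fill(arc, follow.target, 0);
    }

    /** Advances arc to its next sibling in place; must not be called on the last arc. */
    SketchArc readNextArc(SketchArc arc) {
        if (arc.isLast()) {
            throw new IllegalStateException("arc is already the last arc of its node");
        }
        return fill(arc, arc.node, arc.index + 1);
    }

    /** Collects the labels of all arcs leaving the given node. */
    List<Integer> labelsLeaving(int node) {
        List<Integer> out = new ArrayList<>();
        if (labels[node].length == 0) {
            return out; // nothing to read; guard before reading the first arc
        }
        SketchArc follow = new SketchArc();
        follow.target = node;
        SketchArc arc = readFirstTargetArc(follow, new SketchArc());
        while (true) {
            out.add(arc.label);
            if (arc.isLast()) {
                break; // check isLast() before advancing
            }
            readNextArc(arc);
        }
        return out;
    }
}
```

Because the arc is overwritten on every step, any value needed later (a label, a target) has to be copied out before the next read.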

Throws:
IOException

readNextArcLabel

public int readNextArcLabel(FST.Arc<T> arc,
                            FST.BytesReader in)
                     throws IOException
Peeks at next arc's label; does not alter arc. Do not call this if arc.isLast()!

Throws:
IOException

readNextRealArc

public FST.Arc<T> readNextRealArc(FST.Arc<T> arc,
                                  FST.BytesReader in)
                           throws IOException
Never returns null, but you should never call this if arc.isLast() is true.

Throws:
IOException

findTargetArc

public FST.Arc<T> findTargetArc(int labelToMatch,
                                FST.Arc<T> follow,
                                FST.Arc<T> arc,
                                FST.BytesReader in)
                         throws IOException
Finds the arc labeled labelToMatch that leaves the target node of the follow arc, writing it into arc in place. Returns null if no such arc exists; otherwise returns arc.
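
The contract above (null when no arc matches, otherwise the same arc object overwritten in place) leads to a simple lookup loop: start from the virtual start arc, follow one label at a time, and accumulate outputs. The sketch below illustrates that loop on a toy model. ToyArc, ToyFst and ToyFstDemo are hypothetical stand-ins written for this example; they are not Lucene classes, they use long outputs combined by addition, and they omit the FST.BytesReader argument.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-ins for illustration only; these are NOT Lucene classes.
final class ToyArc {
    int label;       // input label of this arc
    long output;     // output carried by this arc
    int target;      // id of the node this arc points to
    boolean isFinal; // true if the input may end after this arc
}

final class ToyFst {
    // node id -> (label -> {target, output, final flag})
    private final Map<Integer, Map<Integer, long[]>> nodes = new HashMap<>();

    void addArc(int from, int label, int to, long output, boolean isFinal) {
        nodes.computeIfAbsent(from, k -> new HashMap<>())
             .put(label, new long[] {to, output, isFinal ? 1 : 0});
    }

    /** Fills arc as a virtual start arc pointing at node 0 and returns it. */
    ToyArc getFirstArc(ToyArc arc) {
        arc.label = -1;
        arc.output = 0;
        arc.target = 0;
        arc.isFinal = false;
        return arc;
    }

    /** Returns null if no arc matches; otherwise overwrites arc in place and returns it. */
    ToyArc findTargetArc(int labelToMatch, ToyArc follow, ToyArc arc) {
        Map<Integer, long[]> out = nodes.get(follow.target); // read follow before writing arc
        long[] hit = (out == null) ? null : out.get(labelToMatch);
        if (hit == null) {
            return null;
        }
        arc.label = labelToMatch;
        arc.target = (int) hit[0];
        arc.output = hit[1];
        arc.isFinal = hit[2] == 1;
        return arc;
    }

    /** Sums the outputs along the path spelled by input; null if input is not accepted. */
    Long lookup(String input) {
        ToyArc arc = getFirstArc(new ToyArc());
        long sum = 0;
        for (int i = 0; i < input.length(); i++) {
            // The same instance serves as both follow and arc, so no allocation per step.
            if (findTargetArc(input.charAt(i), arc, arc) == null) {
                return null;
            }
            sum += arc.output;
        }
        return arc.isFinal ? sum : null;
    }
}

final class ToyFstDemo {
    /** Builds a toy FST mapping "cat" to 5 and "dog" to 7. */
    static ToyFst build() {
        ToyFst fst = new ToyFst();
        fst.addArc(0, 'c', 1, 5, false);
        fst.addArc(1, 'a', 2, 0, false);
        fst.addArc(2, 't', 3, 0, true);
        fst.addArc(0, 'd', 4, 7, false);
        fst.addArc(4, 'o', 5, 0, false);
        fst.addArc(5, 'g', 3, 0, true);
        return fst;
    }

    public static void main(String[] args) {
        ToyFst fst = build();
        for (String s : new String[] {"cat", "dog", "ca", "cow"}) {
            System.out.println(s + " -> " + fst.lookup(s));
        }
    }
}
```

Passing the same object as follow and arc only works when the implementation reads everything it needs from follow before writing to arc, as the toy does here.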

Throws:
IOException

getNodeCount

public long getNodeCount()

getArcCount

public long getArcCount()

getArcWithOutputCount

public long getArcWithOutputCount()

getBytesReader

public FST.BytesReader getBytesReader()
Returns an FST.BytesReader for this FST, positioned at position 0.



Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.