Class FSTCompiler<T>

  • public class FSTCompiler<T>
    extends Object
    Builds a minimal FST (mapping an IntsRef term to an arbitrary output) from pre-sorted terms with outputs. The FST becomes an FSA if you use NoOutputs. The FST is written on the fly into a compact serialized byte array, which can be saved to or loaded from a Directory, or used directly for traversal. The FST is always finite (no cycles).
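
    Because the terms must arrive pre-sorted, callers usually need to order their input before adding it. The following is a minimal, standard-library-only sketch of that preparation step. It does not use the Lucene API. The class SortedTermsSketch and its methods are hypothetical, and it assumes each term is represented as one int per Unicode code point and that duplicate terms should be dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: orders terms as int sequences
// before they are handed to an FST builder that expects sorted input.
final class SortedTermsSketch {

    // Converts a string to its Unicode code points, one int per code point.
    static int[] toCodePoints(String term) {
        return term.codePoints().toArray();
    }

    // Returns the terms as int sequences in lexicographic order, without duplicates.
    // Dropping duplicates is an assumption made to keep the sketch simple.
    static List<int[]> sortedUnique(List<String> terms) {
        List<int[]> all = new ArrayList<>();
        for (String term : terms) {
            all.add(toCodePoints(term));
        }
        // Compare element by element; a proper prefix sorts before the longer sequence.
        all.sort((a, b) -> Arrays.compare(a, b));

        List<int[]> unique = new ArrayList<>();
        for (int[] current : all) {
            if (unique.isEmpty() || !Arrays.equals(unique.get(unique.size() - 1), current)) {
                unique.add(current);
            }
        }
        return unique;
    }

    // Checks that every sequence is strictly greater than the one before it.
    static boolean isStrictlySorted(List<int[]> terms) {
        for (int i = 1; i < terms.size(); i++) {
            if (Arrays.compare(terms.get(i - 1), terms.get(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<int[]> terms = sortedUnique(Arrays.asList("dog", "cat", "cat", "car"));
        for (int[] term : terms) {
            System.out.println(new String(term, 0, term.length));
        }
        System.out.println("strictly sorted: " + isStrictlySorted(terms));
    }
}
```

    Sorting by code point rather than with String.compareTo matters for supplementary characters, whose UTF-16 surrogate pairs compare lower than some characters in the Basic Multilingual Plane.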

    NOTE: The algorithm is described at

    The parameterized type T is the output type. See the subclasses of Outputs.

    FSTs larger than 2.1GB are possible as of Lucene 4.2. FSTs containing more than 2.1B nodes are also possible; however, they cannot be packed.

    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Method Detail

      • getDirectAddressingMaxOversizingFactor

        public float getDirectAddressingMaxOversizingFactor()
      • getNodeCount

        public long getNodeCount()
      • getArcCount

        public long getArcCount()
      • getMappedStateCount

        public long getMappedStateCount()
      • compile

        public FST<T> compile()
                       throws IOException
        Returns the final FST. NOTE: this returns null if the FST accepts nothing.
      • fstRamBytesUsed

        public long fstRamBytesUsed()
      • fstSizeInBytes

        public long fstSizeInBytes()