Class FSTCompiler.Builder<T>

  • Enclosing class:
    FSTCompiler<T>

    public static class FSTCompiler.Builder<T>
    extends Object
    Fluent-style builder for constructing an FSTCompiler.

    Creates an FST/FSA builder with all the possible tuning and construction tweaks. Read parameter documentation carefully.

    • Method Detail

      • suffixRAMLimitMB

        public FSTCompiler.Builder<T> suffixRAMLimitMB(double mb)
        The approximate maximum amount of RAM (in MB) to use holding the suffix cache, which enables the FST to share common suffixes. Pass Double.POSITIVE_INFINITY to keep all suffixes and create an exactly minimal FST. In this case, the amount of RAM actually used will be bounded by the number of unique suffixes. If you pass a value smaller than the builder would use, the least recently used suffixes will be discarded, thus reducing suffix sharing and creating a non-minimal FST. In this case, the larger the limit, the closer the FST will be to its true minimal size, with diminishing returns as you increase the limit. Pass 0 to disable suffix sharing entirely, but note that the resulting FST can be substantially larger than the minimal FST.

        Note that this is not a precise limit. The current implementation uses hash tables to map the suffixes, and only roughly estimates the overhead (unused slots) in the hash table.

        Default = 32.0 MB.
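
To make the effect of the limit concrete, the following is a minimal sketch of a bounded, least-recently-used suffix cache. It is not Lucene's implementation: the class and method names are hypothetical, suffixes are modeled as plain strings, and the limit is expressed as a number of entries rather than megabytes. It only shows how a smaller limit evicts suffixes and so reduces sharing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; this is not Lucene's suffix cache. */
class SuffixCacheSketch {

  /**
   * Replays a stream of node suffixes through an LRU cache holding at most
   * maxEntries suffixes and returns how many were found in the cache (shared).
   */
  static int countSharedSuffixes(List<String> suffixes, final int maxEntries) {
    // accessOrder = true keeps entries ordered from least to most recently used.
    Map<String, Boolean> cache =
        new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
            // Evict the least recently used suffix once the limit is exceeded.
            return size() > maxEntries;
          }
        };
    int shared = 0;
    for (String suffix : suffixes) {
      if (cache.get(suffix) != null) {
        shared++; // Suffix is still cached, so the new node can reuse it.
      } else {
        cache.put(suffix, Boolean.TRUE); // Not cached: stored as a new suffix.
      }
    }
    return shared;
  }

  public static void main(String[] args) {
    List<String> suffixes = Arrays.asList("ing", "ed", "ing", "ly", "ed", "ing");
    System.out.println(countSharedSuffixes(suffixes, Integer.MAX_VALUE)); // 3
    System.out.println(countSharedSuffixes(suffixes, 2)); // 1
    System.out.println(countSharedSuffixes(suffixes, 0)); // 0
  }
}
```

In this toy stream, an unbounded cache shares every repeated suffix, a cache of two entries loses most of the sharing to eviction, and a limit of zero disables sharing, mirroring the three cases described above.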

      • allowFixedLengthArcs

        public FSTCompiler.Builder<T> allowFixedLengthArcs(boolean allowFixedLengthArcs)
        Pass false to disable the fixed length arc optimization (binary search or direct addressing) while building the FST; this will make the resulting FST smaller but slower to traverse.

        Default = true.
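
The size versus speed trade-off can be sketched with a simplified model. The class name, method names, and byte lengths below are hypothetical and do not reflect Lucene's on-disk encoding: arcs stored at their own length are compact but must be scanned in order, while arcs padded to a common length cost more bytes but allow random access and therefore binary search.

```java
import java.util.Arrays;

/** Hypothetical, simplified model of arc storage; not Lucene's encoding. */
class FixedLengthArcsSketch {

  /** Bytes used when each arc is stored at its own encoded length. */
  static int variableLengthBytes(int[] arcLengths) {
    int total = 0;
    for (int length : arcLengths) {
      total += length;
    }
    return total;
  }

  /** Bytes used when every arc is padded to the longest arc of the node. */
  static int fixedLengthBytes(int[] arcLengths) {
    int max = 0;
    for (int length : arcLengths) {
      max = Math.max(max, length);
    }
    // With padding, arc i starts at offset i * max, enabling random access.
    return arcLengths.length * max;
  }

  /** Finds an arc by label with binary search; returns its index or -1. */
  static int findArc(int[] sortedLabels, int label) {
    int index = Arrays.binarySearch(sortedLabels, label);
    return index >= 0 ? index : -1;
  }

  public static void main(String[] args) {
    int[] arcLengths = {2, 5, 3}; // assumed encoded sizes of three arcs
    System.out.println(variableLengthBytes(arcLengths)); // 10
    System.out.println(fixedLengthBytes(arcLengths)); // 15
    System.out.println(findArc(new int[] {'a', 'c', 'x'}, 'c')); // 1
  }
}
```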

      • directAddressingMaxOversizingFactor

        public FSTCompiler.Builder<T> directAddressingMaxOversizingFactor(float factor)
        Overrides the default maximum oversizing of the fixed array that is allowed in order to use direct addressing of arcs instead of binary search.

        Setting this factor to a negative value (e.g. -1) effectively disables direct addressing; only binary search nodes will be created.

        This factor does not determine whether to encode a node with a list of variable length arcs or with fixed length arcs. It only determines the effective encoding of a node that is already known to be encoded with fixed length arcs.

        Default = 1.
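
As a rough illustration of how such a factor can act, the sketch below applies a deliberately simplified rule. It assumes that direct addressing reserves one fixed-size slot per label in the node's label range and that binary search stores one slot per arc; Lucene's real encoding and heuristic differ and take more inputs into account. All names here are hypothetical.

```java
/** Hypothetical, simplified model; Lucene's actual heuristic uses more inputs. */
class DirectAddressingSketch {

  /**
   * Returns true if, under the simplified rule, a node with fixed length arcs
   * would use direct addressing: the direct-addressing array may be at most
   * factor times the size of the binary search array.
   */
  static boolean useDirectAddressing(
      int numArcs, int labelRange, int bytesPerArc, float factor) {
    // Binary search: one slot per existing arc.
    long binarySearchBytes = (long) numArcs * bytesPerArc;
    // Direct addressing (assumed model): one slot per label in the range.
    long directAddressingBytes = (long) labelRange * bytesPerArc;
    // A negative factor makes the allowed size negative, so this is never true.
    return directAddressingBytes <= binarySearchBytes * factor;
  }

  public static void main(String[] args) {
    // Dense node: 4 arcs over a label range of 4.
    System.out.println(useDirectAddressing(4, 4, 3, 1f)); // true
    // Sparse node: 4 arcs over a label range of 10.
    System.out.println(useDirectAddressing(4, 10, 3, 1f)); // false
    System.out.println(useDirectAddressing(4, 10, 3, 3f)); // true
    // Negative factor disables direct addressing even for dense nodes.
    System.out.println(useDirectAddressing(4, 4, 3, -1f)); // false
  }
}
```

Under this model, a factor of 1 permits direct addressing only when it is no larger than binary search, larger factors trade space for faster lookups on sparser nodes, and a negative factor rules it out entirely.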

      • setVersion

        public FSTCompiler.Builder<T> setVersion(int version)
        Expert: Set the codec version.