Class FSTCompiler.Builder<T>

java.lang.Object
org.apache.lucene.util.fst.FSTCompiler.Builder<T>
Enclosing class:
FSTCompiler<T>

public static class FSTCompiler.Builder<T> extends Object
Fluent-style constructor for FSTCompiler.

Creates an FST/FSA builder with all the possible tuning and construction tweaks. Read parameter documentation carefully.

  • Constructor Details

  • Method Details

    • minSuffixCount1

      public FSTCompiler.Builder<T> minSuffixCount1(int minSuffixCount1)
      If pruning the input graph during construction, this threshold determines whether a node is kept or pruned. If transition_count(node) >= minSuffixCount1, the node is kept.

      Default = 0.

    • minSuffixCount2

      public FSTCompiler.Builder<T> minSuffixCount2(int minSuffixCount2)
      Better pruning: a node (and all following nodes) is pruned if fewer than this number of terms go through the prior node.

      Default = 0.

    • shouldShareSuffix

      public FSTCompiler.Builder<T> shouldShareSuffix(boolean shouldShareSuffix)
      If true, the shared suffixes will be compacted into unique paths. This requires an additional RAM-intensive hash map for lookups in memory. Setting this parameter to false creates a single suffix path for all input sequences. This will result in a larger FST, but requires substantially less memory and CPU during building.

      Default = true.
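
      To make the size trade-off concrete, the following is a minimal, self-contained sketch of suffix sharing on a toy trie. It is not how FSTCompiler builds an FST; the class SuffixSharingSketch and all of its methods are hypothetical names introduced only for illustration. It counts nodes when every suffix keeps its own path, then counts them again after a hash map is used to merge structurally identical subtrees.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Lucene.
class SuffixSharingSketch {
    /** A plain trie node: sorted outgoing arcs plus an end-of-input flag. */
    static final class Node {
        final TreeMap<Character, Node> arcs = new TreeMap<>();
        boolean isFinal;
    }

    /** Builds an ordinary trie, where no suffixes are shared. */
    static Node buildTrie(List<String> words) {
        Node root = new Node();
        for (String word : words) {
            Node node = root;
            for (char c : word.toCharArray()) {
                node = node.arcs.computeIfAbsent(c, k -> new Node());
            }
            node.isFinal = true;
        }
        return root;
    }

    /** Node count when every suffix keeps its own path. */
    static int countUnshared(Node node) {
        int total = 1;
        for (Node child : node.arcs.values()) {
            total += countUnshared(child);
        }
        return total;
    }

    /** Gives each structurally distinct subtree one id, using the hash map as a registry. */
    static int canonicalId(Node node, Map<String, Integer> registry) {
        StringBuilder signature = new StringBuilder(node.isFinal ? "F" : "N");
        for (Map.Entry<Character, Node> arc : node.arcs.entrySet()) {
            signature.append(arc.getKey())
                     .append(canonicalId(arc.getValue(), registry))
                     .append(';');
        }
        String key = signature.toString();
        Integer existing = registry.get(key);
        if (existing != null) {
            return existing; // identical suffix already seen: reuse it
        }
        int id = registry.size();
        registry.put(key, id);
        return id;
    }

    /** Node count after identical suffixes are merged. */
    static int countShared(Node root) {
        Map<String, Integer> registry = new HashMap<>();
        canonicalId(root, registry);
        return registry.size();
    }
}
```

      For the inputs "bat", "cat" and "rat", the unshared trie has 10 nodes, while merging the common "at" suffix leaves 4. The registry map is the extra memory that sharing costs during construction.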

    • shouldShareNonSingletonNodes

      public FSTCompiler.Builder<T> shouldShareNonSingletonNodes(boolean shouldShareNonSingletonNodes)
      Only used if shouldShareSuffix is true. Set this to true to ensure the FST is fully minimal, at the cost of more CPU and more RAM during building.

      Default = true.

    • shareMaxTailLength

      public FSTCompiler.Builder<T> shareMaxTailLength(int shareMaxTailLength)
      Only used if shouldShareSuffix is true. Set this to Integer.MAX_VALUE to ensure the FST is fully minimal, at the cost of more CPU and more RAM during building.

      Default = Integer.MAX_VALUE.

    • allowFixedLengthArcs

      public FSTCompiler.Builder<T> allowFixedLengthArcs(boolean allowFixedLengthArcs)
      Pass false to disable the fixed length arc optimization (binary search or direct addressing) while building the FST; this will make the resulting FST smaller but slower to traverse.

      Default = true.
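
      The size/speed trade-off comes from how an arc is located within a node. The sketch below is a conceptual illustration over plain int arrays, not Lucene's byte-level encoding; ArcLookupSketch and its methods are hypothetical names. It contrasts a sequential scan (what variable-length arcs require) with the two fixed-length strategies named above.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Lucene.
class ArcLookupSketch {
    /** Variable-length arcs: arcs can only be visited in order. */
    static int linearScan(int[] sortedLabels, int target) {
        for (int i = 0; i < sortedLabels.length; i++) {
            if (sortedLabels[i] == target) {
                return i;
            }
            if (sortedLabels[i] > target) {
                break; // labels are sorted, so the target cannot appear later
            }
        }
        return -1;
    }

    /** Fixed-length arcs: arc i sits at a computable offset, so binary search works. */
    static int binarySearch(int[] sortedLabels, int target) {
        int index = Arrays.binarySearch(sortedLabels, target);
        return index >= 0 ? index : -1;
    }

    /** Direct addressing: one slot per label in [firstLabel, lastLabel], used or not. */
    static boolean[] presenceTable(int[] sortedLabels) {
        int first = sortedLabels[0];
        int last = sortedLabels[sortedLabels.length - 1];
        boolean[] present = new boolean[last - first + 1];
        for (int label : sortedLabels) {
            present[label - first] = true;
        }
        return present;
    }

    /** Direct addressing lookup: a single index computation, no search. */
    static boolean directLookup(boolean[] present, int firstLabel, int target) {
        int slot = target - firstLabel;
        return slot >= 0 && slot < present.length && present[slot];
    }
}
```

      With labels {97, 99, 100, 120}, the direct-addressing table needs 24 slots for 4 arcs, which shows why fixed-length encodings can cost space in exchange for faster traversal.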

    • bytesPageBits

      public FSTCompiler.Builder<T> bytesPageBits(int bytesPageBits)
      How many bits wide to make each byte[] block in the BytesStore; if you know the FST will be large then make this larger. For example 15 bits = 32768 byte pages.

      Default = 15.
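
      The relationship between the bit width and the page size is a power of two. The following small sketch works through the arithmetic; PageBitsSketch and its methods are hypothetical helpers, and the 1 to 30 range check is an assumption made only to keep the int shift positive, not a documented limit of this parameter.

```java
// Hypothetical illustration only; not part of Lucene.
class PageBitsSketch {
    /** Page size in bytes for a given bit width: 2^bytesPageBits. */
    static int pageSizeBytes(int bytesPageBits) {
        // Assumed bounds for this sketch, so that the shift stays a positive int.
        if (bytesPageBits < 1 || bytesPageBits > 30) {
            throw new IllegalArgumentException("bytesPageBits out of range: " + bytesPageBits);
        }
        return 1 << bytesPageBits;
    }

    /** Number of pages needed to hold totalBytes, rounding up. */
    static long pagesNeeded(long totalBytes, int bytesPageBits) {
        long pageSize = pageSizeBytes(bytesPageBits);
        return (totalBytes + pageSize - 1) / pageSize;
    }
}
```

      Under these assumptions, 100,000 bytes occupy 4 pages at 15 bits (32768 bytes each) but fit in a single page at 20 bits (1048576 bytes), so a larger value means fewer, bigger blocks.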

    • directAddressingMaxOversizingFactor

      public FSTCompiler.Builder<T> directAddressingMaxOversizingFactor(float factor)
      Overrides the default maximum oversizing of the fixed array allowed to enable direct addressing of arcs instead of binary search.

      Setting this factor to a negative value (e.g. -1) effectively disables direct addressing; only binary search nodes will be created.

      This factor does not determine whether to encode a node with a list of variable length arcs or with fixed length arcs. It only determines the effective encoding of a node that is already known to be encoded with fixed length arcs.

      Default = 1.
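
      One way to read this factor is as a multiplier on the size a node would have under binary search. The sketch below is a simplified model under that assumption; OversizingSketch and allowsDirectAddressing are hypothetical names, and the sketch omits any additional bookkeeping the real compiler may perform when making this choice.

```java
// Hypothetical, simplified model; not Lucene's actual decision logic.
class OversizingSketch {
    /**
     * Returns true if the direct-addressing encoding fits within the allowed
     * oversize relative to the binary-search encoding.
     */
    static boolean allowsDirectAddressing(
            int sizeForBinarySearch, int sizeForDirectAddressing, float factor) {
        // Assumption: the factor scales the binary-search size to give the budget.
        int allowedSize = (int) (sizeForBinarySearch * factor);
        return sizeForDirectAddressing <= allowedSize;
    }
}
```

      In this model a factor of 1 admits direct addressing only when it is no larger than binary search, a factor of 2 tolerates up to twice the size, and a negative factor yields a negative budget that no node can satisfy.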

    • build

      public FSTCompiler<T> build()
      Creates a new FSTCompiler.