Class BKDWriter

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class BKDWriter
    extends Object
    implements Closeable
    Recursively builds a block KD-tree that assigns all incoming points in N-dimensional space to smaller and smaller N-dimensional rectangles (cells) until the number of points in a given rectangle is <= config.maxPointsInLeafNode. The tree is partially balanced: every leaf node holds the requested config.maxPointsInLeafNode points, except one that might hold fewer. Leaf nodes may straddle the two bottom levels of the binary tree. Values that fall exactly on a cell boundary may be in either cell.
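
    The recursive subdivision can be illustrated with a minimal, self-contained sketch. It is not BKDWriter's actual code: the class name KdPartitionSketch, the method build, the use of int[] points held in heap (rather than packed byte[] values), and the "widest dimension, median split" rule are all assumptions chosen for illustration. A plain median split also does not reproduce the "every leaf full except one" layout described above; it only guarantees that no leaf exceeds the limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KdPartitionSketch {

    /**
     * Recursively splits points[from, to) into cells until each cell holds
     * at most maxPointsInLeaf points, then records the cell as a leaf.
     * Hypothetical helper for illustration; not part of the Lucene API.
     */
    public static void build(int[][] points, int from, int to,
                             int maxPointsInLeaf, List<int[][]> leaves) {
        int count = to - from;
        if (count <= maxPointsInLeaf) {
            leaves.add(Arrays.copyOfRange(points, from, to));
            return;
        }

        // Pick the dimension whose values span the widest range in this cell.
        int dims = points[from].length;
        int splitDim = 0;
        long widest = -1;
        for (int d = 0; d < dims; d++) {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int i = from; i < to; i++) {
                min = Math.min(min, points[i][d]);
                max = Math.max(max, points[i][d]);
            }
            long width = (long) max - min;
            if (width > widest) {
                widest = width;
                splitDim = d;
            }
        }

        // Order the cell along that dimension and cut it at the median.
        // Points equal to the split value may land on either side.
        final int dim = splitDim;
        Arrays.sort(points, from, to,
                (int[] a, int[] b) -> Integer.compare(a[dim], b[dim]));
        int mid = from + count / 2;
        build(points, from, mid, maxPointsInLeaf, leaves);
        build(points, mid, to, maxPointsInLeaf, leaves);
    }

    public static void main(String[] args) {
        int[][] points = new int[10][];
        for (int i = 0; i < points.length; i++) {
            points[i] = new int[] {i * 7 % 10, i * 3 % 10};
        }
        List<int[][]> leaves = new ArrayList<>();
        build(points, 0, points.length, 3, leaves);
        System.out.println("leaves: " + leaves.size());
    }
}
```

    Sorting each cell in full keeps the sketch short; it says nothing about how BKDWriter itself selects a split point.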

    The number of dimensions can be 1 to 8, and every byte[] value has a fixed length.

    This consumes heap during writing: it allocates a Long[numLeaves] and a byte[numLeaves*(1+config.bytesPerDim)], and then uses up to the specified maxMBSortInHeap of heap space for writing.
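
    As a rough illustration of the array sizes named above, the sketch below computes the number of leaves and the length of the byte[] array for a given point count. The class BkdHeapEstimateSketch and its methods are hypothetical helpers, not Lucene API. The leaf count assumes the layout described in the class overview (every leaf full except possibly one), so it is an estimate rather than a value read from BKDWriter.

```java
class BkdHeapEstimateSketch {

    /**
     * Estimated leaf count, assuming every leaf holds maxPointsInLeafNode
     * points except possibly one (ceiling division).
     */
    public static long numLeaves(long totalPoints, int maxPointsInLeafNode) {
        if (totalPoints < 0 || maxPointsInLeafNode <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return (totalPoints + maxPointsInLeafNode - 1) / maxPointsInLeafNode;
    }

    /** Length of the byte[numLeaves*(1+bytesPerDim)] array from the description. */
    public static long splitValuesLength(long numLeaves, int bytesPerDim) {
        if (numLeaves < 0 || bytesPerDim <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.multiplyExact(numLeaves, 1L + bytesPerDim);
    }

    public static void main(String[] args) {
        long leaves = numLeaves(1_000_000L, 512);
        System.out.println("Long[] slots: " + leaves);
        System.out.println("byte[] length: " + splitValuesLength(leaves, 4));
    }
}
```

    For example, 1,000,000 points with 512 points per leaf and 4 bytes per dimension give 1954 leaves and a byte[] of 9770 bytes under these assumptions. The footprint of the Long[] itself depends on the JVM and is not estimated here.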

    NOTE: This can write at most Integer.MAX_VALUE * config.maxPointsInLeafNode / config.bytesPerDim total points.
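
    The limit in the note can be evaluated directly. The sketch below is a hypothetical helper (BkdPointLimitSketch and maxTotalPoints are not Lucene API) that applies the formula as written, left to right, in long arithmetic so the intermediate product does not overflow an int.

```java
class BkdPointLimitSketch {

    /**
     * Evaluates Integer.MAX_VALUE * maxPointsInLeafNode / bytesPerDim
     * using 64-bit arithmetic. Both arguments must be positive ints,
     * so the product stays below Long.MAX_VALUE.
     */
    public static long maxTotalPoints(int maxPointsInLeafNode, int bytesPerDim) {
        if (maxPointsInLeafNode <= 0 || bytesPerDim <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return (long) Integer.MAX_VALUE * maxPointsInLeafNode / bytesPerDim;
    }

    public static void main(String[] args) {
        System.out.println(maxTotalPoints(512, 4));
    }
}
```

    With 512 points per leaf and 4 bytes per dimension (example values only), the formula yields 274,877,906,816 points.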

    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • VERSION_LEAF_STORES_BOUNDS

        public static final int VERSION_LEAF_STORES_BOUNDS
        See Also:
        Constant Field Values
      • VERSION_SELECTIVE_INDEXING

        public static final int VERSION_SELECTIVE_INDEXING
        See Also:
        Constant Field Values
      • VERSION_LOW_CARDINALITY_LEAVES

        public static final int VERSION_LOW_CARDINALITY_LEAVES
        See Also:
        Constant Field Values
      • DEFAULT_MAX_MB_SORT_IN_HEAP

        public static final float DEFAULT_MAX_MB_SORT_IN_HEAP
        Default maximum heap to use (in MB) before spilling to (slower) disk.
        See Also:
        Constant Field Values
      • config

        protected final BKDConfig config
        BKD tree configuration
      • minPackedValue

        protected final byte[] minPackedValue
        Minimum per-dim values, packed
      • maxPackedValue

        protected final byte[] maxPackedValue
        Maximum per-dim values, packed
      • pointCount

        protected long pointCount
    • Constructor Detail

      • BKDWriter

        public BKDWriter(int maxDoc,
                         Directory tempDir,
                         String tempFileNamePrefix,
                         BKDConfig config,
                         double maxMBSortInHeap,
                         long totalPointCount)