Class MSBRadixSorter

  • Direct Known Subclasses:
    StableMSBRadixSorter, StringSorter.MSBStringRadixSorter

    public abstract class MSBRadixSorter
    extends Sorter
    Radix sorter for variable-length strings. This class sorts based on the most significant byte first and falls back to IntroSorter when the size of the buckets to sort becomes small.

    This algorithm is NOT stable. Worst-case memory usage is about 2.3 KB.

    NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected MSBRadixSorter​(int maxLength)
      Sole constructor.
    • Method Summary

      All Methods Instance Methods Abstract Methods Concrete Methods 
      Modifier and Type Method Description
      protected void buildHistogram​(int prefixCommonBucket, int prefixCommonLen, int from, int to, int k, int[] histogram)
      Build an histogram of the k-th characters of values occurring between offsets from and to, using getBucket(int, int).
      protected abstract int byteAt​(int i, int k)
      Return the k-th byte of the entry at index i, or -1 if its length is less than or equal to k.
      protected int compare​(int i, int j)
      Compare entries found in slots i and j.
      protected int getBucket​(int i, int k)
      Return a number for the k-th character between 0 and HISTOGRAM_SIZE.
      protected Sorter getFallbackSorter​(int k)
      Get a fall-back sorter which may assume that the first k bytes of all compared strings are equal.
      protected void reorder​(int from, int to, int[] startOffsets, int[] endOffsets, int k)
      Reorder based on start/end offsets for each bucket.
      protected boolean shouldFallback​(int from, int to, int l)  
      void sort​(int from, int to)
      Sort the slice which starts at from (inclusive) and ends at to (exclusive).
      protected void sort​(int from, int to, int k, int l)  
    • Constructor Detail

      • MSBRadixSorter

        protected MSBRadixSorter​(int maxLength)
        Sole constructor.
        Parameters:
        maxLength - the maximum length of keys, pass Integer.MAX_VALUE if unknown.
    • Method Detail

      • byteAt

        protected abstract int byteAt​(int i,
                                      int k)
        Return the k-th byte of the entry at index i, or -1 if its length is less than or equal to k. This may only be called with a value of i between 0 included and maxLength excluded.
      • getFallbackSorter

        protected Sorter getFallbackSorter​(int k)
        Get a fall-back sorter which may assume that the first k bytes of all compared strings are equal.
      • sort

        public void sort​(int from,
                         int to)
        Description copied from class: Sorter
        Sort the slice which starts at from (inclusive) and ends at to (exclusive).
        Specified by:
        sort in class Sorter
      • sort

        protected void sort​(int from,
                            int to,
                            int k,
                            int l)
      • shouldFallback

        protected boolean shouldFallback​(int from,
                                         int to,
                                         int l)
      • getBucket

        protected int getBucket​(int i,
                                int k)
        Return a number for the k-th character between 0 and HISTOGRAM_SIZE.
      • buildHistogram

        protected void buildHistogram​(int prefixCommonBucket,
                                      int prefixCommonLen,
                                      int from,
                                      int to,
                                      int k,
                                      int[] histogram)
        Build an histogram of the k-th characters of values occurring between offsets from and to, using getBucket(int, int).
      • reorder

        protected void reorder​(int from,
                               int to,
                               int[] startOffsets,
                               int[] endOffsets,
                               int k)
        Reorder based on start/end offsets for each bucket. When this method returns, startOffsets and endOffsets are equal.
        Parameters:
        startOffsets - start offsets per bucket
        endOffsets - end offsets per bucket