Class Utility


  • public class Utility
    extends Object
    SmartChineseAnalyzer utility constants and methods
    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • STRING_CHAR_ARRAY

        public static final char[] STRING_CHAR_ARRAY
      • NUMBER_CHAR_ARRAY

        public static final char[] NUMBER_CHAR_ARRAY
      • START_CHAR_ARRAY

        public static final char[] START_CHAR_ARRAY
      • END_CHAR_ARRAY

        public static final char[] END_CHAR_ARRAY
      • COMMON_DELIMITER

        public static final char[] COMMON_DELIMITER
        Delimiters will be filtered to this character by SegTokenFilter.
      • SPACES

        public static final String SPACES
        Space-like characters that need to be skipped: such as space, tab, newline, carriage return.
        See Also:
        Constant Field Values
      • MAX_FREQUENCE

        public static final int MAX_FREQUENCE
        Maximum bigram frequency (used in the smoothing function).
        See Also:
        Constant Field Values
    • Constructor Detail

      • Utility

        public Utility()
    • Method Detail

      • compareArray

        public static int compareArray(char[] larray,
                                       int lstartIndex,
                                       char[] rarray,
                                       int rstartIndex)
        Compare two arrays, starting at the specified offsets.
        Parameters:
        larray - left array
        lstartIndex - start offset into larray
        rarray - right array
        rstartIndex - start offset into rarray
        Returns:
        0 if the arrays are equal, 1 if larray > rarray, -1 if larray < rarray
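        The return contract above can be illustrated with a minimal, self-contained sketch. This is not the Lucene source: the class name CompareArraySketch is hypothetical, the arrays are assumed to be non-null, and the sketch assumes that an array which runs out of characters first compares as the smaller one.

```java
// Hypothetical sketch of the documented compareArray contract (not the Lucene source).
class CompareArraySketch {

    // Assumes both arrays are non-null; an array that ends first is treated as smaller.
    static int compareArray(char[] larray, int lstartIndex, char[] rarray, int rstartIndex) {
        int li = lstartIndex;
        int ri = rstartIndex;
        // Advance past the run of characters that match in both arrays.
        while (li < larray.length && ri < rarray.length && larray[li] == rarray[ri]) {
            li++;
            ri++;
        }
        boolean leftDone = li >= larray.length;
        boolean rightDone = ri >= rarray.length;
        if (leftDone && rightDone) {
            return 0;   // both ended together: equal from the offsets onward
        }
        if (leftDone) {
            return -1;  // left ended first
        }
        if (rightDone) {
            return 1;   // right ended first
        }
        // First differing character decides the order.
        return larray[li] > rarray[ri] ? 1 : -1;
    }

    public static void main(String[] args) {
        char[] xabc = "xabc".toCharArray();
        char[] abc = "abc".toCharArray();
        char[] abd = "abd".toCharArray();
        System.out.println(compareArray(xabc, 1, abc, 0)); // 0: "abc" vs "abc"
        System.out.println(compareArray(abc, 0, abd, 0));  // -1: 'c' < 'd'
        System.out.println(compareArray(abd, 0, abc, 0));  // 1: 'd' > 'c'
    }
}
```

        The start offsets let a caller compare the tail of one array against another without copying, as in the first call, which skips the leading 'x'.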
      • compareArrayByPrefix

        public static int compareArrayByPrefix(char[] shortArray,
                                               int shortIndex,
                                               char[] longArray,
                                               int longIndex)
        Compare two arrays, starting at the specified offsets, but treating shortArray as a prefix of longArray. As long as shortArray is a prefix of longArray, return 0. Otherwise, behave as compareArray(char[], int, char[], int).
        Parameters:
        shortArray - prefix array
        shortIndex - offset into shortArray
        longArray - long array (word)
        longIndex - offset into longArray
        Returns:
        0 if shortArray is a prefix of longArray, otherwise the same result as compareArray(char[], int, char[], int)
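        The prefix rule can be illustrated with a minimal, self-contained sketch. This is not the Lucene source: the class name PrefixCompareSketch is hypothetical, the arrays are assumed to be non-null, and the non-prefix case follows the ordering described for compareArray(char[], int, char[], int).

```java
// Hypothetical sketch of the documented compareArrayByPrefix contract (not the Lucene source).
class PrefixCompareSketch {

    // Assumes both arrays are non-null.
    static int compareArrayByPrefix(char[] shortArray, int shortIndex, char[] longArray, int longIndex) {
        int si = shortIndex;
        int li = longIndex;
        // Advance past the run of characters that match in both arrays.
        while (si < shortArray.length && li < longArray.length && shortArray[si] == longArray[li]) {
            si++;
            li++;
        }
        if (si >= shortArray.length) {
            return 0;   // every remaining character of shortArray matched: it is a prefix
        }
        if (li >= longArray.length) {
            return 1;   // longArray ended first, so shortArray compares as larger
        }
        // First differing character decides the order.
        return shortArray[si] > longArray[li] ? 1 : -1;
    }

    public static void main(String[] args) {
        char[] word = "abc".toCharArray();
        System.out.println(compareArrayByPrefix("ab".toCharArray(), 0, word, 0));  // 0: "ab" is a prefix of "abc"
        System.out.println(compareArrayByPrefix("abb".toCharArray(), 0, word, 0)); // -1: 'b' < 'c'
        System.out.println(compareArrayByPrefix("abd".toCharArray(), 0, word, 0)); // 1: 'd' > 'c'
    }
}
```

        Because every prefix match returns 0, a comparison of this shape suits a binary search over a sorted word table when looking for any entry that starts with a given prefix.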
      • getCharType

        public static int getCharType(char ch)
        Return the internal CharType constant of a given character.
        Parameters:
        ch - input character
        Returns:
        constant from CharType describing the character type.
        See Also:
        CharType