Class ThaiTokenizer

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class ThaiTokenizer
    extends SegmentingTokenizerBase
    Tokenizer that uses BreakIterator to tokenize Thai text.

    WARNING: this tokenizer may not be supported by all JREs. It is known to work with Sun/Oracle and Harmony JREs. If your application needs to be fully portable, consider using ICUTokenizer instead, which uses an ICU Thai BreakIterator that will always be available.
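
To illustrate the kind of segmentation the description refers to, here is a minimal sketch that uses only the JDK's `java.text.BreakIterator`, not Lucene itself. The class name `BreakIteratorSegmentSketch` and the method `segment` are hypothetical names chosen for this example, and the letter-or-digit filter is an assumption about what counts as a token; the real tokenizer may apply different rules.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
public class BreakIteratorSegmentSketch {

    /**
     * Splits text at the word boundaries reported by the JRE's BreakIterator
     * and keeps only segments containing at least one letter or digit
     * (an assumed filter that drops whitespace and punctuation).
     */
    public static List<String> segment(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String candidate = text.substring(start, end);
            if (candidate.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(candidate);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The result for Thai depends on whether the JRE ships a
        // dictionary-based Thai BreakIterator, as the warning above notes.
        System.out.println(segment("ภาษาไทย", Locale.forLanguageTag("th")));
    }
}
```

On a JRE without Thai dictionary support, the Thai input would likely come back as a single unsplit segment, which is the portability concern the warning describes.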

        public static final boolean DBBI_AVAILABLE
        True if the JRE supports a working dictionary-based BreakIterator for Thai. If this is false, this tokenizer will not work at all!
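
One way an application could probe for a working dictionary-based Thai BreakIterator is sketched below using only JDK classes. This is an assumption about how such a check might look, not a statement of how `DBBI_AVAILABLE` is actually computed; the class `ThaiBreakIteratorProbe`, the method `hasThaiWordBreaks`, and the sample string are hypothetical choices for this example.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical probe for illustration only; not part of Lucene.
public class ThaiBreakIteratorProbe {

    // Two Thai words written without a space: the first is 4 chars long.
    static final String SAMPLE = "ภาษาไทย";

    /**
     * Returns true if the JRE's word BreakIterator reports a boundary
     * between the two words in SAMPLE (at char offset 4). Without a
     * dictionary, no boundary is expected inside a run of Thai letters.
     */
    public static boolean hasThaiWordBreaks() {
        BreakIterator words = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        words.setText(SAMPLE);
        return words.isBoundary(4);
    }

    public static void main(String[] args) {
        System.out.println("Thai word breaks available: " + hasThaiWordBreaks());
    }
}
```

An application could run a check like this at startup and fall back to ICUTokenizer when it returns false.
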
        public ThaiTokenizer()
        Creates a new ThaiTokenizer.
        public ThaiTokenizer(AttributeFactory factory)
        Creates a new ThaiTokenizer, supplying the AttributeFactory.