Class ThaiTokenizer

All Implemented Interfaces:
Closeable, AutoCloseable

public class ThaiTokenizer extends SegmentingTokenizerBase
Tokenizer that uses BreakIterator to tokenize Thai text.

WARNING: this tokenizer may not be supported by all JREs. It is known to work with Sun/Oracle and Harmony JREs. If your application needs to be fully portable, consider using ICUTokenizer instead, which uses an ICU Thai BreakIterator that will always be available.
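To illustrate the kind of segmentation the description refers to, here is a minimal sketch that uses only the JDK's java.text.BreakIterator and does not call ThaiTokenizer itself. The class name ThaiBreakSketch and the method segment are hypothetical, chosen for this example. The pieces returned depend on the JRE: one with a dictionary-based Thai iterator should split between words, while others may return the whole run as a single piece.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Lucene.
final class ThaiBreakSketch {

    /** Splits text at the word boundaries reported by the JRE's Thai BreakIterator. */
    public static List<String> segment(String text) {
        BreakIterator words = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        words.setText(text);

        List<String> pieces = new ArrayList<>();
        int start = words.first();
        // Walk consecutive boundaries; each [start, end) span is one piece.
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            pieces.add(text.substring(start, end));
        }
        return pieces;
    }

    public static void main(String[] args) {
        // "ภาษาไทย" ("Thai language"), written with Unicode escapes to avoid source-encoding issues.
        String sample = "\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22";
        System.out.println(segment(sample));
    }
}
```

Whatever the JRE reports, the pieces always concatenate back to the input, because the boundaries partition the text without gaps.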

  • Field Details

    • DBBI_AVAILABLE

      public static final boolean DBBI_AVAILABLE
      True if the JRE supports a working dictionary-based BreakIterator for Thai. If this is false, this tokenizer will not work at all!
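An application can run a similar check itself before choosing a tokenizer. The sketch below shows one plausible probe using only the JDK; it is an assumption for illustration and not necessarily how DBBI_AVAILABLE is computed. The class ThaiSupportProbe and the method hasDictionaryThaiBreaks are hypothetical names. The idea is that a dictionary-based iterator should report a boundary between the two words of a short Thai phrase, whereas a fallback iterator would treat the phrase as one unbroken run.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical probe for illustration; not part of Lucene.
final class ThaiSupportProbe {

    /**
     * Returns true if the JRE's Thai word iterator reports a boundary
     * after the first word of "ภาษาไทย" (offset 4, between "ภาษา" and "ไทย").
     */
    public static boolean hasDictionaryThaiBreaks() {
        BreakIterator words = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        // "ภาษาไทย" written with Unicode escapes; 7 chars, first word is 4 chars long.
        words.setText("\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22");
        return words.isBoundary(4);
    }

    public static void main(String[] args) {
        if (hasDictionaryThaiBreaks()) {
            System.out.println("JRE segments Thai words; a BreakIterator-based tokenizer can work.");
        } else {
            System.out.println("No Thai word segmentation in this JRE; consider an ICU-based tokenizer.");
        }
    }
}
```

In code that uses Lucene, reading ThaiTokenizer.DBBI_AVAILABLE directly is the simpler choice; a probe like this is only useful where Lucene is not on the classpath.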
  • Constructor Details

    • ThaiTokenizer

      public ThaiTokenizer()
      Creates a new ThaiTokenizer.
    • ThaiTokenizer

      public ThaiTokenizer(AttributeFactory factory)
      Creates a new ThaiTokenizer, supplying the AttributeFactory.
  • Method Details