Class WordBreakTestUnicode_9_0_0


  • public class WordBreakTestUnicode_9_0_0
    extends BaseTokenStreamTestCase
    This class was automatically generated by generateJavaUnicodeWordBreakTest.pl from: http://www.unicode.org/Public/9.0.0/ucd/auxiliary/WordBreakTest.txt

    WordBreakTest.txt indicates the points in the provided character sequences at which conforming implementations must and must not break words. This class tests for expected token extraction from each of the test sequences in WordBreakTest.txt, where the expected tokens are those character sequences bounded by word breaks and containing at least one character from one of the following character sets:

      • \p{Script = Han} (From http://www.unicode.org/Public/9.0.0/ucd/Scripts.txt)
      • \p{Script = Hiragana}
      • \p{LineBreak = Complex_Context} (From http://www.unicode.org/Public/9.0.0/ucd/LineBreak.txt)
      • \p{WordBreak = ALetter} (From http://www.unicode.org/Public/9.0.0/ucd/auxiliary/WordBreakProperty.txt)
      • \p{WordBreak = Hebrew_Letter}
      • \p{WordBreak = Katakana}
      • \p{WordBreak = Numeric} (Excludes full-width Arabic digits)
      • [\uFF10-\uFF19] (Full-width Arabic digits)
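
The token rule described above (split at word breaks, then keep only segments that contain at least one qualifying character) can be illustrated with a minimal sketch. This is not the generated test code and does not use the tokenizer under test. It assumes the JDK's `java.text.BreakIterator` as a stand-in word segmenter, which is not guaranteed to match the Unicode 9.0.0 rules exactly, and it approximates the listed character sets with `Character.isLetterOrDigit` plus a few script checks. The class name `WordBreakSketch` and its methods are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBreakSketch {

    // Rough approximation (an assumption, not the exact property sets):
    // letters and digits, plus explicit Han, Hiragana and Katakana checks.
    static boolean isTokenChar(int codePoint) {
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        return Character.isLetterOrDigit(codePoint)
                || script == Character.UnicodeScript.HAN
                || script == Character.UnicodeScript.HIRAGANA
                || script == Character.UnicodeScript.KATAKANA;
    }

    // Splits the text at the word boundaries reported by the JDK and keeps
    // only segments containing at least one qualifying character.
    static List<String> tokens(String text) {
        BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
        boundaries.setText(text);
        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next();
                end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(WordBreakSketch::isTokenChar)) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Punctuation-only and whitespace-only segments are dropped.
        System.out.println(tokens("hello, world 42"));
    }
}
```

Segments made up only of punctuation or whitespace are still bounded by word breaks, but they contain no qualifying character, so they are not expected as tokens.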
    • Constructor Detail

      • WordBreakTestUnicode_9_0_0

        public WordBreakTestUnicode_9_0_0()