Class WordBreakTestUnicode_12_1_0
- java.lang.Object
  - org.apache.lucene.tests.analysis.standard.WordBreakTestUnicode_12_1_0
public final class WordBreakTestUnicode_12_1_0 extends Object
This class was automatically generated by generateJavaUnicodeWordBreakTest.pl from: http://www.unicode.org/Public/12.1.0/ucd/auxiliary/WordBreakTest.txt

WordBreakTest.txt indicates the points in the provided character sequences at which conforming implementations must and must not break words. This class tests for expected token extraction from each of the test sequences in WordBreakTest.txt, where the expected tokens are those character sequences bounded by word breaks and containing at least one character from one of the following character sets:
- \p{Script = Han} (From http://www.unicode.org/Public/12.1.0/ucd/Scripts.txt)
- \p{Script = Hiragana}
- \p{LineBreak = Complex_Context} (From http://www.unicode.org/Public/12.1.0/ucd/LineBreak.txt)
- \p{WordBreak = ALetter} (From http://www.unicode.org/Public/12.1.0/ucd/auxiliary/WordBreakProperty.txt)
- \p{WordBreak = Hebrew_Letter}
- \p{WordBreak = Katakana}
- \p{WordBreak = Numeric}
- \p{Extended_Pictographic} (From http://www.unicode.org/Public/emoji/12.1/emoji-data.txt)
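
The token rule described above (segments bounded by word breaks, kept only if they contain at least one character from the listed sets) can be illustrated with a rough sketch. This is not the generated test code and not Lucene's tokenizer: it assumes the JDK's `java.text.BreakIterator` as a stand-in for a conforming word-break implementation, and it approximates the character sets with properties that `java.util.regex` supports on all recent JDKs. `LineBreak = Complex_Context`, `Extended_Pictographic`, and the exact `WordBreak` property values are not reproduced here. The class name `WordBreakTokenSketch` and the method `tokens` are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical illustration only; the real class is generated from WordBreakTest.txt.
final class WordBreakTokenSketch {

    // Assumption: a coarse approximation of the listed character sets.
    // Han, Hiragana and Katakana scripts, alphabetic characters, and decimal digits.
    private static final Pattern TOKEN_CHAR = Pattern.compile(
            "[\\p{IsHan}\\p{IsHiragana}\\p{IsKatakana}\\p{IsAlphabetic}\\p{Nd}]");

    // Splits text at the JDK's word boundaries and keeps only the segments
    // that contain at least one character matching TOKEN_CHAR.
    static List<String> tokens(String text) {
        BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
        boundaries.setText(text);
        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String segment = text.substring(start, end);
            if (TOKEN_CHAR.matcher(segment).find()) {
                result.add(segment);
            }
        }
        return result;
    }
}
```

Segments made up only of whitespace or punctuation are dropped by the filter, which mirrors the "containing at least one character from" condition. The JDK boundaries may differ from the Unicode 12.1.0 test data, so this sketch should not be expected to pass the sequences in WordBreakTest.txt.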
Constructor Summary
Constructors
Constructor | Description
WordBreakTestUnicode_12_1_0() |