public class LaoBreakIterator
extends com.ibm.icu.text.BreakIterator
This breaks Lao text into syllables according to "Syllabification of Lao Script for Line Breaking" by Phonpasit Phissamay, Valaxay Dalolay, Chitaphone Chanhsililath, Oulaiphone Silimasak, Sarmad Hussain, and Nadir Durrani (Science Technology and Environment Agency; CRULP).
Most of the work is done with RBBI rules; however, some additional special logic cannot be coded in a grammar, and it is implemented here.
For example, what appears to be a final consonant might instead be part of the next syllable. Rules match greedily, which can leave behind an illegal sequence that matches no rule.
Take, for instance, the text ກວ່າດອກ. The first rule greedily matches ກວ່າດ, but then ອກ is encountered, which is illegal. What LaoBreakIterator does, according to the paper:
Finally, LaoBreakIterator also takes care of the second concern mentioned in the paper: combining marks that appear in the wrong order (typos).
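The final-consonant case described above can be illustrated with a small, self-contained sketch. It is not the code of this class: the class name `LaoBacktrackSketch`, the method `repair`, and the `LEGAL` lookup set are all hypothetical, and the lookup set merely stands in for asking the RBBI rules whether a span is a complete syllable. The sketch assumes the repair is to move the last character of the greedy match onto the following fragment and to keep that change only if both pieces are then legal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LaoBacktrackSketch {
    // Hypothetical stand-in for the syllable grammar. A real implementation
    // would consult the RBBI rules instead of a fixed set of strings.
    static final Set<String> LEGAL = new HashSet<>(Arrays.asList(
        "\u0E81\u0EA7\u0EC8\u0EB2\u0E94", // ກວ່າດ (the greedy match)
        "\u0E81\u0EA7\u0EC8\u0EB2",       // ກວ່າ
        "\u0E94\u0EAD\u0E81"              // ດອກ
    ));

    static boolean isLegal(String span) {
        return LEGAL.contains(span);
    }

    // Given the greedy match and the fragment that follows it, try moving the
    // last code point of the match onto the fragment. The move is kept only
    // when both resulting pieces are legal; otherwise the input is returned.
    static List<String> repair(String previous, String current) {
        List<String> out = new ArrayList<>();
        if (!previous.isEmpty() && !isLegal(current)) {
            int cut = previous.offsetByCodePoints(previous.length(), -1);
            String shortened = previous.substring(0, cut);
            String extended = previous.substring(cut) + current;
            if (isLegal(shortened) && isLegal(extended)) {
                out.add(shortened);
                out.add(extended);
                return out;
            }
        }
        out.add(previous);
        out.add(current);
        return out;
    }

    public static void main(String[] args) {
        // Greedy match ກວ່າດ followed by the illegal fragment ອກ.
        System.out.println(repair("\u0E81\u0EA7\u0EC8\u0EB2\u0E94", "\u0EAD\u0E81"));
    }
}
```

With the toy `LEGAL` set above, the pair ກວ່າດ and ອກ is rewritten as ກວ່າ and ດອກ. When either piece would be illegal after the move, the original split is returned unchanged.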
Constructor and Description |
---|
LaoBreakIterator(com.ibm.icu.text.RuleBasedBreakIterator rules) |
Modifier and Type | Method and Description |
---|---|
Object | clone() Clone method. |
int | current() |
int | first() |
int | following(int offset) |
CharacterIterator | getText() |
int | last() |
int | next() |
int | next(int n) |
int | previous() |
void | setText(CharacterIterator text) |
void | setText(String newText) |
Methods inherited from class com.ibm.icu.text.BreakIterator: getAvailableLocales, getAvailableULocales, getBreakInstance, getCharacterInstance, getCharacterInstance, getCharacterInstance, getLineInstance, getLineInstance, getLineInstance, getLocale, getSentenceInstance, getSentenceInstance, getSentenceInstance, getTitleInstance, getTitleInstance, getTitleInstance, getWordInstance, getWordInstance, getWordInstance, isBoundary, preceding, registerInstance, registerInstance, unregister
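The methods summarized above follow the usual BreakIterator iteration contract: set the text, call first(), then call next() until it returns DONE. The sketch below shows that loop. As an assumption made so the example runs without ICU4J on the classpath, it uses the JDK's `java.text.BreakIterator` as a stand-in; the class name `BoundaryWalkSketch` and the helper `segments` are hypothetical. With ICU4J available, the same loop should apply to a LaoBreakIterator constructed from a RuleBasedBreakIterator that holds Lao syllable rules, which are not shown here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BoundaryWalkSketch {
    // Collects the text between each pair of consecutive boundaries.
    static List<String> segments(BreakIterator it, String text) {
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // next() returns BreakIterator.DONE once the last boundary is passed.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in iterator: JDK character boundaries, not Lao syllables.
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        System.out.println(segments(it, "abc"));
    }
}
```

Passing the iterator in as a parameter keeps the loop independent of how the boundaries are computed, so only the construction line would need to change for a different BreakIterator.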
public LaoBreakIterator(com.ibm.icu.text.RuleBasedBreakIterator rules)
public int current()
Specified by: current in class com.ibm.icu.text.BreakIterator
public int first()
Specified by: first in class com.ibm.icu.text.BreakIterator
public int following(int offset)
Specified by: following in class com.ibm.icu.text.BreakIterator
public CharacterIterator getText()
Specified by: getText in class com.ibm.icu.text.BreakIterator
public int last()
Specified by: last in class com.ibm.icu.text.BreakIterator
public int next()
Specified by: next in class com.ibm.icu.text.BreakIterator
public int next(int n)
Specified by: next in class com.ibm.icu.text.BreakIterator
public int previous()
Specified by: previous in class com.ibm.icu.text.BreakIterator
public void setText(CharacterIterator text)
Specified by: setText in class com.ibm.icu.text.BreakIterator
public void setText(String newText)
Overrides: setText in class com.ibm.icu.text.BreakIterator
public Object clone()
Overrides: clone in class com.ibm.icu.text.BreakIterator