Class UAX29URLEmailTokenizer

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public final class UAX29URLEmailTokenizer
    extends Tokenizer
    This class implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs.
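
The behaviour described above (ordinary text split into words, URLs and email addresses kept whole as single tokens) can be illustrated with a rough, standard-library-only sketch. This is not the Lucene implementation and does not follow UAX #29 or the RFC grammars: the class name `UrlEmailTokenSketch`, the `tokenize` helper, the regular expressions, and the `URL`/`EMAIL`/`WORD` labels are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene and far simpler than
// the real UAX #29 / RFC-based rules.
class UrlEmailTokenSketch {

    // Alternatives are tried left to right at each position, so a URL or
    // email match takes precedence over a plain word match.
    private static final Pattern TOKEN = Pattern.compile(
        // URL: http(s) scheme, then non-space characters, ending in a
        // letter, digit, or slash so trailing punctuation is excluded.
        "(?<url>https?://\\S*[\\p{L}\\p{N}/])"
        // Email: simplified local part, '@', then a dotted domain.
        + "|(?<email>[\\p{L}\\p{N}._%+-]+@[\\p{L}\\p{N}-]+(?:\\.[\\p{L}\\p{N}-]+)+)"
        // Word: a run of letters and digits.
        + "|(?<word>[\\p{L}\\p{N}]+)");

    // Returns tokens as "TYPE:text" strings in order of appearance.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String type = m.group("url") != null ? "URL"
                        : m.group("email") != null ? "EMAIL"
                        : "WORD";
            tokens.add(type + ":" + m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Mail bob@example.com or visit https://lucene.apache.org/core/ today.";
        for (String token : tokenize(text)) {
            System.out.println(token);
        }
    }
}
```

The point of the sketch is the ordering of the alternatives: without the URL and email branches, a plain word rule would break `bob@example.com` into `bob`, `example`, and `com`, which is the outcome this tokenizer is designed to avoid.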