Package | Description |
---|---|
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizers. |
Class and Description |
---|
ClassicTokenizer: A grammar-based tokenizer constructed with JFlex. |
UAX29URLEmailTokenizer: This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs. |
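
The two tokenizers listed above share the usual Lucene `Tokenizer` workflow, which the following minimal sketch illustrates. It assumes a Lucene release in which both classes have no-argument constructors (5.0 or later) and that the analyzers-common module is on the classpath. The class name `TokenizerSketch`, the helper `tokenize`, and the sample text are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.ClassicTokenizer;
import org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class, not part of Lucene.
class TokenizerSketch {

    // Runs the given tokenizer over the text and collects the term of each token.
    static List<String> tokenize(Tokenizer tokenizer, String text) throws IOException {
        List<String> terms = new ArrayList<>();
        // The attribute instance is updated in place on every incrementToken() call.
        CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
        try {
            tokenizer.setReader(new StringReader(text));
            tokenizer.reset();                  // required before the first incrementToken()
            while (tokenizer.incrementToken()) {
                terms.add(term.toString());
            }
            tokenizer.end();                    // finalizes end-of-stream state
        } finally {
            tokenizer.close();                  // releases the underlying reader
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        String text = "Email bob@example.com now"; // sample input, chosen for illustration

        // UAX#29 word-break rules, with URLs and email addresses kept as single tokens.
        System.out.println(tokenize(new UAX29URLEmailTokenizer(), text));

        // The JFlex-generated grammar-based tokenizer.
        System.out.println(tokenize(new ClassicTokenizer(), text));
    }
}
```

The call order `setReader`, `reset`, `incrementToken`, `end`, `close` follows the consumer contract documented for `TokenStream`; skipping `reset` typically results in an `IllegalStateException`.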
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.