Uses of Interface
org.apache.lucene.analysis.standard.StandardTokenizerInterface

Packages that use StandardTokenizerInterface
org.apache.lucene.analysis.standard The org.apache.lucene.analysis.standard package contains three fast grammar-based tokenizers constructed with JFlex.
org.apache.lucene.analysis.standard.std31   
 

Uses of StandardTokenizerInterface in org.apache.lucene.analysis.standard
 

Classes in org.apache.lucene.analysis.standard that implement StandardTokenizerInterface
 class StandardTokenizerImpl
          This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.

          Tokens produced are of the following types:
            <ALPHANUM>: A sequence of alphabetic and numeric characters
            <NUM>: A number
            <SOUTHEAST_ASIAN>: A sequence of characters from South and Southeast Asian languages, including Thai, Lao, Myanmar, and Khmer
            <IDEOGRAPHIC>: A single CJKV ideographic character
            <HIRAGANA>: A single hiragana character
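The token types listed above can be illustrated with a rough, self-contained sketch. It is not the JFlex-generated grammar and does not use any Lucene class: the class name TokenTypeSketch and the method typeOf are hypothetical, and the rules are a deliberate simplification (the type is guessed from the first code point, and only all-digit tokens count as numbers, whereas the real grammar follows the UAX #29 word-break rules).

```java
import java.lang.Character.UnicodeScript;

public class TokenTypeSketch {

    // Hypothetical helper, not part of Lucene: guesses a type label for an
    // already-isolated token by looking at its first code point.
    static String typeOf(String token) {
        if (token.isEmpty()) {
            throw new IllegalArgumentException("empty token");
        }
        int first = token.codePointAt(0);

        // Script-based types: South and Southeast Asian scripts, then hiragana.
        switch (UnicodeScript.of(first)) {
            case THAI:
            case LAO:
            case MYANMAR:
            case KHMER:
                return "<SOUTHEAST_ASIAN>";
            case HIRAGANA:
                return "<HIRAGANA>";
            default:
                break;
        }

        // CJKV ideographs are reported one character at a time by the tokenizer.
        if (Character.isIdeographic(first)) {
            return "<IDEOGRAPHIC>";
        }

        // Simplifying assumption: a token made only of digits is a number;
        // everything else is treated as alphanumeric.
        boolean allDigits = token.codePoints().allMatch(Character::isDigit);
        return allDigits ? "<NUM>" : "<ALPHANUM>";
    }
}
```

Under these assumptions, typeOf("lucene") yields <ALPHANUM> and typeOf("2011") yields <NUM>. Tokens such as "3.14", which a full grammar may treat as a single number, fall outside what this sketch handles.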

 class UAX29URLEmailTokenizerImpl
          This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs.
 

Uses of StandardTokenizerInterface in org.apache.lucene.analysis.standard.std31
 

Classes in org.apache.lucene.analysis.standard.std31 that implement StandardTokenizerInterface
 class StandardTokenizerImpl31
          Deprecated. This class is only for exact backwards compatibility.
 class UAX29URLEmailTokenizerImpl31
          Deprecated. This class is only for exact backwards compatibility.
 



Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.