org.apache.lucene.analysis.icu.segmentation
Class ICUTokenizerConfig

java.lang.Object
  extended by org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
Direct Known Subclasses:
DefaultICUTokenizerConfig

public abstract class ICUTokenizerConfig
extends Object

Class that allows for tailored Unicode text segmentation on a per-writing-system basis.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Constructor Summary
ICUTokenizerConfig()
          Sole constructor.
 
Method Summary
abstract  com.ibm.icu.text.BreakIterator getBreakIterator(int script)
          Returns a BreakIterator capable of processing the given script.
abstract  String getType(int script, int ruleStatus)
          Returns the token type for the given script and BreakIterator rule status.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ICUTokenizerConfig

public ICUTokenizerConfig()
Sole constructor. (For invocation by subclass constructors, typically implicit.)

Method Detail

getBreakIterator

public abstract com.ibm.icu.text.BreakIterator getBreakIterator(int script)
Returns a BreakIterator capable of processing the given script.
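
The idea of selecting an iterator per script can be sketched with JDK classes only. This is not the Lucene or ICU API: the class PerScriptBreakIteratorSketch and its methods breakIteratorFor and segment are hypothetical, java.text.BreakIterator stands in for com.ibm.icu.text.BreakIterator, and the java.lang.Character.UnicodeScript enum stands in for the int script code that the real method receives. The Thai-locale branch is only an assumed example of a tailored choice.

```java
import java.lang.Character.UnicodeScript;
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PerScriptBreakIteratorSketch {

    // Hypothetical analog of getBreakIterator(int script): pick a word
    // iterator according to the writing system of the text run.
    static BreakIterator breakIteratorFor(UnicodeScript script) {
        switch (script) {
            case THAI:
                // Assumption: a Thai-locale iterator stands in for a tailored one.
                return BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
            default:
                return BreakIterator.getWordInstance(Locale.ROOT);
        }
    }

    // Segments a single-script run, keeping only pieces that contain at
    // least one letter or digit (whitespace and punctuation are dropped).
    static List<String> segment(String text) {
        List<String> out = new ArrayList<>();
        if (text.isEmpty()) {
            return out;
        }
        // The script of the first code point decides which iterator is used.
        UnicodeScript script = UnicodeScript.of(text.codePointAt(0));
        BreakIterator it = breakIteratorFor(script);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(segment("hello brave world"));
    }
}
```

In a real subclass the same decision would be made inside getBreakIterator(int script), returning an ICU iterator rather than a JDK one.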


getType

public abstract String getType(int script,
                               int ruleStatus)
Returns the token type for the given script and BreakIterator rule status.
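
A mapping from script and rule status to a token type might look like the following JDK-only sketch. Everything here is an assumption made for illustration: the class TokenTypeSketch, the method typeFor, the locally defined status constants, and the returned type strings are not taken from this page, and java.lang.Character.UnicodeScript replaces the int script code of the real signature.

```java
import java.lang.Character.UnicodeScript;

class TokenTypeSketch {

    // Local stand-ins for break-rule status values, defined here so the
    // sketch does not depend on ICU. The numbers are illustrative.
    static final int WORD_NONE = 0;
    static final int WORD_NUMBER = 100;
    static final int WORD_LETTER = 200;
    static final int WORD_IDEO = 400;

    // Hypothetical analog of getType(int script, int ruleStatus).
    static String typeFor(UnicodeScript script, int ruleStatus) {
        switch (ruleStatus) {
            case WORD_NUMBER:
                return "<NUM>";
            case WORD_IDEO:
                return "<IDEOGRAPHIC>";
            case WORD_LETTER:
                // The script refines the type only for letter tokens here.
                return script == UnicodeScript.HANGUL ? "<HANGUL>" : "<ALPHANUM>";
            default:
                return "<OTHER>";
        }
    }

    public static void main(String[] args) {
        System.out.println(typeFor(UnicodeScript.LATIN, WORD_LETTER));
        System.out.println(typeFor(UnicodeScript.HANGUL, WORD_LETTER));
        System.out.println(typeFor(UnicodeScript.LATIN, WORD_NUMBER));
    }
}
```

The point of the two-argument form is that the rule status alone may not be enough to name a token type; the script lets an implementation distinguish tokens that matched the same break rule.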



Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.