org.apache.lucene.analysis.th
Class ThaiWordFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.th.ThaiWordFilter
- All Implemented Interfaces:
- Closeable
public final class ThaiWordFilter
extends TokenFilter
A TokenFilter that uses a BreakIterator to break each Thai Token into a separate Token for each Thai word.
Please note: As of matchVersion 3.1, this filter no longer lowercases non-Thai text. ThaiAnalyzer inserts a LowerCaseFilter before this filter, so the behaviour of the Analyzer does not change. As of version 3.1, the filter also handles position increments correctly.
WARNING: this filter may not be supported by all JREs.
It is known to work with Sun/Oracle and Harmony JREs.
If your application needs to be fully portable, consider using ICUTokenizer instead,
which uses an ICU Thai BreakIterator that will always be available.
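The word-breaking step described above can be tried without Lucene. The following is a minimal sketch, assuming only the JDK's java.text.BreakIterator; the class name ThaiBreakSketch and the method segment are hypothetical and are not part of this filter's API. It shows how a Thai run is cut at the boundaries the JRE reports, which is also why the result depends on the JRE in use.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
final class ThaiBreakSketch {

    /** Splits text at the word boundaries reported by the JRE's Thai BreakIterator. */
    static List<String> segment(String text) {
        BreakIterator breaker = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        breaker.setText(text);
        List<String> words = new ArrayList<>();
        int start = breaker.first();
        // Walk consecutive boundaries; each [start, end) span is one segment.
        for (int end = breaker.next(); end != BreakIterator.DONE; start = end, end = breaker.next()) {
            words.add(text.substring(start, end));
        }
        return words;
    }

    public static void main(String[] args) {
        // The number of segments printed depends on whether the JRE
        // ships a dictionary-based Thai break iterator.
        for (String word : segment("ภาษาไทย")) {
            System.out.println(word);
        }
    }
}
```

Whatever the JRE supports, the segments concatenate back to the original text; only the number of cuts varies.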
Field Summary
static boolean DBBI_AVAILABLE
True if the JRE supports a working dictionary-based breakiterator for Thai.
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState
DBBI_AVAILABLE
public static final boolean DBBI_AVAILABLE
- True if the JRE supports a working dictionary-based breakiterator for Thai.
If this is false, this filter will not work at all!
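A check of this kind can be approximated with the JDK alone. The sketch below is an assumption about how such a probe might look, not a statement of how the field is actually computed; the class ThaiBreakProbe and its method are hypothetical names. It asks the JRE's Thai word BreakIterator whether it sees a boundary inside a phrase that is written without spaces.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical probe class; not part of Lucene.
final class ThaiBreakProbe {

    /** Returns true if the JRE reports a word boundary inside a Thai phrase. */
    static boolean thaiDictionaryBreaksAvailable() {
        BreakIterator breaker = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        // "ภาษาไทย" ("Thai language") is "ภาษา" + "ไทย", written without a space.
        breaker.setText("ภาษาไทย");
        // Offset 4 is the end of "ภาษา". Finding a boundary there requires
        // knowledge of Thai words, since no whitespace or punctuation marks it.
        return breaker.isBoundary(4);
    }

    public static void main(String[] args) {
        if (thaiDictionaryBreaksAvailable()) {
            System.out.println("JRE can segment Thai text");
        } else {
            System.out.println("No Thai segmentation; consider an ICU-based tokenizer");
        }
    }
}
```

An application could run such a probe, or read DBBI_AVAILABLE directly, at startup and fall back to ICUTokenizer when the result is false.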
ThaiWordFilter
public ThaiWordFilter(Version matchVersion, TokenStream input)
- Creates a new ThaiWordFilter with the specified match version.
incrementToken
public boolean incrementToken() throws IOException
- Specified by: incrementToken in class TokenStream
- Throws: IOException
reset
public void reset() throws IOException
- Overrides: reset in class TokenFilter
- Throws: IOException
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.