org.apache.lucene.analysis.cn.smart
Class SentenceTokenizer

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by org.apache.lucene.analysis.TokenStream
          extended by org.apache.lucene.analysis.Tokenizer
              extended by org.apache.lucene.analysis.cn.smart.SentenceTokenizer
All Implemented Interfaces:
Closeable

public final class SentenceTokenizer
extends org.apache.lucene.analysis.Tokenizer

Tokenizes input text into sentences.

The resulting sentence tokens can then be broken into words with WordTokenFilter.

WARNING: This API is experimental and might change in incompatible ways in the next release.
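
To make "tokenizes input text into sentences" concrete, the following is a minimal, standalone sketch of sentence splitting on punctuation. It does not use or reproduce the Lucene class: the class name SentenceSplitSketch, the method split, and the delimiter set are all assumptions made for illustration, and the rules that SentenceTokenizer actually applies are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only: NOT the Lucene SentenceTokenizer and not its algorithm.
class SentenceSplitSketch {

    // Assumed delimiter set for this sketch; the real tokenizer's set may differ.
    private static final String DELIMITERS = "。！？；!?;";

    /** Returns one string per sentence, each keeping its closing delimiter. */
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            // Skip whitespace that appears before a sentence has started.
            if (current.length() == 0 && Character.isWhitespace(ch)) {
                continue;
            }
            current.append(ch);
            if (DELIMITERS.indexOf(ch) != -1) {
                // A delimiter closes the current sentence.
                sentences.add(current.toString());
                current.setLength(0);
            }
        }
        // Emit trailing text that had no closing delimiter.
        if (current.length() > 0) {
            sentences.add(current.toString());
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String sentence : split("我是中国人。 你好吗？I am fine")) {
            System.out.println(sentence);
        }
    }
}
```

In this sketch each returned string plays the role of one sentence token. With the real class, those tokens would be passed on to WordTokenFilter for word segmentation.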

Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
 
Field Summary
 
Fields inherited from class org.apache.lucene.analysis.Tokenizer
input
 
Constructor Summary
SentenceTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader reader)
           
SentenceTokenizer(org.apache.lucene.util.AttributeSource source, Reader reader)
           
SentenceTokenizer(Reader reader)
           
 
Method Summary
 void end()
           
 boolean incrementToken()
           
 void reset()
           
 void reset(Reader input)
           
 
Methods inherited from class org.apache.lucene.analysis.Tokenizer
close, correctOffset
 
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SentenceTokenizer

public SentenceTokenizer(Reader reader)

SentenceTokenizer

public SentenceTokenizer(org.apache.lucene.util.AttributeSource source,
                         Reader reader)

SentenceTokenizer

public SentenceTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory,
                         Reader reader)

Method Detail

incrementToken

public boolean incrementToken()
                       throws IOException
Specified by:
incrementToken in class org.apache.lucene.analysis.TokenStream
Throws:
IOException

reset

public void reset()
           throws IOException
Overrides:
reset in class org.apache.lucene.analysis.TokenStream
Throws:
IOException

reset

public void reset(Reader input)
           throws IOException
Overrides:
reset in class org.apache.lucene.analysis.Tokenizer
Throws:
IOException

end

public void end()
         throws IOException
Overrides:
end in class org.apache.lucene.analysis.TokenStream
Throws:
IOException


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.