Class CharStream

java.lang.Object
  extended by java.io.Reader
      extended by org.apache.lucene.analysis.CharStream
All Implemented Interfaces:
Closeable, Readable
Direct Known Subclasses:
CharFilter, CharReader

public abstract class CharStream
extends Reader

CharStream adds correctOffset(int) functionality over Reader. All Tokenizers accept a CharStream instead of Reader as input, which enables arbitrary character-based filtering before tokenization. The correctOffset(int) method fixes offsets to account for the removal or insertion of characters, so that the offsets reported in the tokens match the character offsets of the original Reader.
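The offset correction described above can be illustrated with a minimal, Lucene-free sketch. OffsetAwareReader is a hypothetical stand-in for CharStream (so the example compiles with the JDK alone), and DropCharReader is a hypothetical filter that removes one character and records where each surviving character sat in the original input. It reads the whole input eagerly, which a real filter would probably avoid.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

// Hypothetical stand-in for CharStream: a Reader plus correctOffset(int).
abstract class OffsetAwareReader extends Reader {
    public abstract int correctOffset(int currentOff);
}

// Hypothetical filter (not a Lucene class): drops every occurrence of one
// character and remembers the original offset of each surviving character.
class DropCharReader extends OffsetAwareReader {
    private final char[] filtered;      // characters that survive filtering
    private final int[] originalOffset; // originalOffset[i] = input offset of filtered[i]
    private int pos = 0;

    DropCharReader(Reader in, char dropped) throws IOException {
        // Assumption for brevity: the whole input fits in memory.
        StringBuilder raw = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            raw.append((char) c);
        }
        in.close();

        char[] out = new char[raw.length()];
        int[] map = new int[raw.length() + 1];
        int n = 0;
        for (int i = 0; i < raw.length(); i++) {
            if (raw.charAt(i) != dropped) {
                out[n] = raw.charAt(i);
                map[n] = i;
                n++;
            }
        }
        map[n] = raw.length(); // offset just past the last character
        filtered = Arrays.copyOf(out, n);
        originalOffset = Arrays.copyOf(map, n + 1);
    }

    @Override
    public int read(char[] cbuf, int off, int len) {
        if (len == 0) return 0;
        if (pos >= filtered.length) return -1;
        int n = Math.min(len, filtered.length - pos);
        System.arraycopy(filtered, pos, cbuf, off, n);
        pos += n;
        return n;
    }

    @Override
    public void close() {
        // Nothing to release: the wrapped Reader was closed in the constructor.
    }

    @Override
    public int correctOffset(int currentOff) {
        // Clamp so that out-of-range offsets map to the start or end of input.
        int i = Math.max(0, Math.min(currentOff, filtered.length));
        return originalOffset[i];
    }
}

class CorrectOffsetDemo {
    public static void main(String[] args) throws IOException {
        DropCharReader r = new DropCharReader(new StringReader("a-b--c"), '-');
        char[] buf = new char[16];
        int n = r.read(buf, 0, buf.length);
        System.out.println(new String(buf, 0, n)); // prints "abc"
        System.out.println(r.correctOffset(1));    // prints 2 ('b' sits at input offset 2)
        System.out.println(r.correctOffset(3));    // prints 6 (end of the original input)
    }
}
```

A tokenizer reading from such a stream would see the text "abc" with offsets 0 to 3, and would pass both values through correctOffset to report 0 to 6 against the original input.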

Field Summary
Fields inherited from class java.io.Reader
lock
Constructor Summary
CharStream()
Method Summary
abstract  int correctOffset(int currentOff)
          Called by CharFilter(s) and Tokenizer to correct token offset.
Methods inherited from class java.io.Reader
close, mark, markSupported, read, read, read, read, ready, reset, skip
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public CharStream()
Method Detail


public abstract int correctOffset(int currentOff)
Called by CharFilter(s) and Tokenizer to correct token offset.

Parameters:
currentOff - offset as seen in the output
Returns:
corrected offset based on the input
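One way to picture the mapping from output offset to input offset is a sparse table of cumulative differences. The sketch below is an illustration only, not Lucene's implementation: OffsetCorrections and its methods are hypothetical names, and the table is filled in by hand for the case where "&amp;" in the input was replaced by "&" in the output.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Lucene): stores, for selected output
// offsets, how far the input has run ahead of the output from there on.
class OffsetCorrections {
    private final TreeMap<Integer, Integer> diffAt = new TreeMap<>();

    // From outputOffset onward, input offset = output offset + cumulativeDiff.
    void add(int outputOffset, int cumulativeDiff) {
        diffAt.put(outputOffset, cumulativeDiff);
    }

    // currentOff is an offset as seen in the output; the result is the
    // corresponding offset in the input.
    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = diffAt.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }
}

class OffsetCorrectionsDemo {
    public static void main(String[] args) {
        // Input:  "a &amp; b" (9 chars). Output: "a & b" (5 chars).
        // Everything after the '&' at output offset 2 is shifted by 4.
        OffsetCorrections corrections = new OffsetCorrections();
        corrections.add(3, 4);

        System.out.println(corrections.correct(2)); // prints 2 (start of "&amp;")
        System.out.println(corrections.correct(3)); // prints 7 (end of "&amp;")
        System.out.println(corrections.correct(4)); // prints 8 (start of "b")
        System.out.println(corrections.correct(5)); // prints 9 (end of input)
    }
}
```

With this table, the token "&" at output offsets 2 to 3 is reported as covering input offsets 2 to 7, which is the full "&amp;" in the original text.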

Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.