Class CharFilter

java.lang.Object
java.io.Reader
org.apache.lucene.analysis.CharFilter
All Implemented Interfaces:
Closeable, AutoCloseable, Readable

public abstract class CharFilter extends Reader
Subclasses of CharFilter can be chained to filter a Reader. They can be used as a Reader with additional offset correction. Tokenizers will automatically use correctOffset(int) if a CharFilter subclass is used.

This class is abstract: at a minimum you must implement Reader.read(char[], int, int), which transforms the characters read from input in some way, and correct(int), which adjusts offsets so that they match the original text.
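The minimum contract described above can be sketched as follows. So that the sketch compiles without Lucene on the classpath, it declares a hypothetical stand-in, SketchCharFilter, that is assumed to mirror the members documented on this page (input, close(), correct(int), correctOffset(int)); in real code you would extend org.apache.lucene.analysis.CharFilter instead. DropCharFilter and DropCharDemo are likewise illustrative names, not part of Lucene.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.lucene.analysis.CharFilter (assumption:
// it mirrors the documented members so this sketch runs without Lucene).
abstract class SketchCharFilter extends Reader {
  protected final Reader input;

  SketchCharFilter(Reader input) {
    super(input);
    this.input = input;
  }

  @Override
  public void close() throws IOException {
    input.close();
  }

  protected abstract int correct(int currentOff);

  public final int correctOffset(int currentOff) {
    int corrected = correct(currentOff);
    return (input instanceof SketchCharFilter)
        ? ((SketchCharFilter) input).correctOffset(corrected)
        : corrected;
  }
}

// Illustrative filter: removes every occurrence of one character and
// remembers where the removals happened so offsets can be corrected.
final class DropCharFilter extends SketchCharFilter {
  private final char drop;
  // Output position (count of characters emitted so far) at each removal.
  private final List<Integer> removedAt = new ArrayList<>();
  private int emitted = 0;

  DropCharFilter(Reader input, char drop) {
    super(input);
    this.drop = drop;
  }

  @Override
  public int read(char[] cbuf, int off, int len) throws IOException {
    if (len == 0) {
      return 0;
    }
    int written = 0;
    while (written < len) {
      int c = input.read();
      if (c == -1) {
        break; // underlying reader is exhausted
      }
      if (c == drop) {
        removedAt.add(emitted); // record the removal, emit nothing
        continue;
      }
      cbuf[off + written] = (char) c;
      written++;
      emitted++;
    }
    return written == 0 ? -1 : written;
  }

  @Override
  protected int correct(int currentOff) {
    // Add back one position for every removal at or before this offset.
    int removed = 0;
    for (int pos : removedAt) {
      if (pos > currentOff) {
        break;
      }
      removed++;
    }
    return currentOff + removed;
  }
}

final class DropCharDemo {
  // Reads the whole reader into a String, wrapping checked I/O errors.
  static String drain(Reader reader) {
    StringBuilder out = new StringBuilder();
    char[] buf = new char[8];
    try {
      int n;
      while ((n = reader.read(buf, 0, buf.length)) != -1) {
        out.append(buf, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    DropCharFilter filter = new DropCharFilter(new StringReader("a-b-c"), '-');
    System.out.println(drain(filter));           // prints "abc"
    System.out.println(filter.correctOffset(1)); // prints 2, the index of 'b' in "a-b-c"
  }
}
```

Offsets are only meaningful for text that has already been read, because the filter learns about removals as it consumes the input. Reading one character at a time keeps the sketch short; a production filter would normally read in blocks.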

You can optionally provide more efficient implementations of additional methods such as Reader.read(), Reader.read(char[]), and Reader.read(java.nio.CharBuffer), but this is not required.

For examples and integration with Analyzer, see the Analysis package documentation.

  • Field Details

    • input

      protected final Reader input
      The underlying character-input stream.
  • Constructor Details

    • CharFilter

      public CharFilter(Reader input)
      Create a new CharFilter wrapping the provided reader.
      Parameters:
      input - a Reader, can also be a CharFilter for chaining.
  • Method Details

    • close

      public void close() throws IOException
      Closes the underlying input stream.

      NOTE: The default implementation closes the input Reader, so be sure to call super.close() when overriding this method.

      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in class Reader
      Throws:
      IOException
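The NOTE on close() can be illustrated with a small sketch. It assumes a hypothetical stand-in, ClosableSketchFilter, that reproduces only the documented default behavior of close() (closing the input Reader) and leaves out correct(int); ScratchBufferFilter and CloseDemo are illustrative names as well. With Lucene available, the subclass would extend org.apache.lucene.analysis.CharFilter and override close() in the same way.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in: only the close() behavior documented above is modeled.
abstract class ClosableSketchFilter extends Reader {
  protected final Reader input;

  ClosableSketchFilter(Reader input) {
    super(input);
    this.input = input;
  }

  @Override
  public void close() throws IOException {
    input.close(); // default: close the wrapped reader
  }
}

// Illustrative filter that owns an extra resource besides the wrapped reader.
final class ScratchBufferFilter extends ClosableSketchFilter {
  private char[] scratch = new char[256]; // filter-owned state, illustrative only

  ScratchBufferFilter(Reader input) {
    super(input);
  }

  @Override
  public int read(char[] cbuf, int off, int len) throws IOException {
    return input.read(cbuf, off, len); // pass-through; no transformation here
  }

  @Override
  public void close() throws IOException {
    try {
      scratch = null; // release the filter's own resource first
    } finally {
      super.close(); // then let the default implementation close the input
    }
  }

  boolean scratchReleased() {
    return scratch == null;
  }
}

final class CloseDemo {
  // A closed StringReader throws IOException on read().
  static boolean isClosed(Reader reader) {
    try {
      reader.read();
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  // Closes the filter and reports whether both the filter's own resource
  // and the wrapped reader were released.
  static boolean closeReleasesEverything() {
    Reader source = new StringReader("text");
    ScratchBufferFilter filter = new ScratchBufferFilter(source);
    try {
      filter.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return filter.scratchReleased() && isClosed(source);
  }

  public static void main(String[] args) {
    System.out.println(closeReleasesEverything());
  }
}
```

Placing super.close() in a finally block is a design choice of this sketch: it ensures the wrapped reader is closed even if releasing the filter's own resources fails.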
    • correct

      protected abstract int correct(int currentOff)
      Subclasses override this method to correct the current offset, mapping an offset in this filter's output back to the matching offset in its input.
      Parameters:
      currentOff - current offset
      Returns:
      corrected offset
    • correctOffset

      public final int correctOffset(int currentOff)
      Corrects the given offset with correct(int), then chains the corrected offset through the input CharFilter(s), if any.
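A minimal sketch of how chaining might behave when two filters are stacked. MiniCharFilter is a hypothetical stand-in whose correctOffset(int) is assumed to apply this filter's correct(int) first and then delegate to the wrapped reader when that reader is itself a filter; SkipPrefixFilter and ChainDemo are illustrative names, not Lucene classes.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for CharFilter (assumption: correctOffset applies this
// filter's correction, then delegates to the wrapped filter, if there is one).
abstract class MiniCharFilter extends Reader {
  protected final Reader input;

  MiniCharFilter(Reader input) {
    super(input);
    this.input = input;
  }

  @Override
  public void close() throws IOException {
    input.close();
  }

  protected abstract int correct(int currentOff);

  public final int correctOffset(int currentOff) {
    int corrected = correct(currentOff);
    return (input instanceof MiniCharFilter)
        ? ((MiniCharFilter) input).correctOffset(corrected)
        : corrected;
  }
}

// Illustrative filter: discards a fixed number of leading characters.
final class SkipPrefixFilter extends MiniCharFilter {
  private final int prefixLength;
  private boolean prefixSkipped = false;

  SkipPrefixFilter(Reader input, int prefixLength) {
    super(input);
    this.prefixLength = prefixLength;
  }

  @Override
  public int read(char[] cbuf, int off, int len) throws IOException {
    if (!prefixSkipped) {
      // Consume the prefix one character at a time, stopping early at EOF.
      for (int i = 0; i < prefixLength; i++) {
        if (input.read() == -1) {
          break;
        }
      }
      prefixSkipped = true;
    }
    return input.read(cbuf, off, len);
  }

  @Override
  protected int correct(int currentOff) {
    // Every output offset sits prefixLength characters later in the input.
    return currentOff + prefixLength;
  }
}

final class ChainDemo {
  // Builds source -> inner (skips 3) -> outer (skips 2).
  static SkipPrefixFilter buildChain(String text) {
    Reader source = new StringReader(text);
    SkipPrefixFilter inner = new SkipPrefixFilter(source, 3);
    return new SkipPrefixFilter(inner, 2);
  }

  public static void main(String[] args) throws IOException {
    SkipPrefixFilter outer = buildChain("0123456789");
    char[] buf = new char[16];
    int n = outer.read(buf, 0, buf.length);
    System.out.println(new String(buf, 0, n));  // prints "56789"
    System.out.println(outer.correctOffset(0)); // prints 5: 0 + 2 (outer) + 3 (inner)
  }
}
```

Because each digit in the sample text equals its own index, the corrected offset can be checked by eye: offset 0 in the outer filter's output is the character '5', which sits at index 5 of the original string.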