Package org.apache.lucene.analysis.util
Utility functions for text analysis.
Interface Summary

ResourceLoader
    Abstraction for loading resources (streams, files, and classes).
ResourceLoaderAware
    Interface for a component that needs to be initialized by an implementation of ResourceLoader.
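
The two interfaces above describe a two-phase setup: a component is constructed first and later handed a loader from which it reads its resources. The following is a minimal, stdlib-only sketch of that pattern. The types Loader, LoaderAware, MapLoader and WordList, and the method names open and inform, are hypothetical stand-ins chosen for illustration; they are not the signatures of ResourceLoader or ResourceLoaderAware, which should be taken from those interfaces' own documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Stdlib-only sketch: every type below is a hypothetical stand-in, not Lucene API.
final class ResourceLoaderSketch {

    /** Stand-in for an abstraction that opens named resources. */
    interface Loader {
        InputStream open(String name);
    }

    /** Stand-in for a component that must be initialized by a Loader. */
    interface LoaderAware {
        void inform(Loader loader);
    }

    /** In-memory Loader so the sketch needs no files on disk. */
    static final class MapLoader implements Loader {
        private final Map<String, String> resources;

        MapLoader(Map<String, String> resources) {
            this.resources = resources;
        }

        @Override
        public InputStream open(String name) {
            String body = resources.get(name);
            if (body == null) {
                throw new UncheckedIOException(new IOException("resource not found: " + name));
            }
            return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Knows only a resource name at construction; reads it once informed. */
    static final class WordList implements LoaderAware {
        private final String resourceName;
        private final Set<String> words = new LinkedHashSet<>();

        WordList(String resourceName) {
            this.resourceName = resourceName;
        }

        @Override
        public void inform(Loader loader) {
            try (InputStream in = loader.open(resourceName)) {
                String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                for (String line : text.split("\n")) {
                    String word = line.trim();
                    if (!word.isEmpty()) {
                        words.add(word); // skip blank lines, keep one entry per line
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        boolean contains(String word) {
            return words.contains(word);
        }

        int size() {
            return words.size();
        }
    }

    public static void main(String[] args) {
        Loader loader = new MapLoader(Map.of("articles.txt", "l\nd\n\nqu\n"));
        WordList articles = new WordList("articles.txt");
        articles.inform(loader); // second phase: the loader supplies the data
        System.out.println(articles.size() + " entries, contains \"qu\": " + articles.contains("qu"));
    }
}
```

Keeping the loader behind an interface lets the same component read from the classpath, the file system, or (as here) memory without changing its own code.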
Class Summary

AbstractAnalysisFactory
    Abstract parent class for analysis factories TokenizerFactory, TokenFilterFactory and CharFilterFactory.
AnalysisSPILoader<S extends AbstractAnalysisFactory>
    Helper class for loading named SPIs from classpath (e.g.
CharArrayIterator
    A CharacterIterator used internally for use with BreakIterator
CharFilterFactory
    Abstract parent class for analysis factories that create CharFilter instances.
CharTokenizer
    An abstract base class for simple, character-oriented tokenizers.
ClasspathResourceLoader
    Simple ResourceLoader that uses ClassLoader.getResourceAsStream(String) and Class.forName(String,boolean,ClassLoader) to open resources and classes, respectively.
ElisionFilter
    Removes elisions from a TokenStream.
ElisionFilterFactory
    Factory for ElisionFilter.
FilesystemResourceLoader
    Simple ResourceLoader that opens resource files from the local file system, optionally resolving against a base directory.
OpenStringBuilder
    A StringBuilder that allows one to access the array.
RollingCharBuffer
    Acts like a forever growing char[] as you read characters into it from the provided reader, but internally it uses a circular buffer to only hold the characters that haven't been freed yet.
SegmentingTokenizerBase
    Breaks text into sentences with a BreakIterator and allows subclasses to decompose these sentences into words.
StemmerUtil
    Some commonly-used stemming functions
TokenFilterFactory
    Abstract parent class for analysis factories that create TokenFilter instances.
TokenizerFactory
    Abstract parent class for analysis factories that create Tokenizer instances.
UnicodeProps
    This file contains unicode properties used by various CharTokenizers.
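
The RollingCharBuffer description above combines two ideas: callers address characters by absolute position as if the array grew forever, while storage is a circular buffer that holds only the characters not yet freed. The following is a minimal sketch of that idea in plain Java. It is not Lucene's implementation: the class name RollingBufferSketch, the methods get, freeBefore and capacity, and the initial capacity of 4 are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stdlib-only sketch of a rolling character buffer; not Lucene's RollingCharBuffer.
final class RollingBufferSketch {
    private final Reader reader;
    private char[] buffer = new char[4]; // deliberately tiny so growth is easy to observe
    private int start;                   // absolute position of the oldest retained char
    private int end;                     // absolute position one past the last char read
    private boolean eof;

    RollingBufferSketch(Reader reader) {
        this.reader = reader;
    }

    /** Returns the char at absolute position pos, or -1 if pos is past the end of input. */
    int get(int pos) {
        if (pos < start) {
            throw new IllegalArgumentException("position " + pos + " was already freed");
        }
        while (pos >= end) {
            if (eof) {
                return -1;
            }
            int ch;
            try {
                ch = reader.read();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (ch == -1) {
                eof = true;
                return -1;
            }
            if (end - start == buffer.length) {
                grow(); // every slot holds a live char, so the ring must get bigger
            }
            buffer[end % buffer.length] = (char) ch; // may reuse a slot that was freed
            end++;
        }
        return buffer[pos % buffer.length];
    }

    /** Declares that positions before pos are no longer needed, so their slots can be reused. */
    void freeBefore(int pos) {
        if (pos < start || pos > end) {
            throw new IllegalArgumentException("pos must be between " + start + " and " + end);
        }
        start = pos;
    }

    /** Current size of the backing array. */
    int capacity() {
        return buffer.length;
    }

    // Doubles the ring and re-homes each live char at its absolute position modulo the new size.
    private void grow() {
        char[] bigger = new char[buffer.length * 2];
        for (int p = start; p < end; p++) {
            bigger[p % bigger.length] = buffer[p % buffer.length];
        }
        buffer = bigger;
    }

    public static void main(String[] args) {
        RollingBufferSketch buf = new RollingBufferSketch(new StringReader("hello world"));
        System.out.println((char) buf.get(4));
        buf.freeBefore(4); // positions 0..3 may now be overwritten
        System.out.println((char) buf.get(10));
        System.out.println("capacity: " + buf.capacity());
    }
}
```

Because freed slots are reused, the backing array only has to be as large as the widest window of unfreed characters, not the whole input.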