Package org.apache.lucene.analysis

Support for testing analysis components.

The main classes of interest are:

  • BaseTokenStreamTestCase: Using its helper methods is highly recommended, especially in conjunction with MockAnalyzer or MockTokenizer, because they contain many assertions and checks that catch bugs.
  • MockTokenizer: Tokenizer for testing. It serves as a replacement for the WHITESPACE, SIMPLE, and KEYWORD tokenizers. If you are writing a component such as a TokenFilter, it's a great idea to test it wrapping this tokenizer, rather than a production tokenizer, for extra checks.
  • MockAnalyzer: Analyzer for testing. It uses MockTokenizer for additional verification. If you are testing a custom component that consumes analysis streams, such as a query parser or an analyzer wrapper, it's a great idea to test it with this analyzer rather than a production analyzer.
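
To give a feel for the kind of "extra checks" mentioned above, here is a minimal, standalone sketch of a consumer-workflow check: a token source that fails loudly when it is not consumed in the order reset(), incrementToken() until it returns false, end(), close(). This is not Lucene code and does not use the Lucene API. The class names CheckedWhitespaceTokens and WorkflowSketch, the term() accessor, and the exact state rules are hypothetical simplifications made up for illustration; the real classes perform more checks than this, and their rules may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tokenizer that verifies how it is consumed. */
final class CheckedWhitespaceTokens {
  private enum State { CREATED, RESET, EXHAUSTED, ENDED, CLOSED }

  private final String[] terms;
  private State state = State.CREATED;
  private int next = 0;
  private String current;

  CheckedWhitespaceTokens(String text) {
    // Split on runs of whitespace; an empty input yields no tokens.
    String trimmed = text.trim();
    this.terms = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
  }

  void reset() {
    require(State.CREATED, "reset()");
    state = State.RESET;
  }

  boolean incrementToken() {
    require(State.RESET, "incrementToken()");
    if (next < terms.length) {
      current = terms[next++];
      return true;
    }
    state = State.EXHAUSTED; // the consumer has seen every token
    return false;
  }

  String term() {
    return current;
  }

  void end() {
    require(State.EXHAUSTED, "end()");
    state = State.ENDED;
  }

  void close() {
    require(State.ENDED, "close()");
    state = State.CLOSED;
  }

  // Simplified rule (an assumption of this sketch): each call is legal in exactly one state.
  private void require(State expected, String call) {
    if (state != expected) {
      throw new IllegalStateException(
          call + " called in state " + state + ", expected " + expected);
    }
  }
}

final class WorkflowSketch {
  /** Consumes the stream in the expected order and collects the terms. */
  static List<String> consume(String text) {
    CheckedWhitespaceTokens ts = new CheckedWhitespaceTokens(text);
    List<String> out = new ArrayList<>();
    ts.reset();
    while (ts.incrementToken()) {
      out.add(ts.term());
    }
    ts.end();
    ts.close();
    return out;
  }

  public static void main(String[] args) {
    System.out.println(consume("Foo Bar baz"));
    try {
      // A buggy consumer that forgets reset() is caught immediately.
      new CheckedWhitespaceTokens("x").incrementToken();
    } catch (IllegalStateException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

A filter or query parser that skips a step in this sequence may appear to work against a lenient tokenizer, which is why testing against a strict source like this one tends to surface the bug early.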