Package org.apache.lucene.tests.analysis
Class MockAnalyzer
java.lang.Object
  org.apache.lucene.analysis.Analyzer
    org.apache.lucene.tests.analysis.MockAnalyzer
- All Implemented Interfaces: Closeable, AutoCloseable
Analyzer for testing
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers for unit tests. If you are testing a custom component such as a queryparser or analyzer-wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead. MockAnalyzer has the following behavior:
- By default, the assertions in MockTokenizer are turned on for extra checks that the consumer is consuming properly. These checks can be disabled with setEnableChecks(boolean).
- Payload data is randomly injected into the stream for more thorough testing of payloads.
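As a minimal sketch of typical usage (the class and field names here are illustrative, and this assumes lucene-core and lucene-test-framework are on the classpath), a test can run text through the default MockAnalyzer and collect the emitted terms:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.tests.analysis.MockAnalyzer;

public class MockAnalyzerDemo {
  // Run text through a whitespace-lowercasing MockAnalyzer and
  // collect the terms it emits.
  static List<String> tokens(String text) throws IOException {
    MockAnalyzer analyzer = new MockAnalyzer(new Random(42));
    List<String> result = new ArrayList<>();
    try (TokenStream ts = analyzer.tokenStream("field", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                    // consumers must call reset() first;
      while (ts.incrementToken()) {  // MockTokenizer asserts this workflow
        result.add(term.toString());
      }
      ts.end();
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(tokens("The Quick FOX"));  // [the, quick, fox]
  }
}
```

Because the checks described above are enabled by default, skipping `reset()` or calling `incrementToken()` after `end()` in a loop like this would fail the test.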
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
Constructor Summary
Constructor / Description
MockAnalyzer(Random random)
    Create a Whitespace-lowercasing analyzer with no stopword removal.
MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
    Creates a new MockAnalyzer.
-
Method Summary
Modifier and Type / Method / Description
createComponents(String fieldName)
int getOffsetGap(String fieldName)
    Get the offset gap between tokens in fields if several fields with the same name were added.
int getPositionIncrementGap(String fieldName)
protected TokenStream normalize(String fieldName, TokenStream in)
void setEnableChecks(boolean enableChecks)
    Toggle consumer workflow checking: if your test consumes tokenstreams normally you should leave this enabled.
void setMaxTokenLength(int length)
    Toggle maxTokenLength for MockTokenizer.
void setOffsetGap(int offsetGap)
    Set a new offset gap which will then be added to the offset when several fields with the same name are indexed.
void setPositionIncrementGap(int positionIncrementGap)

Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getReuseStrategy, initReader, initReaderForNormalization, normalize, tokenStream, tokenStream
-
Constructor Details
-
MockAnalyzer
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
Creates a new MockAnalyzer.
Parameters:
random - Random for payloads behavior
runAutomaton - DFA describing how tokenization should happen (e.g. [a-zA-Z]+)
lowerCase - true if the tokenizer should lowercase terms
filter - DFA describing how terms should be filtered (set of stopwords, etc.)
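For illustration, and assuming the predefined automata that ship in MockTokenizer (SIMPLE, i.e. [a-zA-Z]+ runs) and MockTokenFilter (ENGLISH_STOPSET), the full constructor can imitate a lowercasing letter tokenizer with English stopword removal. The helper method is illustrative, not part of this API:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.tests.analysis.MockAnalyzer;
import org.apache.lucene.tests.analysis.MockTokenFilter;
import org.apache.lucene.tests.analysis.MockTokenizer;

public class FullConstructorDemo {
  static List<String> tokens(String text) throws IOException {
    // [a-zA-Z]+ tokenization (MockTokenizer.SIMPLE), lowercasing, and
    // MockTokenFilter.ENGLISH_STOPSET as the stopword filter DFA.
    MockAnalyzer analyzer = new MockAnalyzer(
        new Random(0), MockTokenizer.SIMPLE, true, MockTokenFilter.ENGLISH_STOPSET);
    List<String> result = new ArrayList<>();
    try (TokenStream ts = analyzer.tokenStream("field", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        result.add(term.toString());
      }
      ts.end();
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    // "The" is lowercased to a stopword and filtered out.
    System.out.println(tokens("The quick FOX"));  // [quick, fox]
  }
}
```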
-
MockAnalyzer
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
-
MockAnalyzer
public MockAnalyzer(Random random)
Create a Whitespace-lowercasing analyzer with no stopword removal. Calls MockAnalyzer(random, MockTokenizer.WHITESPACE, true, MockTokenFilter.EMPTY_STOPSET, false).
-
-
Method Details
-
createComponents
Specified by: createComponents in class Analyzer
-
normalize
-
setPositionIncrementGap
public void setPositionIncrementGap(int positionIncrementGap)
-
getPositionIncrementGap
Overrides: getPositionIncrementGap in class Analyzer
-
setOffsetGap
public void setOffsetGap(int offsetGap)
Set a new offset gap which will then be added to the offset when several fields with the same name are indexed.
Parameters:
offsetGap - The offset gap that should be used.
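A small sketch of how this setter pairs with getOffsetGap (the field name passed to the getter is ignored, per its documentation; the class name here is illustrative):

```java
import java.util.Random;

import org.apache.lucene.tests.analysis.MockAnalyzer;

public class OffsetGapDemo {
  public static void main(String[] args) {
    MockAnalyzer analyzer = new MockAnalyzer(new Random(1));
    // Each additional value of a multi-valued field will have its
    // character offsets shifted by this gap during indexing.
    analyzer.setOffsetGap(100);
    System.out.println(analyzer.getOffsetGap("anyField"));  // 100
  }
}
```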
-
getOffsetGap
Get the offset gap between tokens in fields if several fields with the same name were added.
Overrides: getOffsetGap in class Analyzer
Parameters:
fieldName - Currently not used; the same offset gap is returned for each field.
-
setEnableChecks
public void setEnableChecks(boolean enableChecks)
Toggle consumer workflow checking: if your test consumes token streams normally, you should leave this enabled.
-
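One hedged sketch of when disabling the checks is appropriate: a test whose component under test deliberately drives TokenStreams in a nonstandard order (the scenario and class name are hypothetical):

```java
import java.util.Random;

import org.apache.lucene.tests.analysis.MockAnalyzer;

public class DisableChecksDemo {
  public static void main(String[] args) {
    MockAnalyzer analyzer = new MockAnalyzer(new Random(7));
    // Suppose the consumer under test intentionally skips reset() or
    // reuses a stream out of order; turn the workflow assertions off
    // so MockTokenizer does not fail the test on the consumer's behalf.
    analyzer.setEnableChecks(false);
  }
}
```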
setMaxTokenLength
public void setMaxTokenLength(int length)
Toggle maxTokenLength for MockTokenizer.
-