Package org.apache.lucene.tests.analysis
Class MockAnalyzer
- java.lang.Object
-
- org.apache.lucene.analysis.Analyzer
-
- org.apache.lucene.tests.analysis.MockAnalyzer
-
- All Implemented Interfaces:
Closeable, AutoCloseable
public final class MockAnalyzer extends Analyzer
Analyzer for testing.
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers for unit tests. If you are testing a custom component such as a query parser or analyzer wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead. MockAnalyzer has the following behavior:
- By default, the assertions in MockTokenizer are turned on for extra checks that the consumer is consuming properly. These checks can be disabled with setEnableChecks(boolean).
- Payload data is randomly injected into the stream for more thorough testing of payloads.
- See Also:
MockTokenizer
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
-
Constructor Summary
Constructors
Constructor Description
MockAnalyzer(Random random)
Create a Whitespace-lowercasing analyzer with no stopwords removal.
MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
Creates a new MockAnalyzer.
-
Method Summary
Modifier and Type Method Description
Analyzer.TokenStreamComponents
createComponents(String fieldName)
int
getOffsetGap(String fieldName)
Get the offset gap between tokens in fields if several fields with the same name were added.
int
getPositionIncrementGap(String fieldName)
protected TokenStream
normalize(String fieldName, TokenStream in)
void
setEnableChecks(boolean enableChecks)
Toggle consumer workflow checking: if your test consumes tokenstreams normally you should leave this enabled.
void
setMaxTokenLength(int length)
Set maxTokenLength for MockTokenizer.
void
setOffsetGap(int offsetGap)
Set a new offset gap which will then be added to the offset when several fields with the same name are indexed.
void
setPositionIncrementGap(int positionIncrementGap)
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getReuseStrategy, initReader, initReaderForNormalization, normalize, tokenStream, tokenStream
-
-
-
-
Constructor Detail
-
MockAnalyzer
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
Creates a new MockAnalyzer.
- Parameters:
random - Random for payloads behavior
runAutomaton - DFA describing how tokenization should happen (e.g. [a-zA-Z]+)
lowerCase - true if the tokenizer should lowercase terms
filter - DFA describing how terms should be filtered (set of stopwords, etc.)
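The roles of runAutomaton, lowerCase, and filter can be illustrated with a small plain-Java model. This is only a sketch and not Lucene code: the class MockAnalyzerModel and its analyze method are hypothetical names, java.util.regex.Pattern stands in for CharacterRunAutomaton, a Set of strings stands in for the filter automaton, and the random payload injection is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical plain-Java model of the constructor's three knobs; not Lucene code. */
class MockAnalyzerModel {

    /**
     * @param text       input to analyze
     * @param runPattern stands in for runAutomaton: each match becomes one token
     * @param lowerCase  stands in for lowerCase: lowercase each token when true
     * @param stopwords  stands in for filter: tokens found in this set are dropped
     */
    static List<String> analyze(String text, Pattern runPattern,
                                boolean lowerCase, Set<String> stopwords) {
        List<String> tokens = new ArrayList<>();
        Matcher m = runPattern.matcher(text);
        while (m.find()) {
            String term = m.group();
            // Lowercasing is applied before the stopword filter sees the term.
            if (lowerCase) {
                term = term.toLowerCase(Locale.ROOT);
            }
            if (!stopwords.contains(term)) {
                tokens.add(term);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Pattern letters = Pattern.compile("[a-zA-Z]+");
        // Prints [quick, brown, fox]
        System.out.println(analyze("The Quick-Brown fox", letters, true, Set.of("the")));
    }
}
```

In the model, the hyphen is not matched by [a-zA-Z]+, so it splits "Quick-Brown" into two tokens, and "The" is removed only because it was lowercased before filtering.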
-
MockAnalyzer
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
-
MockAnalyzer
public MockAnalyzer(Random random)
Create a Whitespace-lowercasing analyzer with no stopwords removal.
Calls MockAnalyzer(random, MockTokenizer.WHITESPACE, true, MockTokenFilter.EMPTY_STOPSET).
-
-
Method Detail
-
createComponents
public Analyzer.TokenStreamComponents createComponents(String fieldName)
- Specified by:
createComponents in class Analyzer
-
normalize
protected TokenStream normalize(String fieldName, TokenStream in)
-
setPositionIncrementGap
public void setPositionIncrementGap(int positionIncrementGap)
-
getPositionIncrementGap
public int getPositionIncrementGap(String fieldName)
- Overrides:
getPositionIncrementGap in class Analyzer
-
setOffsetGap
public void setOffsetGap(int offsetGap)
Set a new offset gap which will then be added to the offset when several fields with the same name are indexed.
- Parameters:
offsetGap - The offset gap that should be used.
-
getOffsetGap
public int getOffsetGap(String fieldName)
Get the offset gap between tokens in fields if several fields with the same name were added.
- Overrides:
getOffsetGap in class Analyzer
- Parameters:
fieldName - Currently not used, the same offset gap is returned for each field.
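The effect of the offset gap on a multi-valued field can be sketched with simple arithmetic. The model below is hypothetical and not Lucene code: OffsetGapModel and baseOffsets are illustrative names, and it assumes the indexer advances a running offset by the previous value's final offset (taken here to be the value's length) plus the gap before processing the next value of the same field.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of offset accumulation across values of one field; not Lucene code. */
class OffsetGapModel {

    /**
     * Returns the base offset added to the token offsets of each value,
     * assuming each value's final offset equals its length.
     */
    static List<Integer> baseOffsets(List<String> values, int offsetGap) {
        List<Integer> bases = new ArrayList<>();
        int base = 0;
        for (String value : values) {
            bases.add(base);
            // The next value starts after this value's text plus the configured gap.
            base += value.length() + offsetGap;
        }
        return bases;
    }

    public static void main(String[] args) {
        List<String> values = List.of("red fox", "lazy dog");
        System.out.println(baseOffsets(values, 1));  // prints [0, 8]
        System.out.println(baseOffsets(values, 10)); // prints [0, 17]
    }
}
```

Under these assumptions, a larger gap pushes the offsets of later values further apart, which is what a test would observe after calling setOffsetGap with a different value.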
-
setEnableChecks
public void setEnableChecks(boolean enableChecks)
Toggle consumer workflow checking: if your test consumes tokenstreams normally you should leave this enabled.
-
setMaxTokenLength
public void setMaxTokenLength(int length)
Set maxTokenLength for MockTokenizer.
-
-