public final class MockAnalyzer extends Analyzer
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers for unit tests. If you are testing a custom component such as a queryparser or analyzer-wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead. MockAnalyzer has the following behavior:
- By default, the assertions in MockTokenizer are turned on for extra checks that the consumer is consuming properly. These checks can be disabled with setEnableChecks(boolean).
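To make "consuming properly" concrete: a Lucene TokenStream consumer is expected to call reset(), then incrementToken() until it returns false, then end(), then close(). The following is a minimal, stdlib-only sketch of that kind of workflow checking. It is not MockTokenizer's code; the class CheckedStreamSketch, its states, and its exception type are hypothetical and only illustrate the idea of a toggleable check.

```java
// Hypothetical stand-in for a checked token stream; NOT the Lucene implementation.
final class CheckedStreamSketch {
    enum State { CREATED, RESET, EXHAUSTED, ENDED, CLOSED }

    private final String[] tokens;
    private int next = 0;
    private State state = State.CREATED;
    private boolean enableChecks = true;

    CheckedStreamSketch(String... tokens) {
        this.tokens = tokens;
    }

    // Mirrors the idea of setEnableChecks(boolean): turn workflow validation on or off.
    void setEnableChecks(boolean enableChecks) {
        this.enableChecks = enableChecks;
    }

    private void check(boolean ok, String call) {
        if (enableChecks && !ok) {
            throw new IllegalStateException(call + " called in wrong state: " + state);
        }
    }

    void reset() {
        check(state == State.CREATED || state == State.CLOSED, "reset()");
        next = 0;
        state = State.RESET;
    }

    boolean incrementToken() {
        check(state == State.RESET, "incrementToken()");
        if (next < tokens.length) {
            next++;
            return true;
        }
        state = State.EXHAUSTED; // the consumer has seen the final 'false'
        return false;
    }

    void end() {
        check(state == State.EXHAUSTED, "end()");
        state = State.ENDED;
    }

    void close() {
        check(state == State.ENDED, "close()");
        state = State.CLOSED;
    }

    public static void main(String[] args) {
        CheckedStreamSketch stream = new CheckedStreamSketch("hello", "world");
        stream.reset();
        int count = 0;
        while (stream.incrementToken()) {
            count++;
        }
        stream.end();
        stream.close();
        System.out.println(count + " tokens consumed"); // prints "2 tokens consumed"
    }
}
```

In this sketch, a consumer that skips reset() fails immediately while checks are enabled, and the same consumer passes silently once checks are disabled. That is the trade-off the toggle represents.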
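See Also: MockTokenizer
Nested classes/interfaces inherited from class Analyzer: Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY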
Constructor | Description |
---|---|
MockAnalyzer(Random random) | Creates a whitespace-tokenizing, lowercasing analyzer with no stopword removal. |
MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase) | |
MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter) | Creates a new MockAnalyzer. |
Modifier and Type | Method | Description |
---|---|---|
Analyzer.TokenStreamComponents | createComponents(String fieldName) | |
int | getOffsetGap(String fieldName) | Gets the offset gap between tokens in fields if several fields with the same name were added. |
int | getPositionIncrementGap(String fieldName) | |
void | setEnableChecks(boolean enableChecks) | Toggles consumer workflow checking: if your test consumes token streams normally, you should leave this enabled. |
void | setMaxTokenLength(int length) | Sets maxTokenLength for MockTokenizer. |
void | setOffsetGap(int offsetGap) | Sets a new offset gap, which is added to the offset when several fields with the same name are indexed. |
void | setPositionIncrementGap(int positionIncrementGap) | |
Methods inherited from class Analyzer: close, getReuseStrategy, getVersion, initReader, setVersion, tokenStream, tokenStream
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
Parameters:
random - source of randomness for payload behavior
runAutomaton - DFA describing how tokenization should happen (e.g. [a-zA-Z]+)
lowerCase - true if the tokenizer should lowercase terms
filter - DFA describing how terms should be filtered (set of stopwords, etc.)
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
public MockAnalyzer(Random random)
Calls MockAnalyzer(random, MockTokenizer.WHITESPACE, true, MockTokenFilter.EMPTY_STOPSET).
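The constructor parameters above describe three stages: split the text into runs accepted by runAutomaton, optionally lowercase each term, then drop terms accepted by filter. The sketch below models those stages with the JDK only. It is an illustration, not Lucene code: java.util.regex.Pattern stands in for CharacterRunAutomaton, a Set of strings stands in for the filter automaton, and the class and method names are hypothetical. Payload injection and the random argument are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stdlib model of the tokenize / lowercase / filter stages.
final class MockAnalysisSketch {

    static List<String> analyze(String text, Pattern runPattern,
                                boolean lowerCase, Set<String> stopwords) {
        List<String> terms = new ArrayList<>();
        Matcher runs = runPattern.matcher(text);
        while (runs.find()) {                      // each maximal run becomes one token
            String term = runs.group();
            if (lowerCase) {
                term = term.toLowerCase(Locale.ROOT);
            }
            if (!stopwords.contains(term)) {       // terms matched by the filter are removed
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Letter runs, lowercased, with "the" treated as a stopword.
        Set<String> stop = new HashSet<>(Arrays.asList("the"));
        System.out.println(analyze("The Quick fox", Pattern.compile("[a-zA-Z]+"), true, stop));
        // prints [quick, fox]

        // Whitespace-delimited runs, lowercased, empty stop set
        // (the configuration the single-argument constructor describes).
        System.out.println(analyze("Hello, World", Pattern.compile("[^ \\t\\r\\n]+"),
                true, Collections.<String>emptySet()));
        // prints [hello,, world]
    }
}
```

Note that with whitespace-delimited runs the punctuation stays attached to the term, which is why "hello," keeps its comma in the second call.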
public Analyzer.TokenStreamComponents createComponents(String fieldName)
Specified by: createComponents in class Analyzer
public void setPositionIncrementGap(int positionIncrementGap)
public int getPositionIncrementGap(String fieldName)
Overrides: getPositionIncrementGap in class Analyzer
public void setOffsetGap(int offsetGap)
Parameters:
offsetGap - The offset gap that should be used.
public int getOffsetGap(String fieldName)
Overrides: getOffsetGap in class Analyzer
Parameters:
fieldName - Currently not used, the same offset gap is returned for each field.
public void setEnableChecks(boolean enableChecks)
public void setMaxTokenLength(int length)
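The values set through setPositionIncrementGap and setOffsetGap matter when a document contains several values for the same field name. As a rough illustration, the sketch below models how an indexer might apply the two gaps between consecutive values. It is a simplified stdlib model built on assumptions (whitespace tokenization, a position increment of 1 per token, and the hypothetical class GapSketch); the actual behavior is determined by Lucene's indexing code, not by this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of position and offset gaps across multi-valued fields.
final class GapSketch {

    // Returns one entry per token in the form term@position[startOffset-endOffset].
    static List<String> index(List<String> values, int positionIncrementGap, int offsetGap) {
        List<String> postings = new ArrayList<>();
        Pattern whitespaceRuns = Pattern.compile("\\S+");
        int position = -1;   // so the first token lands on position 0
        int offsetBase = 0;  // offset at which the current value starts
        for (String value : values) {
            Matcher m = whitespaceRuns.matcher(value);
            while (m.find()) {
                position += 1; // assumed position increment of 1 per token
                postings.add(m.group() + "@" + position
                        + "[" + (offsetBase + m.start()) + "-" + (offsetBase + m.end()) + "]");
            }
            // After each value, both gaps are added before the next value starts.
            position += positionIncrementGap;
            offsetBase += value.length() + offsetGap;
        }
        return postings;
    }

    public static void main(String[] args) {
        System.out.println(index(Arrays.asList("a b", "c"), 10, 1));
        // prints [a@0[0-1], b@1[2-3], c@12[4-5]]
    }
}
```

With both gaps set to 0 in this model, the token "c" would sit at position 2 with offsets 3-4, directly adjacent to "b". A larger position increment gap is what keeps phrase matches from spanning two separate values.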
Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.