public final class MockAnalyzer extends Analyzer
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers for unit tests. If you are testing a custom component such as a queryparser or analyzer-wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead. MockAnalyzer has the following behavior:
By default, the assertions in MockTokenizer are turned on for extra
checks that the consumer is consuming properly. These checks can be disabled
with setEnableChecks(boolean).
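To make "consuming properly" concrete, here is a minimal sketch of the kind of workflow check described above. It assumes the usual TokenStream consumer sequence: reset(), then incrementToken() until it returns false, then end(), then close(). The classes CheckedStream and WorkflowCheckDemo are hypothetical stand-ins written for illustration. They are not Lucene classes and do not reproduce MockTokenizer's actual assertions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a checked token stream; not a Lucene class.
final class CheckedStream {
    private enum State { CREATED, RESET, EXHAUSTED, ENDED, CLOSED }

    private final Iterator<String> tokens;
    private State state = State.CREATED;
    private String current;

    CheckedStream(List<String> tokens) {
        this.tokens = tokens.iterator();
    }

    void reset() {
        require(state == State.CREATED, "reset()");
        state = State.RESET;
    }

    boolean incrementToken() {
        // Only legal after reset() and before the stream has reported exhaustion.
        require(state == State.RESET, "incrementToken()");
        if (tokens.hasNext()) {
            current = tokens.next();
            return true;
        }
        state = State.EXHAUSTED;
        return false;
    }

    String term() {
        return current;
    }

    void end() {
        // Assumption of this model: end() is only legal once incrementToken() returned false.
        require(state == State.EXHAUSTED, "end()");
        state = State.ENDED;
    }

    void close() {
        require(state == State.ENDED, "close()");
        state = State.CLOSED;
    }

    private void require(boolean ok, String call) {
        if (!ok) {
            throw new IllegalStateException(call + " called in state " + state);
        }
    }
}

final class WorkflowCheckDemo {
    // A well-behaved consumer: reset, drain, end, close.
    static List<String> consume(CheckedStream stream) {
        List<String> out = new ArrayList<>();
        stream.reset();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        stream.end();
        stream.close();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(consume(new CheckedStream(Arrays.asList("foo", "bar"))));
        try {
            // A misbehaving consumer that forgets reset().
            new CheckedStream(Arrays.asList("foo")).incrementToken();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

A consumer that skips a step fails fast in this model, which is the property that makes such checks useful in unit tests.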
See Also: MockTokenizer

Nested classes inherited from class Analyzer: Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents

Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY

| Constructor and Description |
|---|
| MockAnalyzer(Random random): Create a Whitespace-lowercasing analyzer with no stopwords removal. |
| MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase) |
| MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter): Creates a new MockAnalyzer. |

| Modifier and Type | Method and Description |
|---|---|
| Analyzer.TokenStreamComponents | createComponents(String fieldName) |
| int | getOffsetGap(String fieldName): Get the offset gap between tokens in fields if several fields with the same name were added. |
| int | getPositionIncrementGap(String fieldName) |
| void | setEnableChecks(boolean enableChecks): Toggle consumer workflow checking. If your test consumes token streams normally, you should leave this enabled. |
| void | setMaxTokenLength(int length): Set maxTokenLength for MockTokenizer. |
| void | setOffsetGap(int offsetGap): Set a new offset gap, which is then added to the offset when several fields with the same name are indexed. |
| void | setPositionIncrementGap(int positionIncrementGap) |

Methods inherited from class Analyzer: close, getReuseStrategy, getVersion, initReader, setVersion, tokenStream, tokenStream

public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
random - Random for payloads behavior
runAutomaton - DFA describing how tokenization should happen (e.g. [a-zA-Z]+)
lowerCase - true if the tokenizer should lowercase terms
filter - DFA describing how terms should be filtered (set of stopwords, etc)

public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)

public MockAnalyzer(Random random)
Calls MockAnalyzer(random, MockTokenizer.WHITESPACE, true, MockTokenFilter.EMPTY_STOPSET).
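
The roles of the runAutomaton, lowerCase, and filter parameters can be sketched with plain Java. This is a simplified model, not Lucene code: it uses java.util.regex.Pattern as a stand-in for the tokenizing CharacterRunAutomaton and a Set of strings as a stand-in for the filtering one. The class MockAnalysisModel and its analyze method are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of the three analysis knobs; not Lucene code.
final class MockAnalysisModel {
    static List<String> analyze(String text, Pattern runPattern, boolean lowerCase, Set<String> stopwords) {
        List<String> terms = new ArrayList<>();
        Matcher m = runPattern.matcher(text);
        // Each maximal run accepted by the pattern becomes one candidate term.
        while (m.find()) {
            String term = m.group();
            if (lowerCase) {
                term = term.toLowerCase(Locale.ROOT);
            }
            // Assumption of this model: filtering sees the already-lowercased term.
            if (!stopwords.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Pattern letters = Pattern.compile("[a-zA-Z]+");
        Set<String> stop = new HashSet<>(Arrays.asList("the"));
        System.out.println(analyze("The Quick-brown fox 42", letters, true, stop));
    }
}
```

With the [a-zA-Z]+ pattern, lowercasing enabled, and "the" as the only stopword, the sample input yields the terms quick, brown, and fox.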

public Analyzer.TokenStreamComponents createComponents(String fieldName)
createComponents in class Analyzer

public void setPositionIncrementGap(int positionIncrementGap)

public int getPositionIncrementGap(String fieldName)
getPositionIncrementGap in class Analyzer

public void setOffsetGap(int offsetGap)
offsetGap - The offset gap that should be used.

public int getOffsetGap(String fieldName)
getOffsetGap in class Analyzer
fieldName - Currently not used, the same offset gap is returned for each field.

public void setEnableChecks(boolean enableChecks)

public void setMaxTokenLength(int length)
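
As an illustration of what an offset gap does when several values are indexed under the same field name, the following sketch computes the base offset at which each value starts. It is a simplified arithmetic model under a stated assumption: each value advances the running offset by its character length plus the configured gap. The class OffsetGapModel and the method baseOffsets are hypothetical names and do not correspond to Lucene's indexing code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of offset accumulation across same-named field values; not Lucene code.
final class OffsetGapModel {
    /** Returns the base offset at which each value's characters start. */
    static List<Integer> baseOffsets(List<String> values, int offsetGap) {
        List<Integer> bases = new ArrayList<>();
        int offset = 0;
        for (String value : values) {
            bases.add(offset);
            // Assumption: each value advances the running offset by its length plus the gap.
            offset += value.length() + offsetGap;
        }
        return bases;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("hello world", "foo bar");
        System.out.println(baseOffsets(values, 1));  // [0, 12]
        System.out.println(baseOffsets(values, 10)); // [0, 21]
    }
}
```

A larger gap pushes later values further apart in offset space, which is the effect a test can vary through setOffsetGap(int).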
Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.