public final class ConcatenateGraphFilter extends TokenStream
Modifier and Type | Class and Description |
---|---|
static interface | ConcatenateGraphFilter.BytesRefBuilderTermAttribute - Attribute providing access to the term builder and UTF-16 conversion |
static class | ConcatenateGraphFilter.BytesRefBuilderTermAttributeImpl - Implementation of ConcatenateGraphFilter.BytesRefBuilderTermAttribute |
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_GRAPH_EXPANSIONS |
static boolean | DEFAULT_PRESERVE_POSITION_INCREMENTS |
static boolean | DEFAULT_PRESERVE_SEP |
static Character | DEFAULT_TOKEN_SEPARATOR |
static int | SEP_LABEL - Represents the default separator between tokens. |
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
ConcatenateGraphFilter(TokenStream inputTokenStream) - Creates a token stream to convert input to a token stream of accepted strings by its token stream graph. |
ConcatenateGraphFilter(TokenStream inputTokenStream, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions) |
ConcatenateGraphFilter(TokenStream inputTokenStream, Character tokenSeparator, boolean preservePositionIncrements, int maxGraphExpansions) - Creates a token stream to convert input to a token stream of accepted strings by its token stream graph. |
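The phrase "accepted strings by its token stream graph" can be illustrated with a small plain-Java model. This is a sketch, not Lucene code: it assumes a simplified graph in which every position offers one or more alternative tokens (for example synonyms) and every token spans exactly one position. The class name `GraphPathsSketch`, the method `acceptedStrings`, and the exception `TooManyPathsException` are hypothetical names invented for this illustration; the exception only stands in for the role that TooComplexToDeterminizeException plays when the number of paths exceeds maxGraphExpansions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a token graph (not Lucene code): one list of alternatives per position. */
public class GraphPathsSketch {

    /** Hypothetical stand-in for the exception thrown when the graph has too many paths. */
    public static class TooManyPathsException extends RuntimeException {
        public TooManyPathsException(String message) {
            super(message);
        }
    }

    /**
     * Returns one concatenated string per path through the simplified graph,
     * joining the tokens of a path with sep. Fails as soon as the number of
     * paths built so far exceeds maxPaths.
     */
    public static List<String> acceptedStrings(List<List<String>> positions, char sep, int maxPaths) {
        List<String> paths = new ArrayList<>();
        paths.add("");
        boolean firstPosition = true;
        for (List<String> alternatives : positions) {
            List<String> extended = new ArrayList<>();
            for (String prefix : paths) {
                for (String token : alternatives) {
                    // The first token starts a path; later tokens are appended after a separator.
                    extended.add(firstPosition ? token : prefix + sep + token);
                    if (extended.size() > maxPaths) {
                        throw new TooManyPathsException("more than " + maxPaths + " paths");
                    }
                }
            }
            paths = extended;
            firstPosition = false;
        }
        return paths;
    }

    public static void main(String[] args) {
        // Position 0 has two alternatives, position 1 has one token.
        List<List<String>> graph = Arrays.asList(
                Arrays.asList("wifi", "wireless"),
                Arrays.asList("network"));
        System.out.println(acceptedStrings(graph, '_', 10)); // [wifi_network, wireless_network]
    }
}
```

The visible `_` separator is chosen only for readability. The path limit matters because the number of paths is the product of the alternatives at each position, so it can grow very quickly.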
Modifier and Type | Method and Description |
---|---|
void |
close() |
void |
end() |
boolean |
incrementToken() |
void |
reset() |
Automaton |
toAutomaton()
Converts the tokenStream to an automaton, treating the transition labels as utf-8.
|
Automaton |
toAutomaton(boolean unicodeAware)
Converts the tokenStream to an automaton.
|
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public static final int SEP_LABEL
public static final int DEFAULT_MAX_GRAPH_EXPANSIONS
public static final Character DEFAULT_TOKEN_SEPARATOR
public static final boolean DEFAULT_PRESERVE_SEP
public static final boolean DEFAULT_PRESERVE_POSITION_INCREMENTS
public ConcatenateGraphFilter(TokenStream inputTokenStream)
Creates a token stream to convert input to a token stream of accepted strings by its token stream graph.
This constructor uses the default settings of the constants in this class.
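To make the effect of the separator and position-increment settings concrete, here is a minimal plain-Java model of the concatenation for a single path. It is a sketch under stated assumptions, not the filter's implementation: `ConcatSketch`, `Tok`, and `concat` are hypothetical names, position increments are assumed to be at least 1, and each missing position between two tokens is assumed to add one extra separator (following the description "The effect is a consecutive SEP_LABEL" for preservePositionIncrements). Gaps before the first token are not modelled.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model (not Lucene code) of joining tokens with a separator. */
public class ConcatSketch {

    /** A term plus its position increment; 1 means adjacent to the previous token. */
    public static final class Tok {
        final String term;
        final int posInc;

        public Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /**
     * Joins the terms with sep. A null sep concatenates without separators.
     * With preservePositionIncrements, a gap of n positions yields n separators
     * (assumption of this sketch); otherwise tokens are treated as adjacent.
     */
    public static String concat(List<Tok> tokens, Character sep, boolean preservePositionIncrements) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        for (Tok t : tokens) {
            if (!first && sep != null) {
                int separators = preservePositionIncrements ? t.posInc : 1;
                for (int i = 0; i < separators; i++) {
                    out.append(sep.charValue());
                }
            }
            out.append(t.term);
            first = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "flight to paris" after a stop filter removed "to": "paris" has increment 2.
        List<Tok> tokens = Arrays.asList(new Tok("flight", 1), new Tok("paris", 2));
        System.out.println(concat(tokens, '_', true));  // flight__paris
        System.out.println(concat(tokens, '_', false)); // flight_paris
        System.out.println(concat(tokens, null, true)); // flightparis
    }
}
```

The `_` character is used here only so the output is readable; the class's own default separator is the one given by DEFAULT_TOKEN_SEPARATOR.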
public ConcatenateGraphFilter(TokenStream inputTokenStream, Character tokenSeparator, boolean preservePositionIncrements, int maxGraphExpansions)
Creates a token stream to convert input to a token stream of accepted strings by its token stream graph.
inputTokenStream - The input/incoming TokenStream
tokenSeparator - Separator to use for concatenation. Can be null, in which case tokens are concatenated without any separator.
preservePositionIncrements - Whether to add an empty token for missing positions. The effect is a consecutive SEP_LABEL. When false, it's as if there were no missing positions (we pretend the surrounding tokens were adjacent).
maxGraphExpansions - If the tokenStream graph has more than this many possible paths through, then we'll throw TooComplexToDeterminizeException to preserve the stability and memory of the machine.
Throws: TooComplexToDeterminizeException - if the tokenStream graph has more than maxGraphExpansions expansions
public ConcatenateGraphFilter(TokenStream inputTokenStream, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions)
See ConcatenateGraphFilter(org.apache.lucene.analysis.TokenStream, java.lang.Character, boolean, int)
preserveSep - Whether SEP_LABEL should separate the input tokens in the concatenated token
public void reset() throws IOException
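Overrides: reset in class TokenStream
Throws: IOException
public boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException
public void end() throws IOException
Overrides: end in class TokenStream
Throws: IOException
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Overrides: close in class TokenStream
Throws: IOException
public Automaton toAutomaton() throws IOException
Converts the tokenStream to an automaton, treating the transition labels as UTF-8.
Throws: IOException
public Automaton toAutomaton(boolean unicodeAware) throws IOException
Converts the tokenStream to an automaton.
Throws: IOException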
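The two toAutomaton variants differ in how transition labels are interpreted. The plain-Java sketch below shows what that difference looks like for a single term. It assumes that unicodeAware = true means one label per Unicode code point, which this page does not state explicitly, while the no-argument variant is documented as using UTF-8. `LabelSketch`, `utf8Labels`, and `codePointLabels` are hypothetical names, and no Lucene classes are involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy illustration (not Lucene code) of byte labels versus code point labels. */
public class LabelSketch {

    /** One label per UTF-8 byte, as an unsigned value in 0..255. */
    public static int[] utf8Labels(String term) {
        byte[] bytes = term.getBytes(StandardCharsets.UTF_8);
        int[] labels = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            labels[i] = bytes[i] & 0xFF;
        }
        return labels;
    }

    /** One label per Unicode code point (assumed meaning of unicodeAware = true). */
    public static int[] codePointLabels(String term) {
        return term.codePoints().toArray();
    }

    public static void main(String[] args) {
        String term = "caf\u00e9"; // "café": the last character needs two bytes in UTF-8
        System.out.println(Arrays.toString(utf8Labels(term)));      // [99, 97, 102, 195, 169]
        System.out.println(Arrays.toString(codePointLabels(term))); // [99, 97, 102, 233]
    }
}
```

For pure ASCII terms both label sequences are identical; they diverge only for characters outside the ASCII range.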
Copyright © 2000-2024 Apache Software Foundation. All Rights Reserved.