Class SimpleQueryParser
The main idea behind this parser is that a person should be able to type whatever they want to represent a query, and this parser will do its best to interpret what to search for no matter how poorly composed the request may be. Tokens are considered to be any of a term, phrase, or subquery for the operations described below. Certain operators, ( ) + | ", and whitespace (' ', '\n', '\r', and '\t') may be used to delimit tokens.
Any errors in query syntax will be ignored and the parser will attempt to decipher what it can; however, this may mean odd or unexpected results.
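
To make the delimiting rule concrete, the following is a minimal standalone sketch, not the parser's actual implementation. The class DelimiterSketch and its split method are hypothetical names introduced here. It only shows where token boundaries fall; it does not group phrases or build queries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of token delimiting; not part of Lucene.
public class DelimiterSketch {
    // Operator characters that end the current token and stand alone.
    private static final String OPERATOR_DELIMITERS = "()+|\"";

    static List<String> split(String query) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // An escaped character is kept as part of the current token.
                current.append(query.charAt(++i));
            } else if (c == ' ' || c == '\n' || c == '\r' || c == '\t') {
                flush(current, out);
            } else if (OPERATOR_DELIMITERS.indexOf(c) >= 0) {
                flush(current, out);
                out.add(String.valueOf(c));
            } else {
                current.append(c);
            }
        }
        flush(current, out);
        return out;
    }

    // Emits the pending token, if any, and resets the buffer.
    private static void flush(StringBuilder current, List<String> out) {
        if (current.length() > 0) {
            out.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("token1 + (token2 | token3)"));
        System.out.println(split("a\\+b c"));
    }
}
```

In this sketch an escaped '+' stays inside its term, while an unescaped one becomes a token of its own.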
Query Operators
- '+' specifies AND operation: token1+token2
- '|' specifies OR operation: token1|token2
- '-' negates a single token: -token0
- '"' creates phrases of terms: "term1 term2 ..."
- '*' at the end of terms specifies prefix query: term*
- '~N' at the end of terms specifies fuzzy query: term~1
- '~N' at the end of phrases specifies near query: "term1 term2"~5
- '(' and ')' specifies precedence: token1 + (token2 | token3)
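
The operator list above can be read as a set of shape rules for a single token. The sketch below classifies a token string by those shapes. It is an illustration only: TokenKindSketch, Kind, and classify are hypothetical names, escaping is handled only for a trailing '\*', and the real parser may decide differently in edge cases.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the operator shapes; not part of Lucene.
public class TokenKindSketch {
    enum Kind { NEGATED, NEAR_PHRASE, PHRASE, PREFIX, FUZZY, TERM }

    // "..."~N : a quoted phrase followed by '~' and digits.
    private static final Pattern NEAR = Pattern.compile("\"[^\"]*\"~\\d+");
    // "..." : a quoted phrase.
    private static final Pattern PHRASE = Pattern.compile("\"[^\"]*\"");
    // term~N : an unquoted term followed by '~' and digits.
    private static final Pattern FUZZY = Pattern.compile("[^\"~]+~\\d+");

    static Kind classify(String token) {
        if (token.length() > 1 && token.startsWith("-")) return Kind.NEGATED;
        if (NEAR.matcher(token).matches()) return Kind.NEAR_PHRASE;
        if (PHRASE.matcher(token).matches()) return Kind.PHRASE;
        if (FUZZY.matcher(token).matches()) return Kind.FUZZY;
        // A trailing '*' marks a prefix query unless it is escaped.
        if (token.length() > 1 && token.endsWith("*") && !token.endsWith("\\*")) {
            return Kind.PREFIX;
        }
        return Kind.TERM;
    }

    public static void main(String[] args) {
        String[] samples = {
            "-token0", "\"term1 term2\"", "term*", "term~1", "\"term1 term2\"~5", "term1"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```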
The default operator is OR if no other operator is specified. For example, the following will OR token1 and token2 together:
token1 token2
Normal operator precedence will be simple order from right to left. For example, the following will evaluate token1 OR token2 first, then AND with token3:
token1 | token2 + token3
Escaping
An individual term may contain any possible character, with certain characters requiring escaping using a '\'. The following characters will need to be escaped in terms and phrases: + | " ( ) ' \
The '-' operator is a special case. On individual terms (not phrases) the first character of a term that is '-' must be escaped; however, any '-' characters beyond the first character do not need to be escaped. For example:
-term1 -- Specifies NOT operation against term1
\-term1 -- Searches for the term -term1.
term-1 -- Searches for the term term-1.
term\-1 -- Searches for the term term-1.
The '*' operator is a special case. On individual terms (not phrases) the last character of a term that is '*' must be escaped; however, any '*' characters before the last character do not need to be escaped:
term1* -- Searches for the prefix term1
term1\* -- Searches for the term term1*
term*1 -- Searches for the term term*1
term\*1 -- Searches for the term term*1
Note that the above examples consider the terms before text processing.
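
The escaping rules above can be sketched as a small helper that prepares a raw term for use in a query string. This is an assumption-laden illustration rather than anything the class provides: TermEscapeSketch and escapeTerm are hypothetical names, and the helper covers only the rules stated in this section (it does not handle whitespace inside a term or a trailing '~N').

```java
// Hypothetical helper illustrating the documented escaping rules; not part of Lucene.
public class TermEscapeSketch {
    // Characters listed above as always needing a backslash: + | " ( ) ' \
    private static final String ALWAYS_ESCAPED = "+|\"()'\\";

    static String escapeTerm(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 4);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            // '-' is special only as the first character of a term.
            boolean leadingDash = c == '-' && i == 0;
            // '*' is special only as the last character of a term.
            boolean trailingStar = c == '*' && i == raw.length() - 1;
            if (ALWAYS_ESCAPED.indexOf(c) >= 0 || leadingDash || trailingStar) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeTerm("-term1")); // \-term1
        System.out.println(escapeTerm("term-1")); // term-1
        System.out.println(escapeTerm("term1*")); // term1\*
        System.out.println(escapeTerm("term*1")); // term*1
    }
}
```

The helper escapes only where the rules require it, so interior '-' and '*' characters pass through unchanged.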
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.QueryBuilder
QueryBuilder.TermAndBoost
Field Summary
Modifier and Type, Field: Description
- static final int AND_OPERATOR: Enables AND operator (+)
- static final int ESCAPE_OPERATOR: Enables ESCAPE operator (\)
- protected final int flags: flags to the parser (to turn features on/off)
- static final int FUZZY_OPERATOR: Enables FUZZY operators: (~) on single terms
- static final int NEAR_OPERATOR: Enables NEAR operators: (~) on phrases
- static final int NOT_OPERATOR: Enables NOT operator (-)
- static final int OR_OPERATOR: Enables OR operator (|)
- static final int PHRASE_OPERATOR: Enables PHRASE operator (")
- static final int PRECEDENCE_OPERATORS: Enables PRECEDENCE operators: ( and )
- static final int PREFIX_OPERATOR: Enables PREFIX operator (*)
- protected final Map<String, Float> weights: Map of fields to query against with their weights
- static final int WHITESPACE_OPERATOR: Enables WHITESPACE operators: ' ' '\n' '\r' '\t'
Fields inherited from class org.apache.lucene.util.QueryBuilder
analyzer, autoGenerateMultiTermSynonymsPhraseQuery, enableGraphQueries, enablePositionIncrements
Constructor Summary
Constructor: Description
- SimpleQueryParser(Analyzer analyzer, String field): Creates a new parser searching over a single field.
- SimpleQueryParser(Analyzer analyzer, Map<String, Float> weights): Creates a new parser searching over multiple fields with different weights.
- SimpleQueryParser(Analyzer analyzer, Map<String, Float> weights, int flags): Creates a new parser with custom flags used to enable/disable certain features.
Method Summary
Modifier and Type, Method: Description
- BooleanClause.Occur getDefaultOperator(): Returns the implicit operator setting, which will be either SHOULD or MUST.
- protected Query newDefaultQuery(String text): Factory method to generate a standard query (no phrase or prefix operators).
- protected Query newFuzzyQuery(String text, int fuzziness): Factory method to generate a fuzzy query.
- protected Query newPhraseQuery(String text, int slop): Factory method to generate a phrase query with slop.
- protected Query newPrefixQuery(String text): Factory method to generate a prefix query.
- Query parse(String queryText): Parses the query text and returns the parsed query.
- void setDefaultOperator(BooleanClause.Occur operator): Sets the implicit operator setting, which must be either SHOULD or MUST.
- protected Query simplify(BooleanQuery bq): Helper to simplify boolean queries with 0 or 1 clause.
Methods inherited from class org.apache.lucene.util.QueryBuilder
add, analyzeBoolean, analyzeGraphBoolean, analyzeGraphPhrase, analyzeMultiBoolean, analyzeMultiPhrase, analyzePhrase, analyzeTerm, createBooleanQuery, createBooleanQuery, createFieldQuery, createFieldQuery, createMinShouldMatchQuery, createPhraseQuery, createPhraseQuery, getAnalyzer, getAutoGenerateMultiTermSynonymsPhraseQuery, getEnableGraphQueries, getEnablePositionIncrements, newBooleanQuery, newGraphSynonymQuery, newMultiPhraseQueryBuilder, newSynonymQuery, newTermQuery, setAnalyzer, setAutoGenerateMultiTermSynonymsPhraseQuery, setEnableGraphQueries, setEnablePositionIncrements
Field Details
- weights
  protected final Map<String, Float> weights
  Map of fields to query against with their weights
- flags
  protected final int flags
  flags to the parser (to turn features on/off)
- AND_OPERATOR
  public static final int AND_OPERATOR
  Enables AND operator (+)
- NOT_OPERATOR
  public static final int NOT_OPERATOR
  Enables NOT operator (-)
- OR_OPERATOR
  public static final int OR_OPERATOR
  Enables OR operator (|)
- PREFIX_OPERATOR
  public static final int PREFIX_OPERATOR
  Enables PREFIX operator (*)
- PHRASE_OPERATOR
  public static final int PHRASE_OPERATOR
  Enables PHRASE operator (")
- PRECEDENCE_OPERATORS
  public static final int PRECEDENCE_OPERATORS
  Enables PRECEDENCE operators: ( and )
- ESCAPE_OPERATOR
  public static final int ESCAPE_OPERATOR
  Enables ESCAPE operator (\)
- WHITESPACE_OPERATOR
  public static final int WHITESPACE_OPERATOR
  Enables WHITESPACE operators: ' ' '\n' '\r' '\t'
- FUZZY_OPERATOR
  public static final int FUZZY_OPERATOR
  Enables FUZZY operators: (~) on single terms
- NEAR_OPERATOR
  public static final int NEAR_OPERATOR
  Enables NEAR operators: (~) on phrases
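Constructor Details
- SimpleQueryParser
  public SimpleQueryParser(Analyzer analyzer, String field)
  Creates a new parser searching over a single field.
- SimpleQueryParser
  public SimpleQueryParser(Analyzer analyzer, Map<String, Float> weights)
  Creates a new parser searching over multiple fields with different weights.
- SimpleQueryParser
  public SimpleQueryParser(Analyzer analyzer, Map<String, Float> weights, int flags)
  Creates a new parser with custom flags used to enable/disable certain features.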
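
As a rough sketch of the two arguments the multi-field constructors take, the standalone example below builds a field-to-weight map and a flags value. It assumes the feature constants are distinct bits that are combined with bitwise OR. The constants are stand-ins redefined locally so the example runs without Lucene on the classpath, and the field names "title" and "body" are hypothetical; real code would reference SimpleQueryParser.AND_OPERATOR and the other constants directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of building constructor arguments; not part of Lucene.
public class ParserConfigSketch {
    // Stand-in bit values for illustration only. Real code would use the
    // constants declared on SimpleQueryParser instead of redefining them.
    static final int AND_OPERATOR = 1 << 0;
    static final int NOT_OPERATOR = 1 << 1;
    static final int OR_OPERATOR = 1 << 2;
    static final int PHRASE_OPERATOR = 1 << 3;

    // Field name -> weight, for the "weights" constructor argument.
    static Map<String, Float> buildWeights() {
        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("title", 2.0f); // hypothetical field, weighted higher
        weights.put("body", 1.0f);  // hypothetical field
        return weights;
    }

    // Enable only AND, OR, and phrase handling.
    static int buildFlags() {
        return AND_OPERATOR | OR_OPERATOR | PHRASE_OPERATOR;
    }

    // True when the given feature bit is set in flags.
    static boolean isEnabled(int flags, int feature) {
        return (flags & feature) != 0;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = buildWeights();
        int flags = buildFlags();
        System.out.println("weights = " + weights);
        System.out.println("NOT enabled? " + isEnabled(flags, NOT_OPERATOR));
        // With Lucene available, these values would be passed as:
        // new SimpleQueryParser(analyzer, weights, flags)
    }
}
```

Leaving a bit out of the flags value is how a feature is switched off in this sketch, so NOT is disabled here.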
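Method Details
- parse
  public Query parse(String queryText)
  Parses the query text and returns the parsed query.
- newDefaultQuery
  protected Query newDefaultQuery(String text)
  Factory method to generate a standard query (no phrase or prefix operators).
- newFuzzyQuery
  protected Query newFuzzyQuery(String text, int fuzziness)
  Factory method to generate a fuzzy query.
- newPhraseQuery
  protected Query newPhraseQuery(String text, int slop)
  Factory method to generate a phrase query with slop.
- newPrefixQuery
  protected Query newPrefixQuery(String text)
  Factory method to generate a prefix query.
- simplify
  protected Query simplify(BooleanQuery bq)
  Helper to simplify boolean queries with 0 or 1 clause.
- getDefaultOperator
  public BooleanClause.Occur getDefaultOperator()
  Returns the implicit operator setting, which will be either SHOULD or MUST.
- setDefaultOperator
  public void setDefaultOperator(BooleanClause.Occur operator)
  Sets the implicit operator setting, which must be either SHOULD or MUST.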