public class MultipassTermFilteredPresearcher extends TermFilteredPresearcher
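Each query is indexed once per pass, and each pass collects its terms from a different route through the query tree and indexes them into a separate, suffixed field. Incoming documents are then converted to a set of disjunction queries, one over each suffixed field, and these queries are combined into a conjunction query, so that the document's set of terms must match a term from each route.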
This allows filtering out of documents that contain one half of a two-term phrase query, for example. The query "hello world" will be indexed twice, once under 'hello' and once under 'world'. A document containing the terms "hello there" would match the first field, but not the second, and so would not be selected for matching.
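The selection rule described above can be illustrated with a small, self-contained model. This is only a sketch and not the Lucene implementation: the class `MultipassSelectionSketch` and its methods are hypothetical, and plain Java sets stand in for the suffixed index fields. It shows a disjunction within each pass's field and a conjunction across fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of multipass selection (hypothetical; not the Lucene code). */
class MultipassSelectionSketch {

    // queryId -> one term set per pass; each set stands in for a suffixed field.
    private final Map<String, List<Set<String>>> index = new LinkedHashMap<>();

    /** Index a query with one set of terms per pass (route). */
    void indexQuery(String queryId, List<Set<String>> termsPerPass) {
        index.put(queryId, termsPerPass);
    }

    /**
     * Select a query only if the document shares at least one term with every
     * pass's term set: OR within a field, AND across fields.
     */
    List<String> select(Set<String> documentTerms) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, List<Set<String>>> entry : index.entrySet()) {
            boolean allPassesMatch = true;
            for (Set<String> passTerms : entry.getValue()) {
                if (Collections.disjoint(passTerms, documentTerms)) {
                    allPassesMatch = false; // this pass's field has no matching term
                    break;
                }
            }
            if (allPassesMatch) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        MultipassSelectionSketch sketch = new MultipassSelectionSketch();
        // The phrase query "hello world": 'hello' in the first pass, 'world' in the second.
        sketch.indexQuery("q1", List.of(Set.of("hello"), Set.of("world")));

        System.out.println(sketch.select(Set.of("hello", "there"))); // prints []
        System.out.println(sketch.select(Set.of("hello", "world"))); // prints [q1]
    }
}
```

With a single pass, the document "hello there" would have been selected through 'hello' alone and then rejected only at the matching stage; the second pass removes it before matching.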
The number of passes the presearcher makes is configurable. More passes will improve the selected/matched ratio, but will take longer to index and will use more RAM.
A minimum weight can be set for terms to be chosen for the second and subsequent passes. This allows users to avoid indexing stopwords, for example.
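The effect of the minimum weight can be sketched in the same toy style. The class `MinWeightSketch`, the flat term-to-weight map, and the example weights are all assumptions made for illustration; the real presearcher advances over a query tree rather than a ranked list. The sketch also assumes that a pass which cannot advance to a sufficiently weighted term reuses the term from the previous pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of minimum-weight term choice (hypothetical; not the Lucene code). */
class MinWeightSketch {

    /**
     * Choose one term per pass from the candidate terms of a conjunction.
     * The first pass takes the highest-weighted term. Each later pass advances
     * to the next-best term only if that term's weight is above minWeight;
     * otherwise it reuses the previously chosen term (an assumption of this sketch).
     */
    static List<String> termsPerPass(Map<String, Float> weights, int passes, float minWeight) {
        if (weights.isEmpty() || passes < 1) {
            throw new IllegalArgumentException("need at least one term and one pass");
        }
        List<String> ranked = new ArrayList<>(weights.keySet());
        // Sort by descending weight; break ties alphabetically so results are deterministic.
        ranked.sort((a, b) -> {
            int byWeight = Float.compare(weights.get(b), weights.get(a));
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });

        List<String> chosen = new ArrayList<>();
        int position = 0;
        chosen.add(ranked.get(position));
        for (int pass = 1; pass < passes; pass++) {
            int next = position + 1;
            if (next < ranked.size() && weights.get(ranked.get(next)) > minWeight) {
                position = next; // advance only over terms heavier than minWeight
            }
            chosen.add(ranked.get(position));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Made-up weights: 'the' plays the role of a stopword.
        Map<String, Float> weights = Map.of("hello", 5f, "world", 4f, "the", 0.1f);

        System.out.println(termsPerPass(weights, 3, 1f)); // prints [hello, world, world]
        System.out.println(termsPerPass(weights, 3, 0f)); // prints [hello, world, the]
    }
}
```

With a minimum weight of 1, the low-weight term 'the' is never indexed in a later pass; with a minimum weight of zero, it is.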
TermFilteredPresearcher.BytesRefHashIterator, TermFilteredPresearcher.DocumentQueryBuilder
DEFAULT_WEIGHTOR
NO_FILTERING
Constructor and Description |
---|
MultipassTermFilteredPresearcher(int passes) Construct a new MultipassTermFilteredPresearcher using TermFilteredPresearcher.DEFAULT_WEIGHTOR |
MultipassTermFilteredPresearcher(int passes, float minWeight, TermWeightor weightor, List<CustomQueryHandler> queryHandlers, Set<String> filterFields) Construct a new MultipassTermFilteredPresearcher |
Modifier and Type | Method and Description |
---|---|
Document | buildQueryDocument(QueryTree querytree) Builds a Document from the terms extracted from a query |
protected TermFilteredPresearcher.DocumentQueryBuilder | getQueryBuilder() Returns a TermFilteredPresearcher.DocumentQueryBuilder for this presearcher |
buildQuery, collectTerms, indexQuery
public MultipassTermFilteredPresearcher(int passes, float minWeight, TermWeightor weightor, List<CustomQueryHandler> queryHandlers, Set<String> filterFields)
passes - the number of times a query should be indexed
minWeight - the minimum weight a querytree should be advanced over
weightor - the TermWeightor to use
queryHandlers - a list of custom query handlers
filterFields - a set of fields to use as filters
public MultipassTermFilteredPresearcher(int passes)
Construct a new MultipassTermFilteredPresearcher using TermFilteredPresearcher.DEFAULT_WEIGHTOR
Note that this will be constructed with a minimum advance weight of zero
passes - the number of times a query should be indexed
protected TermFilteredPresearcher.DocumentQueryBuilder getQueryBuilder()
Description copied from class: TermFilteredPresearcher
Returns a TermFilteredPresearcher.DocumentQueryBuilder for this presearcher
Overrides: getQueryBuilder in class TermFilteredPresearcher
public Document buildQueryDocument(QueryTree querytree)
Description copied from class: TermFilteredPresearcher
Builds a Document from the terms extracted from a query
Overrides: buildQueryDocument in class TermFilteredPresearcher
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.