Utility for query time joining using
Expert: creates a filter accepting all documents containing the provided term, disregarding deleted documents.
Collects parent document hits for a Query containing one or more BlockJoinQuery clauses, sorted by the specified parent Sort.
This query requires that you index children and parent docs as a single block, using the
How to aggregate multiple child hit scores into a single parent score.
This module supports index-time and query-time joins.
Index-time joining supports joins while searching, where joined
documents are indexed as a single document block using
IndexWriter.addDocuments(java.util.Collection<org.apache.lucene.document.Document>). This is useful for any normalized content (XML documents or database tables). In database terms, all rows for all
joined tables matching a single row of the primary table must be
indexed as a single document block, with the parent document
being last in the group.
When you index in this way, the documents in your index are divided
into parent documents (the last document of each block) and child
documents (all others). You provide a
Filter that identifies the
parent documents, as Lucene does not currently record any information
about doc blocks.
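The block layout described above can be pictured with a small in-memory model. This is only a sketch and does not use Lucene: the class BlockLayoutSketch and its members are hypothetical names, list positions stand in for doc IDs, and a java.util.BitSet stands in for the parent Filter.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Simplified in-memory model of block indexing; this is not Lucene code.
final class BlockLayoutSketch {
    // Flattened "index": the position in this list plays the role of a doc ID.
    final List<String> docs = new ArrayList<>();
    // Bit i is set when docs.get(i) is a parent, i.e. the last doc of its block.
    final BitSet parentBits = new BitSet();

    // Adds one block: all children first, then the parent as the final document.
    void addBlock(List<String> children, String parent) {
        docs.addAll(children);
        parentBits.set(docs.size()); // the parent lands at the next free position
        docs.add(parent);
    }

    // Every document whose bit is not set is a child document.
    boolean isParent(int docId) {
        return parentBits.get(docId);
    }
}
```

Because block boundaries are not recorded anywhere else in this model, the bit set is the only thing that tells parents from children, which mirrors why a parent Filter has to be supplied.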
At search time, use
ToParentBlockJoinQuery to remap/join
matches from any child
Query (i.e., a
query that matches only child documents) up to the parent document space. The
resulting query can then be used as a clause in any query that
matches parent documents.
If you only care about the parent documents matching the query, you
can use any collector to collect the parent hits, but if you'd also
like to see which child documents match for each parent document, use the
ToParentBlockJoinCollector to collect the hits. Once the
search is done, you retrieve a
TopGroups instance from the
ToParentBlockJoinCollector.getTopGroups(org.apache.lucene.search.join.ToParentBlockJoinQuery, org.apache.lucene.search.Sort, int, int, int, boolean) method.
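The child-to-parent join and the per-parent grouping of child hits can be sketched as follows. This is a simplified model, not Lucene's implementation: ToParentJoinSketch and groupByParent are hypothetical names, and it assumes parents are marked in a BitSet with each parent stored last in its block.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of joining child hits up to their parents; not Lucene code.
final class ToParentJoinSketch {
    // Groups matching child doc IDs under their parent doc ID.
    // The parent of a child is the first set bit at or after the child's position.
    static Map<Integer, List<Integer>> groupByParent(BitSet parentBits, int[] childHits) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int child : childHits) {
            if (parentBits.get(child)) {
                continue; // the child query is expected to match child docs only
            }
            int parent = parentBits.nextSetBit(child);
            if (parent < 0) {
                continue; // no parent follows, so the block is incomplete
            }
            groups.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        }
        return groups;
    }
}
```

The keys of the returned map correspond to the parent hits, and each value lists the children that matched for that parent, which is roughly the information a grouped result exposes.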
To map/join in the opposite direction, use
ToChildBlockJoinQuery. This wraps
any query matching parent documents, creating the joined query
matching only child documents.
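The opposite direction can be modelled the same way. The sketch below is an assumption-laden illustration rather than Lucene's code: ToChildJoinSketch and childrenOf are hypothetical names, and parents are again assumed to be marked in a BitSet with each parent stored last in its block.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Simplified model of joining a parent hit down to its children; not Lucene code.
final class ToChildJoinSketch {
    // Returns the doc IDs of the children in the block that ends at the given parent.
    static List<Integer> childrenOf(BitSet parentBits, int parent) {
        List<Integer> children = new ArrayList<>();
        if (!parentBits.get(parent)) {
            return children; // not a parent document, so there is nothing to map
        }
        // The block starts right after the previous parent, or at 0 if there is none
        // (previousSetBit returns -1 when no earlier bit is set).
        int start = parentBits.previousSetBit(parent - 1) + 1;
        for (int doc = start; doc < parent; doc++) {
            children.add(doc);
        }
        return children;
    }
}
```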
Query-time joining is index-term based and implemented as a two-pass search. The first pass collects all the terms from a fromField that match the fromQuery. The second pass returns all documents whose toField terms match the terms collected in the first pass.
Query-time joining takes the following inputs:
fromField: The field to join from.
fromQuery: The query executed to collect the from terms. This is usually the user-specified query.
toField: The field to join to.
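The two passes can be illustrated with plain collections. This is a minimal sketch under stated assumptions, not the module's implementation: documents are modelled as Map<String, String> with one value per field, the fromQuery is modelled as a Predicate, and TwoPassJoinSketch and join are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Simplified in-memory model of a term-based two-pass join; not Lucene code.
final class TwoPassJoinSketch {
    static List<Map<String, String>> join(
            String fromField,
            String toField,
            Predicate<Map<String, String>> fromQuery,
            List<Map<String, String>> fromDocs,
            List<Map<String, String>> toDocs) {
        // First pass: collect the fromField terms of every doc matching fromQuery.
        Set<String> terms = new HashSet<>();
        for (Map<String, String> doc : fromDocs) {
            String term = doc.get(fromField);
            if (term != null && fromQuery.test(doc)) {
                terms.add(term);
            }
        }
        // Second pass: keep every doc whose toField term was collected above.
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : toDocs) {
            String term = doc.get(toField);
            if (term != null && terms.contains(term)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

Since the second pass depends only on the collected terms, fromDocs and toDocs can be the same collection or two different ones.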
Query-time joining is accessible from one static method. The caller supplies the method
with the inputs described above and an
IndexSearcher from which the from terms are collected. The returned
query can be executed with the same
IndexSearcher, or with another IndexSearcher.
Example usage of
JoinUtil.createJoinQuery(String, String, org.apache.lucene.search.Query, org.apache.lucene.search.IndexSearcher):
    String fromField = "from"; // Name of the from field
    String toField = "to"; // Name of the to field
    // Query executed to collect from values to join to the to values
    Query fromQuery = new TermQuery(new Term("content", searchTerm));

    Query joinQuery = JoinUtil.createJoinQuery(fromField, toField, fromQuery, fromSearcher);
    // Note: toSearcher can be the same as the fromSearcher
    TopDocs topDocs = toSearcher.search(joinQuery, 10);
    // Render topDocs...