Class JoinUtil

java.lang.Object
org.apache.lucene.search.join.JoinUtil

public final class JoinUtil extends Object
Utility for query time joining.
WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Method Details

    • createJoinQuery

      public static Query createJoinQuery(String fromField, boolean multipleValuesPerDocument, String toField, Query fromQuery, IndexSearcher fromSearcher, ScoreMode scoreMode) throws IOException
      Method for query time joining.

      Execute the returned query with an IndexSearcher to retrieve all documents whose terms in the to field equal the terms in the from field of documents matching the specified fromQuery.

      If a single document relates to more than one document, set the multipleValuesPerDocument option to true. When multipleValuesPerDocument is true, only the score from the first encountered join value originating from the 'from' side is mapped into the 'to' side, even if a second join value related to a specific document yields a higher score. This does not apply when ScoreMode.None is used, since no scores are computed at all.

      Memory considerations: During joining, all unique join values are kept in memory. In addition, when scoreMode is not ScoreMode.None, a float value per unique join value is kept in memory for computing scores. When scoreMode is ScoreMode.Avg, an additional integer value is kept in memory per unique join value.
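
      The bookkeeping described above can be modelled in plain Java. The following is only a sketch under stated assumptions, not Lucene's implementation: TermJoinSketch, FromHit, Agg, collect and joinAvg are hypothetical names, documents are reduced to an id plus a single join term, and scores are averaged as ScoreMode.Avg would require. It shows why one float and one int are held per unique join value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the bookkeeping described above; this is not Lucene code. */
final class TermJoinSketch {

    /** Hypothetical stand-in for a "from" document matched by fromQuery. */
    static final class FromHit {
        final String joinValue; // term in the from field
        final float score;      // score assigned by fromQuery

        FromHit(String joinValue, float score) {
            this.joinValue = joinValue;
            this.score = score;
        }
    }

    /** State held per unique join value: one float, plus one int when averaging. */
    static final class Agg {
        float sum;
        int count;
    }

    /** Phase 1: collect every unique join value from the matching "from" hits. */
    static Map<String, Agg> collect(List<FromHit> fromHits) {
        Map<String, Agg> byValue = new HashMap<>();
        for (FromHit hit : fromHits) {
            Agg agg = byValue.computeIfAbsent(hit.joinValue, k -> new Agg());
            agg.sum += hit.score;
            agg.count++;
        }
        return byValue;
    }

    /**
     * Phase 2: a "to" document matches when the term in its to field was collected.
     * toDocs maps a "to" document id to that term; the result maps each matching id
     * to the average score of the "from" hits that share its join value.
     */
    static Map<Integer, Float> joinAvg(List<FromHit> fromHits, Map<Integer, String> toDocs) {
        Map<String, Agg> byValue = collect(fromHits);
        Map<Integer, Float> scores = new HashMap<>();
        for (Map.Entry<Integer, String> doc : toDocs.entrySet()) {
            Agg agg = byValue.get(doc.getValue());
            if (agg != null) {
                scores.put(doc.getKey(), agg.sum / agg.count);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        List<FromHit> hits = new ArrayList<>();
        hits.add(new FromHit("p1", 1.0f));
        hits.add(new FromHit("p1", 3.0f));
        hits.add(new FromHit("p2", 2.0f));

        Map<Integer, String> toDocs = new HashMap<>();
        toDocs.put(10, "p1");
        toDocs.put(11, "p3"); // no "from" hit carries p3, so this document is not joined
        toDocs.put(12, "p2");

        System.out.println(joinAvg(hits, toDocs));
    }
}
```

      In this model, memory grows with the number of unique join values rather than with the number of matching documents, which is the cost the paragraph above warns about.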

      Parameters:
      fromField - The from field to join from
      multipleValuesPerDocument - Whether the from field has multiple terms per document
      toField - The to field to join to
      fromQuery - The query to match documents on the from side
      fromSearcher - The searcher that executed the specified fromQuery
      scoreMode - Instructs how scores from the fromQuery are mapped to the returned query
      Returns:
      a Query instance that can be used to join documents based on the terms in the from and to field
      Throws:
      IOException - If I/O related errors occur
    • createJoinQuery

      public static Query createJoinQuery(String fromField, boolean multipleValuesPerDocument, String toField, Class<? extends Number> numericType, Query fromQuery, IndexSearcher fromSearcher, ScoreMode scoreMode) throws IOException
      Method for query time joining for numeric fields. It supports multi- and single-valued ints, longs, floats and doubles. All considerations from createJoinQuery(String, boolean, String, Query, IndexSearcher, ScoreMode) are applicable here too, though memory consumption might be higher.
      Parameters:
      fromField - The from field to join from
      multipleValuesPerDocument - Whether the from field has multiple terms per document. When true, fromField might be DocValuesType.SORTED_NUMERIC; otherwise fromField should be DocValuesType.NUMERIC
      toField - The to field to join to, should be IntPoint, LongPoint, FloatPoint or DoublePoint.
      numericType - Either Integer, Long, Float or Double; it should correspond to the toField type
      fromQuery - The query to match documents on the from side
      fromSearcher - The searcher that executed the specified fromQuery
      scoreMode - Instructs how scores from the fromQuery are mapped to the returned query
      Returns:
      a Query instance that can be used to join documents based on the terms in the from and to field
      Throws:
      IOException - If I/O related errors occur
    • createJoinQuery

      public static Query createJoinQuery(String joinField, Query fromQuery, Query toQuery, IndexSearcher searcher, ScoreMode scoreMode, OrdinalMap ordinalMap) throws IOException
      Parameters:
      joinField - The SortedDocValues field containing the join values
      fromQuery - The query containing the actual user query. The fromQuery can only match "from" documents.
      toQuery - The query identifying all documents on the "to" side.
      searcher - The index searcher used to execute the from query
      scoreMode - Instructs how scores from the fromQuery are mapped to the returned query
      ordinalMap - The ordinal map constructed over the joinField. In case of a single segment index, no ordinal map needs to be provided.
      Returns:
      a Query instance that can be used to join documents based on the join field
      Throws:
      IOException - If I/O related errors occur
    • createJoinQuery

      public static Query createJoinQuery(String joinField, Query fromQuery, Query toQuery, IndexSearcher searcher, ScoreMode scoreMode, OrdinalMap ordinalMap, int min, int max) throws IOException
      A query time join using global ordinals over a dedicated join field.

      This join has certain restrictions and requirements:
      1) A document can only refer to one other document, but can be referred to by one or more documents.
      2) Documents on each side of the join must be distinguishable. Typically this is done by adding an extra field that identifies the "from" and "to" side; the fromQuery and toQuery must then take this field into account.
      3) There must be a single sorted doc values join field used by both the "from" and "to" documents. This join field should store the join values as UTF-8 strings.
      4) An ordinal map must be provided that is created on top of the join field.

      Note: min and max filtering and the avg score mode will require this join to keep track of the number of times a document matches per join value. This will increase the per join cost in terms of execution time and memory.
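
      The counting that min and max filtering requires can be illustrated in plain Java. This is a sketch under stated assumptions rather than Lucene's implementation: MinMaxJoinSketch and join are hypothetical names, join values are plain strings instead of global ordinals, and a "to" document with no matching "from" document is assumed never to match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of min/max filtering; this is not Lucene code. */
final class MinMaxJoinSketch {

    /**
     * @param fromJoinValues join value of each "from" document matched by fromQuery
     *                       (one value per document, as restriction 1 requires)
     * @param toDocs         "to" document id mapped to its join value
     * @return ids of "to" documents whose match count lies in [min, max], ascending
     */
    static List<Integer> join(List<String> fromJoinValues, Map<Integer, String> toDocs,
                              int min, int max) {
        // The extra bookkeeping the note refers to: one counter per join value.
        Map<String, Integer> counts = new HashMap<>();
        for (String value : fromJoinValues) {
            counts.merge(value, 1, Integer::sum);
        }

        List<Integer> matches = new ArrayList<>();
        // TreeMap copy gives a deterministic, ascending iteration order over ids.
        for (Map.Entry<Integer, String> doc : new TreeMap<>(toDocs).entrySet()) {
            int count = counts.getOrDefault(doc.getValue(), 0);
            // Assumption of this sketch: with no matching "from" document there is
            // nothing to join, so such a "to" document never matches, even for min = 0.
            if (count > 0 && count >= min && count <= max) {
                matches.add(doc.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> from = Arrays.asList("a", "a", "a", "b", "c", "c");
        Map<Integer, String> to = new HashMap<>();
        to.put(1, "a");
        to.put(2, "b");
        to.put(3, "c");
        to.put(4, "d");

        System.out.println(join(from, to, 2, 2));                 // exactly two matches
        System.out.println(join(from, to, 0, Integer.MAX_VALUE)); // filtering disabled
    }
}
```

      Both bounds are inclusive in the sketch, mirroring the min and max parameter descriptions.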

      Parameters:
      joinField - The SortedDocValues field containing the join values
      fromQuery - The query containing the actual user query. The fromQuery can only match "from" documents.
      toQuery - The query identifying all documents on the "to" side.
      searcher - The index searcher used to execute the from query
      scoreMode - Instructs how scores from the fromQuery are mapped to the returned query
      ordinalMap - The ordinal map constructed over the joinField. In case of a single segment index, no ordinal map needs to be provided.
      min - Optionally the minimum number of "from" documents that are required to match for a "to" document to be a match. The min is inclusive. Setting min to 0 and max to Integer.MAX_VALUE disables the min and max "from" documents filtering
      max - Optionally the maximum number of "from" documents that are allowed to match for a "to" document to be a match. The max is inclusive. Setting min to 0 and max to Integer.MAX_VALUE disables the min and max "from" documents filtering
      Returns:
      a Query instance that can be used to join documents based on the join field
      Throws:
      IOException - If I/O related errors occur