Class WeightedSpanTermExtractor

    • Constructor Detail

      • WeightedSpanTermExtractor

        public WeightedSpanTermExtractor()
      • WeightedSpanTermExtractor

        public WeightedSpanTermExtractor(String defaultField)
    • Method Detail

      • extract

        protected void extract(Query query,
                               float boost,
                               Map<String, WeightedSpanTerm> terms)
                        throws IOException
        Fills a Map with WeightedSpanTerms using the terms from the supplied Query.
        Parameters:
        query - Query to extract Terms from
        terms - Map to place created WeightedSpanTerms in
        Throws:
        IOException - If there is a low-level I/O error
      • isQueryUnsupported

        protected boolean isQueryUnsupported(Class<? extends Query> clazz)
      • extractWeightedSpanTerms

        protected void extractWeightedSpanTerms(Map<String, WeightedSpanTerm> terms,
                                                SpanQuery spanQuery,
                                                float boost)
                                         throws IOException
        Fills a Map with WeightedSpanTerms using the terms from the supplied SpanQuery.
        Parameters:
        terms - Map to place created WeightedSpanTerms in
        spanQuery - SpanQuery to extract Terms from
        Throws:
        IOException - If there is a low-level I/O error
      • extractWeightedTerms

        protected void extractWeightedTerms(Map<String, WeightedSpanTerm> terms,
                                            Query query,
                                            float boost)
                                     throws IOException
        Fills a Map with WeightedSpanTerms using the terms from the supplied Query.
        Parameters:
        terms - Map to place created WeightedSpanTerms in
        query - Query to extract Terms from
        Throws:
        IOException - If there is a low-level I/O error
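The description above (fill a map with weighted terms taken from a query) can be illustrated with a small stand-alone sketch. It does not use Lucene: `FieldTerm` and `WeightedTerm` are hypothetical stand-ins for a query term and for WeightedSpanTerm, and the assumptions that the map is keyed by term text and that each entry carries the supplied boost as its weight are not stated on this page.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class WeightedTermFillSketch {

    /** Hypothetical stand-in for a query term: a field name plus the term text. */
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    /** Hypothetical stand-in for WeightedSpanTerm: the term text plus a weight. */
    static final class WeightedTerm {
        final String text;
        final float weight;

        WeightedTerm(String text, float weight) {
            this.text = text;
            this.weight = weight;
        }
    }

    /**
     * Adds one entry per accepted term to the supplied map.
     * Assumption: entries are keyed by term text and weighted with the boost.
     */
    static void fill(Map<String, WeightedTerm> terms,
                     List<FieldTerm> queryTerms,
                     float boost,
                     Predicate<String> fieldAccepted) {
        for (FieldTerm term : queryTerms) {
            if (fieldAccepted.test(term.field)) {
                terms.put(term.text, new WeightedTerm(term.text, boost));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, WeightedTerm> terms = new LinkedHashMap<>();
        List<FieldTerm> queryTerms = List.of(
                new FieldTerm("body", "fox"),
                new FieldTerm("title", "dog"));
        // Only terms from the "body" field are accepted here.
        fill(terms, queryTerms, 2.0f, field -> field.equals("body"));
        System.out.println(terms.keySet());
    }
}
```

The caller owns the map, which mirrors the signatures above: the method returns nothing and only adds entries to the map it is given.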
      • fieldNameComparator

        protected boolean fieldNameComparator(String fieldNameToCheck)
        Necessary to implement matches for queries against defaultField.
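One plausible reading of this check is sketched below, without any Lucene dependency. The class `FieldNameMatchSketch` and its method `matches` are hypothetical. The rule itself is an assumption, not something this page states: a name is accepted when no restricting field is set, or when it equals either the restricting field or defaultField.

```java
final class FieldNameMatchSketch {

    // Field that extraction is restricted to; null means "no restriction" (assumption).
    private final String fieldName;

    // Default field whose queries should also match; may be null.
    private final String defaultField;

    FieldNameMatchSketch(String fieldName, String defaultField) {
        this.fieldName = fieldName;
        this.defaultField = defaultField;
    }

    /** Returns true when terms from the given field should be kept. */
    boolean matches(String fieldNameToCheck) {
        return fieldName == null
                || fieldName.equals(fieldNameToCheck)
                || (defaultField != null && defaultField.equals(fieldNameToCheck));
    }

    public static void main(String[] args) {
        FieldNameMatchSketch matcher = new FieldNameMatchSketch("body", "all");
        System.out.println(matcher.matches("body"));
        System.out.println(matcher.matches("all"));
        System.out.println(matcher.matches("title"));
    }
}
```

Under this reading, a query parsed against defaultField still contributes terms when a specific field is being highlighted, which is the case the description singles out.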
      • getWeightedSpanTerms

        public Map<String, WeightedSpanTerm> getWeightedSpanTerms(Query query,
                                                                       float boost,
                                                                       TokenStream tokenStream)
                                                                throws IOException
        Creates a Map of WeightedSpanTerms from the given Query and TokenStream.

        Parameters:
        query - Query that caused the hit
        tokenStream - TokenStream of the text to be highlighted
        Returns:
        Map containing WeightedSpanTerms
        Throws:
        IOException - If there is a low-level I/O error
      • getWeightedSpanTerms

        public Map<String, WeightedSpanTerm> getWeightedSpanTerms(Query query,
                                                                       float boost,
                                                                       TokenStream tokenStream,
                                                                       String fieldName)
                                                                throws IOException
        Creates a Map of WeightedSpanTerms from the given Query and TokenStream.

        Parameters:
        query - Query that caused the hit
        tokenStream - TokenStream of the text to be highlighted
        fieldName - restricts the Terms used, based on field name
        Returns:
        Map containing WeightedSpanTerms
        Throws:
        IOException - If there is a low-level I/O error
      • getWeightedSpanTermsWithScores

        public Map<String, WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query,
                                                                                 float boost,
                                                                                 TokenStream tokenStream,
                                                                                 String fieldName,
                                                                                 IndexReader reader)
                                                                          throws IOException
        Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).

        Parameters:
        query - Query that caused the hit
        tokenStream - TokenStream of the text to be highlighted
        fieldName - restricts the Terms used, based on field name
        reader - IndexReader to use for scoring
        Returns:
        Map of WeightedSpanTerms with quasi tf/idf scores
        Throws:
        IOException - If there is a low-level I/O error
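To give a feel for what "quasi tf/idf scores" could mean, the stand-alone sketch below scales a term's weight by a smoothed inverse document frequency. The exact formula the library uses is not stated on this page, so the form `log(totalNumDocs / (docFreq + 1)) + 1` is an assumption. `QuasiIdfSketch` and its methods are hypothetical names, and `docFreq` and `totalNumDocs` stand for values that would be read from the supplied IndexReader.

```java
final class QuasiIdfSketch {

    /**
     * Smoothed inverse document frequency (assumed form).
     * The +1 in the denominator avoids division by zero for unseen terms.
     */
    static float idf(int docFreq, int totalNumDocs) {
        return (float) (Math.log(totalNumDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Scales an existing term weight (for example the boost) by the idf factor. */
    static float weight(float baseWeight, int docFreq, int totalNumDocs) {
        return baseWeight * idf(docFreq, totalNumDocs);
    }

    public static void main(String[] args) {
        int totalNumDocs = 1000;
        // A rare term receives a larger weight than a common one.
        System.out.println("rare:   " + weight(1.0f, 3, totalNumDocs));
        System.out.println("common: " + weight(1.0f, 800, totalNumDocs));
    }
}
```

Weights of this kind let a gradient highlighter render rare query terms more strongly than common ones.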
      • collectSpanQueryFields

        protected void collectSpanQueryFields(SpanQuery spanQuery,
                                              Set<String> fieldNames)
      • mustRewriteQuery

        protected boolean mustRewriteQuery(SpanQuery spanQuery)
      • getExpandMultiTermQuery

        public boolean getExpandMultiTermQuery()
      • setExpandMultiTermQuery

        public void setExpandMultiTermQuery(boolean expandMultiTermQuery)
      • isUsePayloads

        public boolean isUsePayloads()
      • setUsePayloads

        public void setUsePayloads(boolean usePayloads)
      • isCachedTokenStream

        public boolean isCachedTokenStream()
      • getTokenStream

        public TokenStream getTokenStream()
        Returns the tokenStream, which may have been wrapped in a CachingTokenFilter. The getWeightedSpanTerms* methods set the tokenStream, so do not call this method before calling one of them.
      • setWrapIfNotCachingTokenFilter

        public void setWrapIfNotCachingTokenFilter(boolean wrap)
        By default, a TokenStream that is not a CachingTokenFilter is wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and do not want it to be wrapped, set this to false. This setting is ignored when a term-vector-based TokenStream is supplied, since it can be reset efficiently.
      • setMaxDocCharsToAnalyze

        protected final void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
        A threshold on the number of characters to analyze. When a TokenStream based on term vectors with offsets and positions is supplied, this setting does not apply.