Class ShingleAnalyzerWrapper

    • Constructor Detail

      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
                                      int maxShingleSize)
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
                                      int minShingleSize,
                                      int maxShingleSize)
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer delegate,
                                      int minShingleSize,
                                      int maxShingleSize,
                                      String tokenSeparator,
                                      boolean outputUnigrams,
                                      boolean outputUnigramsIfNoShingles,
                                      String fillerToken)
        Creates a new ShingleAnalyzerWrapper.
        Parameters:
        delegate - Analyzer whose TokenStream is to be filtered
        minShingleSize - Min shingle (token ngram) size
        maxShingleSize - Max shingle size
        tokenSeparator - Used to separate input stream tokens in output shingles
        outputUnigrams - Whether the filter passes the original tokens (unigrams) to the output stream
        outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). If outputUnigrams==true, unigrams are always output, regardless of whether any shingles are available.
        fillerToken - Filler token to use when positionIncrement is more than 1
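        The interaction between minShingleSize, maxShingleSize, tokenSeparator, outputUnigrams and outputUnigramsIfNoShingles can be easier to follow with a small model. The sketch below is a simplified, self-contained illustration and not Lucene's implementation: the class ShingleSketch and its shingles method are hypothetical names, the input is a plain list of strings rather than a TokenStream, position increments and fillerToken are ignored, and the argument check (sizes of at least 2, min not larger than max) is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of shingle generation; not Lucene code. */
final class ShingleSketch {

    static List<String> shingles(List<String> tokens,
                                 int minShingleSize,
                                 int maxShingleSize,
                                 String tokenSeparator,
                                 boolean outputUnigrams,
                                 boolean outputUnigramsIfNoShingles) {
        // Assumption of this sketch: a shingle joins at least two tokens.
        if (minShingleSize < 2 || maxShingleSize < minShingleSize) {
            throw new IllegalArgumentException("need 2 <= minShingleSize <= maxShingleSize");
        }
        // No shingle can be built from fewer than minShingleSize tokens.
        boolean noShingles = tokens.size() < minShingleSize;
        // outputUnigrams wins when true; otherwise the fallback flag applies
        // only when no shingles are available.
        boolean emitUnigrams = outputUnigrams || (outputUnigramsIfNoShingles && noShingles);

        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (emitUnigrams) {
                out.add(tokens.get(start));
            }
            // Emit every shingle that starts at this token, shortest first.
            for (int size = minShingleSize;
                 size <= maxShingleSize && start + size <= tokens.size();
                 size++) {
                out.add(String.join(tokenSeparator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("please", "divide", "this");

        // Unigrams interleaved with bigrams.
        System.out.println(shingles(tokens, 2, 2, " ", true, false));
        // prints [please, please divide, divide, divide this, this]

        // Bigrams only.
        System.out.println(shingles(tokens, 2, 2, " ", false, false));
        // prints [please divide, divide this]

        // One token, so no bigram exists; the fallback flag lets the unigram through.
        System.out.println(shingles(List.of("please"), 2, 2, " ", false, true));
        // prints [please]
    }
}
```

        With both flags set to false, the single-token input in the last call would produce no output at all, which is the case outputUnigramsIfNoShingles is meant to cover.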
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper()
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(int minShingleSize,
                                      int maxShingleSize)
    • Method Detail

      • getMaxShingleSize

        public int getMaxShingleSize()
        The max shingle (token ngram) size
        Returns:
        The max shingle (token ngram) size
      • getMinShingleSize

        public int getMinShingleSize()
        The min shingle (token ngram) size
        Returns:
        The min shingle (token ngram) size
      • getTokenSeparator

        public String getTokenSeparator()
      • isOutputUnigrams

        public boolean isOutputUnigrams()
      • isOutputUnigramsIfNoShingles

        public boolean isOutputUnigramsIfNoShingles()
      • getFillerToken

        public String getFillerToken()