Class WordBreakSpellChecker

  • public class WordBreakSpellChecker
    extends Object
    A spell checker whose sole function is to offer suggestions by combining multiple terms into one word and/or breaking terms into multiple words.
    • Field Detail


        public static final Term SEPARATOR_TERM
        Term that can be used to prohibit adjacent terms from being combined.
    • Method Detail

      • suggestWordCombinations

        public CombineSuggestion[] suggestWordCombinations(Term[] terms,
                                                           int maxSuggestions,
                                                           IndexReader ir,
                                                           SuggestMode suggestMode)
                                                    throws IOException
        Generate suggestions by combining one or more of the passed-in terms into single words. Each returned CombineSuggestion contains a SuggestWord and an array detailing which passed-in terms were involved in creating the combination. Each score equals the number of word combinations needed, which is one less than the length of the array CombineSuggestion.originalTermIndexes. Generally, a suggestion with a lower score is preferred over one with a higher score.

        To prevent two adjacent terms from being combined (for instance, if one is mandatory and the other is prohibited), separate the two terms with SEPARATOR_TERM.
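        The combining rules described above can be modeled in plain Java. The sketch below is not Lucene's implementation: the class CombineSketch, its SEPARATOR sentinel, and the Map standing in for index frequencies are all hypothetical, chosen only to show how a separator blocks a join and how the score relates to originalTermIndexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the documented combining rules; not Lucene's code. */
class CombineSketch {
    /** Stand-in for SEPARATOR_TERM (assumed sentinel value, illustration only). */
    static final String SEPARATOR = "#SEP#";

    /** A combined word plus the positions of the input terms that formed it. */
    static final class Combination {
        final String word;
        final int[] originalTermIndexes;

        Combination(String word, int[] originalTermIndexes) {
            this.word = word;
            this.originalTermIndexes = originalTermIndexes;
        }

        /** Number of joins performed: one less than the number of terms used. */
        int score() {
            return originalTermIndexes.length - 1;
        }
    }

    /**
     * Joins runs of adjacent terms and keeps the joined words that occur in docFreq
     * (a hypothetical word-to-frequency map standing in for the index).
     */
    static List<Combination> combine(String[] terms, Map<String, Integer> docFreq,
                                     int maxChanges, int maxCombineWordLength) {
        List<Combination> out = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            if (terms[i].equals(SEPARATOR)) {
                continue; // a separator never starts a combination
            }
            StringBuilder joined = new StringBuilder(terms[i]);
            for (int j = i + 1; j < terms.length && j - i <= maxChanges; j++) {
                if (terms[j].equals(SEPARATOR)) {
                    break; // the separator prohibits joining across it
                }
                joined.append(terms[j]);
                if (joined.length() > maxCombineWordLength) {
                    break; // longer joins would only exceed the limit further
                }
                String word = joined.toString();
                if (docFreq.getOrDefault(word, 0) > 0) {
                    int[] indexes = new int[j - i + 1];
                    for (int k = 0; k < indexes.length; k++) {
                        indexes[k] = i + k;
                    }
                    out.add(new Combination(word, indexes));
                }
            }
        }
        return out;
    }
}
```

        With the terms "note", "book", "shelf", a map containing "notebook" and "bookshelf", and maxChanges set to 1, this sketch yields two combinations, each with a score of 1. Placing SEPARATOR between "note" and "book" leaves only "bookshelf".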

        When suggestMode equals SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, each suggestion will include at least one term not in the index.

        When suggestMode equals SuggestMode.SUGGEST_MORE_POPULAR, each suggestion will have the same or a better frequency than the most popular included term.
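        The two suggest-mode rules above can be read as a simple acceptance test on term frequencies. The following is a minimal sketch under that reading, not Lucene's code: the class SuggestModeSketch, its local Mode enum, and the accept method are hypothetical names introduced for illustration.

```java
/** Hypothetical model of the two documented suggest-mode rules; not Lucene's code. */
class SuggestModeSketch {
    /** Local stand-in for the two SuggestMode values described above. */
    enum Mode { SUGGEST_WHEN_NOT_IN_INDEX, SUGGEST_MORE_POPULAR }

    /**
     * @param termFreqs    index frequency of each original term in the combination
     * @param combinedFreq index frequency of the combined word
     * @return whether the combination would be offered under the given mode
     */
    static boolean accept(Mode mode, int[] termFreqs, int combinedFreq) {
        boolean anyTermMissing = false;
        int mostPopular = 0;
        for (int freq : termFreqs) {
            anyTermMissing |= (freq == 0);          // a frequency of 0 means "not in the index"
            mostPopular = Math.max(mostPopular, freq);
        }
        if (mode == Mode.SUGGEST_WHEN_NOT_IN_INDEX) {
            return anyTermMissing;                  // at least one original term is absent
        }
        return combinedFreq >= mostPopular;         // same or better than the most popular term
    }
}
```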

        Returns:
        an array of words generated by combining original terms
        Throws:
        IOException - If there is a low-level I/O error.
      • getMinSuggestionFrequency

        public int getMinSuggestionFrequency()
        Returns the minimum frequency a term must have to be part of a suggestion.
      • getMaxCombineWordLength

        public int getMaxCombineWordLength()
        Returns the maximum length of a combined suggestion.
      • getMinBreakWordLength

        public int getMinBreakWordLength()
        Returns the minimum size of a broken word.
      • getMaxChanges

        public int getMaxChanges()
        Returns the maximum number of changes to perform on the input.
      • getMaxEvaluations

        public int getMaxEvaluations()
        Returns the maximum number of word combinations to evaluate.
      • setMaxCombineWordLength

        public void setMaxCombineWordLength(int maxCombineWordLength)
        The maximum length of a suggestion made by combining one or more original terms. Default=20.
      • setMinBreakWordLength

        public void setMinBreakWordLength(int minBreakWordLength)
        The minimum length of each word produced by breaking a term. Default=1.
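        As an illustration of what a minimum break length rules out, the sketch below enumerates the two-way splits of a word whose parts both meet the minimum. It is a hypothetical model, not Lucene's code: BreakSketch and twoWaySplits are invented names, and lengths are counted in UTF-16 chars for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a minimum break length; not Lucene's code. */
class BreakSketch {
    /**
     * Lists every split of word into two parts where each part has at least
     * minBreakWordLength chars (UTF-16 units, a simplification).
     */
    static List<String[]> twoWaySplits(String word, int minBreakWordLength) {
        List<String[]> out = new ArrayList<>();
        int min = Math.max(1, minBreakWordLength); // an empty part is never a word
        for (int cut = min; cut <= word.length() - min; cut++) {
            out.add(new String[] {word.substring(0, cut), word.substring(cut)});
        }
        return out;
    }
}
```

        With a minimum of 1, "notebook" has seven candidate splits. With a minimum of 4, only "note" + "book" remains, so a larger value sharply reduces the candidates that need to be checked.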
      • setMaxChanges

        public void setMaxChanges(int maxChanges)
        The maximum number of changes (word breaks or combinations) to make on the original term(s). Default=1.
      • setMaxEvaluations

        public void setMaxEvaluations(int maxEvaluations)
        The maximum number of word combinations to evaluate. Default=1000. A higher value might improve result quality. A lower value might improve performance.
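        The trade-off between quality and performance can be pictured as a budget on how many candidates get examined. The sketch below is a hypothetical model of such a budget, not Lucene's code: EvaluationBudgetSketch, evaluate, and the knownWords set standing in for the index are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical model of an evaluation budget; not Lucene's code. */
class EvaluationBudgetSketch {
    /** Checks candidates in order against a known-word set, stopping once the budget is spent. */
    static List<String> evaluate(List<String> candidates, Set<String> knownWords,
                                 int maxEvaluations) {
        List<String> hits = new ArrayList<>();
        int evaluated = 0;
        for (String candidate : candidates) {
            if (evaluated >= maxEvaluations) {
                break; // budget exhausted: remaining candidates are never examined
            }
            evaluated++;
            if (knownWords.contains(candidate)) {
                hits.add(candidate);
            }
        }
        return hits;
    }
}
```

        In this model a small budget returns sooner but can miss a valid candidate that appears late in the list, while a large budget examines everything.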