public class WordBreakSpellChecker extends Object
A spell checker whose sole function is to offer suggestions by combining multiple terms into one word and/or breaking terms into multiple words.
Modifier and Type | Class and Description |
---|---|
static class | WordBreakSpellChecker.BreakSuggestionSortMethod - Determines the order to list word break suggestions |
Modifier and Type | Field and Description |
---|---|
static Term | SEPARATOR_TERM - Term that can be used to prohibit adjacent terms from being combined |
Constructor and Description |
---|
WordBreakSpellChecker() - Creates a new spellchecker with default configuration values |
Modifier and Type | Method and Description |
---|---|
int | getMaxChanges() - Returns the maximum number of changes to perform on the input |
int | getMaxCombineWordLength() - Returns the maximum length of a combined suggestion |
int | getMaxEvaluations() - Returns the maximum number of word combinations to evaluate. |
int | getMinBreakWordLength() - Returns the minimum size of a broken word |
int | getMinSuggestionFrequency() - Returns the minimum frequency a term must have to be part of a suggestion. |
void | setMaxChanges(int maxChanges) - The maximum number of changes (word breaks or combinations) to make on the original term(s). |
void | setMaxCombineWordLength(int maxCombineWordLength) - The maximum length of a suggestion made by combining 1 or more original terms. |
void | setMaxEvaluations(int maxEvaluations) - The maximum number of word combinations to evaluate. |
void | setMinBreakWordLength(int minBreakWordLength) - The minimum length to break words down to. |
void | setMinSuggestionFrequency(int minSuggestionFrequency) - The minimum frequency a term must have to be included as part of a suggestion. |
SuggestWord[][] | suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod) - Generate suggestions by breaking the passed-in term into multiple words. |
CombineSuggestion[] | suggestWordCombinations(Term[] terms, int maxSuggestions, IndexReader ir, SuggestMode suggestMode) - Generate suggestions by combining one or more of the passed-in terms into single words. |
public static final Term SEPARATOR_TERM
public WordBreakSpellChecker()
public SuggestWord[][] suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod) throws IOException
Generate suggestions by breaking the passed-in term into multiple words. Each returned score equals the number of word breaks needed, so a lower score is generally preferred over a higher one.
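The scoring rule above can be illustrated with a small, self-contained sketch. This is not Lucene's implementation: the class `WordBreakSketch`, its `Candidate` type, and the in-memory frequency map standing in for the index are all hypothetical names introduced for illustration. It only shows how a term might be split into pieces that each meet a minimum length and frequency, with the score equal to the number of breaks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain-Java model of word breaking, not Lucene's code.
public class WordBreakSketch {

    /** One candidate split; score is the number of breaks (pieces minus one). */
    public static final class Candidate {
        public final List<String> pieces;
        public final int score;

        Candidate(List<String> pieces) {
            this.pieces = Collections.unmodifiableList(new ArrayList<>(pieces));
            this.score = pieces.size() - 1;
        }
    }

    /**
     * Enumerates splits of term whose pieces all appear in freqs (a stand-in
     * for index term frequencies) with at least minSuggestionFrequency, using
     * at most maxChanges breaks. Results are sorted by ascending score.
     */
    public static List<Candidate> suggestBreaks(String term, Map<String, Integer> freqs,
            int maxChanges, int minBreakWordLength, int minSuggestionFrequency) {
        List<Candidate> out = new ArrayList<>();
        int minLen = Math.max(1, minBreakWordLength); // never emit empty pieces
        collect(term, 0, new ArrayList<>(), freqs, maxChanges, minLen,
                minSuggestionFrequency, out);
        out.sort((a, b) -> Integer.compare(a.score, b.score));
        return out;
    }

    private static void collect(String term, int start, List<String> pieces,
            Map<String, Integer> freqs, int maxChanges, int minLen, int minFreq,
            List<Candidate> out) {
        if (start == term.length()) {
            if (pieces.size() > 1) {
                out.add(new Candidate(pieces)); // at least one break was made
            }
            return;
        }
        if (pieces.size() > maxChanges) {
            return; // maxChanges breaks allow at most maxChanges + 1 pieces
        }
        for (int end = start + minLen; end <= term.length(); end++) {
            String piece = term.substring(start, end);
            if (freqs.getOrDefault(piece, 0) >= minFreq) {
                pieces.add(piece);
                collect(term, end, pieces, freqs, maxChanges, minLen, minFreq, out);
                pieces.remove(pieces.size() - 1);
            }
        }
    }
}
```

With a frequency map containing `new`, `york`, `city`, and `newyork`, and up to two changes allowed, the term `newyorkcity` yields the one-break split `newyork city` ahead of the two-break split `new york city`.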
suggestMode - default = SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX
sortMethod - default = WordBreakSpellChecker.BreakSuggestionSortMethod.NUM_CHANGES_THEN_MAX_FREQUENCY
IOException - If there is a low-level I/O error.
public CombineSuggestion[] suggestWordCombinations(Term[] terms, int maxSuggestions, IndexReader ir, SuggestMode suggestMode) throws IOException
Generate suggestions by combining one or more of the passed-in terms into single words. The returned CombineSuggestion contains both a SuggestWord and an array detailing which passed-in terms were involved in creating this combination. Each returned score equals the number of word combinations needed, which is one less than the length of the array CombineSuggestion.originalTermIndexes. Generally, a suggestion with a lower score is preferred over one with a higher score.
To prevent two adjacent terms from being combined (for instance, if one is mandatory and the other is prohibited), separate the two terms with SEPARATOR_TERM.
When suggestMode equals SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, each suggestion will include at least one term not in the index.
When suggestMode equals SuggestMode.SUGGEST_MORE_POPULAR, each suggestion will have the same or better frequency than the most-popular included term.
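The combination rules described above can be modeled in plain Java as follows. This is a rough sketch under stated assumptions, not Lucene's implementation: `WordCombineSketch`, `Combination`, the `SEPARATOR` sentinel string (a stand-in for SEPARATOR_TERM), the `morePopular` flag (a stand-in for the two suggest modes), and the frequency map replacing the index are all hypothetical. It shows how adjacent terms might be joined, how a separator blocks a join, and how the score relates to the length of the index array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain-Java model of word combining, not Lucene's code.
public class WordCombineSketch {

    /** Hypothetical stand-in for SEPARATOR_TERM. */
    public static final String SEPARATOR = "\u0000";

    /** A combined word plus the positions of the input terms that formed it. */
    public static final class Combination {
        public final String word;
        public final int[] originalTermIndexes;
        public final int score; // combinations needed = indexes.length - 1

        Combination(String word, int[] indexes) {
            this.word = word;
            this.originalTermIndexes = indexes;
            this.score = indexes.length - 1;
        }
    }

    public static List<Combination> suggestCombinations(String[] terms,
            Map<String, Integer> freqs, int maxChanges, int maxCombineWordLength,
            int minSuggestionFrequency, boolean morePopular) {
        List<Combination> out = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            if (terms[i].equals(SEPARATOR)) {
                continue;
            }
            StringBuilder word = new StringBuilder(terms[i]);
            int maxFreq = freqs.getOrDefault(terms[i], 0);
            boolean anyMissing = maxFreq == 0; // is some included term absent?
            // Joining terms i..j takes (j - i) combinations.
            for (int j = i + 1; j < terms.length && j - i <= maxChanges; j++) {
                if (terms[j].equals(SEPARATOR)) {
                    break; // a separator forbids combining across it
                }
                word.append(terms[j]);
                if (word.length() > maxCombineWordLength) {
                    break;
                }
                int f = freqs.getOrDefault(terms[j], 0);
                maxFreq = Math.max(maxFreq, f);
                anyMissing |= f == 0;
                int combinedFreq = freqs.getOrDefault(word.toString(), 0);
                boolean keep = morePopular
                        // at least as frequent as the most popular included term
                        ? combinedFreq > 0 && combinedFreq >= maxFreq
                        // at least one included term is not in the "index"
                        : combinedFreq >= Math.max(1, minSuggestionFrequency) && anyMissing;
                if (keep) {
                    int[] idx = new int[j - i + 1];
                    for (int k = 0; k < idx.length; k++) {
                        idx[k] = i + k;
                    }
                    out.add(new Combination(word.toString(), idx));
                }
            }
        }
        return out;
    }
}
```

For the input `hot`, `dog`, separator, `house`, with `hotdog`, `dog`, and `doghouse` present in the frequency map, the not-in-index mode yields only `hotdog` with indexes 0 and 1 and a score of 1; the separator keeps `dog` and `house` apart.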
IOException - If there is a low-level I/O error.
public int getMinSuggestionFrequency()
setMinSuggestionFrequency(int)
public int getMaxCombineWordLength()
setMaxCombineWordLength(int)
public int getMinBreakWordLength()
setMinBreakWordLength(int)
public int getMaxChanges()
setMaxChanges(int)
public int getMaxEvaluations()
setMaxEvaluations(int)
public void setMinSuggestionFrequency(int minSuggestionFrequency)
The minimum frequency a term must have to be included as part of a suggestion. Default=1. Not applicable when used with SuggestMode.SUGGEST_MORE_POPULAR.
getMinSuggestionFrequency()
public void setMaxCombineWordLength(int maxCombineWordLength)
The maximum length of a suggestion made by combining 1 or more original terms. Default=20
getMaxCombineWordLength()
public void setMinBreakWordLength(int minBreakWordLength)
The minimum length to break words down to. Default=1
getMinBreakWordLength()
public void setMaxChanges(int maxChanges)
The maximum number of changes (word breaks or combinations) to make on the original term(s). Default=1
getMaxChanges()
public void setMaxEvaluations(int maxEvaluations)
The maximum number of word combinations to evaluate. Default=1000. A higher value might improve result quality. A lower value might improve performance.
getMaxEvaluations()
Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.