java.lang.Object
  org.apache.lucene.search.spell.WordBreakSpellChecker
public class WordBreakSpellChecker
A spell checker whose sole function is to offer suggestions by combining multiple terms into one word and/or breaking terms into multiple words.
Nested Class Summary | |
---|---|
static class | WordBreakSpellChecker.BreakSuggestionSortMethod: Determines the order to list word break suggestions. |
Field Summary | |
---|---|
static Term | SEPARATOR_TERM: Term that can be used to prohibit adjacent terms from being combined. |
Constructor Summary | |
---|---|
WordBreakSpellChecker() | Creates a new spellchecker with default configuration values. |
Method Summary | |
---|---|
int | getMaxChanges(): Returns the maximum number of changes to perform on the input. |
int | getMaxCombineWordLength(): Returns the maximum length of a combined suggestion. |
int | getMaxEvaluations(): Returns the maximum number of word combinations to evaluate. |
int | getMinBreakWordLength(): Returns the minimum size of a broken word. |
int | getMinSuggestionFrequency(): Returns the minimum frequency a term must have to be part of a suggestion. |
void | setMaxChanges(int maxChanges): The maximum number of changes (word breaks or combinations) to make on the original term(s). |
void | setMaxCombineWordLength(int maxCombineWordLength): The maximum length of a suggestion made by combining one or more original terms. |
void | setMaxEvaluations(int maxEvaluations): The maximum number of word combinations to evaluate. |
void | setMinBreakWordLength(int minBreakWordLength): The minimum length to break words down to. |
void | setMinSuggestionFrequency(int minSuggestionFrequency): The minimum frequency a term must have to be included as part of a suggestion. |
SuggestWord[][] | suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod): Generate suggestions by breaking the passed-in term into multiple words. |
CombineSuggestion[] | suggestWordCombinations(Term[] terms, int maxSuggestions, IndexReader ir, SuggestMode suggestMode): Generate suggestions by combining one or more of the passed-in terms into single words. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final Term SEPARATOR_TERM
Term that can be used to prohibit adjacent terms from being combined.
Constructor Detail |
---|
public WordBreakSpellChecker()
Creates a new spellchecker with default configuration values.
See Also: setMaxChanges(int), setMaxCombineWordLength(int), setMaxEvaluations(int), setMinBreakWordLength(int), setMinSuggestionFrequency(int)
Method Detail |
---|
public SuggestWord[][] suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod) throws IOException
Generate suggestions by breaking the passed-in term into multiple words. The scores returned are equal to the number of word breaks needed, so a suggestion with a lower score is generally preferred over one with a higher score.
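To make the idea of a word break concrete, here is a minimal, library-free sketch of a single-break suggester. It is not Lucene's implementation: the class `WordBreakSketch`, the method `singleBreaks`, and the in-memory `Map` of term frequencies (standing in for the `IndexReader`) are assumptions made purely for illustration. Every suggestion it produces uses exactly one break, so in the scoring described above each would have a score of 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: a Map of term -> frequency stands in for the index.
final class WordBreakSketch {

    // Returns every way to split `word` into two parts such that both parts
    // are at least minBreakWordLength characters long and each occurs at
    // least minSuggestionFrequency times in the (hypothetical) frequency map.
    static List<String[]> singleBreaks(String word,
                                       Map<String, Integer> termFreq,
                                       int minBreakWordLength,
                                       int minSuggestionFrequency) {
        List<String[]> suggestions = new ArrayList<>();
        for (int i = minBreakWordLength; i <= word.length() - minBreakWordLength; i++) {
            String left = word.substring(0, i);
            String right = word.substring(i);
            int leftFreq = termFreq.getOrDefault(left, 0);
            int rightFreq = termFreq.getOrDefault(right, 0);
            if (leftFreq >= minSuggestionFrequency && rightFreq >= minSuggestionFrequency) {
                suggestions.add(new String[] {left, right});
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        Map<String, Integer> termFreq = Map.of("spell", 12, "checker", 7, "spe", 1);
        for (String[] parts : singleBreaks("spellchecker", termFreq, 1, 1)) {
            System.out.println(String.join(" ", parts));
        }
    }
}
```

With the sample data in `main`, only the split `spell checker` survives, because `spe` is in the map but its counterpart `llchecker` is not.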
Parameters:
suggestMode - default = SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX
sortMethod - default = WordBreakSpellChecker.BreakSuggestionSortMethod.NUM_CHANGES_THEN_MAX_FREQUENCY
Throws:
IOException - If there is a low-level I/O error.
public CombineSuggestion[] suggestWordCombinations(Term[] terms, int maxSuggestions, IndexReader ir, SuggestMode suggestMode) throws IOException
Generate suggestions by combining one or more of the passed-in terms into single words. The returned CombineSuggestion contains both a SuggestWord and an array detailing which passed-in terms were involved in creating this combination. The scores returned are equal to the number of word combinations needed, which is also one less than the length of the array CombineSuggestion.originalTermIndexes. Generally, a suggestion with a lower score is preferred over one with a higher score.
To prevent two adjacent terms from being combined (for instance, if one is mandatory and the other is prohibited), separate the two terms with SEPARATOR_TERM.
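The relationship between the combined word, the indexes of the original terms, and the score can be sketched without Lucene as follows. This is an illustrative toy, not the library's algorithm: `CombineSketch`, `Combo`, `combine`, and the `SEPARATOR` constant are hypothetical names, plain strings stand in for `Term` objects, and a `Map` stands in for the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: strings stand in for Terms and a Map stands in for the index.
final class CombineSketch {

    // Hypothetical stand-in for SEPARATOR_TERM; never joined with a neighbour.
    static final String SEPARATOR = "";

    // Hypothetical result holder: the combined word, the positions of the
    // input terms that were joined, and score = originalTermIndexes.length - 1.
    static final class Combo {
        final String word;
        final int[] originalTermIndexes;
        final int score;

        Combo(String word, int[] originalTermIndexes) {
            this.word = word;
            this.originalTermIndexes = originalTermIndexes;
            this.score = originalTermIndexes.length - 1;
        }
    }

    // Joins runs of adjacent terms (at most maxChanges joins per run) and keeps
    // joined words that appear in the frequency map and are not too long.
    static List<Combo> combine(String[] terms, Map<String, Integer> termFreq,
                               int maxChanges, int maxCombineWordLength) {
        List<Combo> result = new ArrayList<>();
        for (int start = 0; start < terms.length; start++) {
            if (terms[start].equals(SEPARATOR)) {
                continue;
            }
            StringBuilder joined = new StringBuilder(terms[start]);
            for (int end = start + 1; end < terms.length && end - start <= maxChanges; end++) {
                if (terms[end].equals(SEPARATOR)) {
                    break; // a separator blocks any combination across it
                }
                joined.append(terms[end]);
                if (joined.length() > maxCombineWordLength) {
                    break;
                }
                if (termFreq.getOrDefault(joined.toString(), 0) > 0) {
                    int[] indexes = new int[end - start + 1];
                    for (int k = 0; k < indexes.length; k++) {
                        indexes[k] = start + k;
                    }
                    result.add(new Combo(joined.toString(), indexes));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> termFreq = Map.of("firefly", 3);
        for (Combo c : combine(new String[] {"fire", "fly", "light"}, termFreq, 1, 20)) {
            System.out.println(c.word + " score=" + c.score);
        }
    }
}
```

In this sketch, `{"fire", "fly", "light"}` yields one combination, `firefly`, built from indexes 0 and 1 with a score of 1. Passing `{"fire", SEPARATOR, "fly"}` instead yields nothing, because the separator sits between the two terms.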
When suggestMode equals SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, each suggestion will include at least one term not in the index.
When suggestMode equals SuggestMode.SUGGEST_MORE_POPULAR, each suggestion will have the same or better frequency than the most popular included term.
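One way to read the two suggestMode conditions is as an acceptance test applied to each candidate combination. The sketch below encodes that reading under stated assumptions: `SuggestModeSketch`, its `Mode` enum, and the `accept` method are hypothetical, a `Map` stands in for index frequencies, and "same or better" is interpreted as greater than or equal. Lucene's actual checks may differ in detail.

```java
import java.util.Map;

// Toy model only: illustrates the two documented conditions, not Lucene's code.
final class SuggestModeSketch {

    // Hypothetical stand-in for the two SuggestMode values discussed above.
    enum Mode { WHEN_NOT_IN_INDEX, MORE_POPULAR }

    // Decides whether `combined` is an acceptable suggestion for `originals`.
    static boolean accept(Mode mode, String[] originals, String combined,
                          Map<String, Integer> termFreq) {
        int maxOriginalFreq = 0;
        boolean anyOriginalMissing = false;
        for (String term : originals) {
            int freq = termFreq.getOrDefault(term, 0);
            maxOriginalFreq = Math.max(maxOriginalFreq, freq);
            if (freq == 0) {
                anyOriginalMissing = true;
            }
        }
        int combinedFreq = termFreq.getOrDefault(combined, 0);
        if (combinedFreq == 0) {
            return false; // the combined word must itself be known
        }
        switch (mode) {
            case WHEN_NOT_IN_INDEX:
                // at least one of the original terms is absent from the index
                return anyOriginalMissing;
            case MORE_POPULAR:
                // combined word is at least as frequent as the most popular original
                return combinedFreq >= maxOriginalFreq;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> termFreq = Map.of("fire", 5, "fly", 2, "firefly", 5);
        String[] originals = {"fire", "fly"};
        System.out.println(accept(Mode.MORE_POPULAR, originals, "firefly", termFreq));
        System.out.println(accept(Mode.WHEN_NOT_IN_INDEX, originals, "firefly", termFreq));
    }
}
```

With the sample frequencies, the first call accepts `firefly` (5 is not less than 5) and the second rejects it, since both original terms are present in the map.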
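Throws:
IOException - If there is a low-level I/O error.
public int getMinSuggestionFrequency()
Returns the minimum frequency a term must have to be part of a suggestion.
See Also: setMinSuggestionFrequency(int)
public int getMaxCombineWordLength()
Returns the maximum length of a combined suggestion.
See Also: setMaxCombineWordLength(int)
public int getMinBreakWordLength()
Returns the minimum size of a broken word.
See Also: setMinBreakWordLength(int)
public int getMaxChanges()
Returns the maximum number of changes to perform on the input.
See Also: setMaxChanges(int)
public int getMaxEvaluations()
Returns the maximum number of word combinations to evaluate.
See Also: setMaxEvaluations(int)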
public void setMinSuggestionFrequency(int minSuggestionFrequency)
The minimum frequency a term must have to be included as part of a suggestion. Default=1. Not applicable when used with SuggestMode.SUGGEST_MORE_POPULAR.
See Also: getMinSuggestionFrequency()
public void setMaxCombineWordLength(int maxCombineWordLength)
The maximum length of a suggestion made by combining one or more original terms. Default=20.
See Also: getMaxCombineWordLength()
public void setMinBreakWordLength(int minBreakWordLength)
The minimum length to break words down to. Default=1.
See Also: getMinBreakWordLength()
public void setMaxChanges(int maxChanges)
The maximum number of changes (word breaks or combinations) to make on the original term(s). Default=1.
See Also: getMaxChanges()
public void setMaxEvaluations(int maxEvaluations)
The maximum number of word combinations to evaluate. Default=1000. A higher value might improve result quality. A lower value might improve performance.
See Also: getMaxEvaluations()