Class UserDictionary
java.lang.Object
org.apache.lucene.analysis.ja.dict.UserDictionary
All Implemented Interfaces:
Dictionary
Class for building a User Dictionary. This class allows for custom segmentation of phrases.
-
Field Summary
Modifier and Type | Field | Description
static final int | WORD_COST |
static final int | LEFT_ID |
static final int | RIGHT_ID |
Fields inherited from interface org.apache.lucene.analysis.ja.dict.Dictionary
INTERNAL_SEPARATOR
-
Method Summary
Modifier and Type | Method | Description
String | getBaseForm(int wordId, char[] surface, int off, int len) | Get base form of word
TokenInfoFST | getFST() |
String | getInflectionForm(int wordId) | Get inflection form of tokens
String | getInflectionType(int wordId) | Get inflection type of tokens
int | getLeftId(int wordId) | Get left id of specified word
String | getPartOfSpeech(int wordId) | Get Part-Of-Speech of tokens
String | getPronunciation(int wordId, char[] surface, int off, int len) | Get pronunciation of tokens
String | getReading(int wordId, char[] surface, int off, int len) | Get reading of tokens
int | getRightId(int wordId) | Get right id of specified word
int | getWordCost(int wordId) | Get word cost of specified word
int[][] | lookup(char[] chars, int off, int len) | Lookup words in text
int[] | lookupSegmentation(int phraseID) |
static UserDictionary | open(Reader reader) |
-
Field Details
-
WORD_COST
public static final int WORD_COST
-
LEFT_ID
public static final int LEFT_ID
-
RIGHT_ID
public static final int RIGHT_ID
-
-
Method Details
-
open
public static UserDictionary open(Reader reader) throws IOException
Throws:
IOException
-
lookup
public int[][] lookup(char[] chars, int off, int len) throws IOException
Lookup words in text
Parameters:
chars - text
off - offset into text
len - length of text
Returns:
array of {wordId, position, length}
Throws:
IOException
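The {wordId, position, length} triples returned by lookup can be turned back into the matched substrings. The following is a minimal sketch, not part of the Lucene API: the class LookupResultSketch, the helper surfaces, the sample text, and the word IDs are all hypothetical, and the triples are written by hand rather than produced by a real dictionary. It assumes position indexes into the char array passed to lookup, and uses off = 0 so that the offset does not matter.

```java
import java.util.ArrayList;
import java.util.List;

class LookupResultSketch {
    /**
     * Converts {wordId, position, length} triples into the substrings they cover.
     * Assumption: position is an index into chars (the sample below uses off = 0).
     */
    static List<String> surfaces(char[] chars, int[][] matches) {
        List<String> result = new ArrayList<>();
        for (int[] match : matches) {
            int position = match[1]; // start of the matched word
            int length = match[2];   // number of chars in the matched word
            result.add(new String(chars, position, length));
        }
        return result;
    }

    public static void main(String[] args) {
        char[] text = "kansaikokusaikuukou".toCharArray();
        // Hypothetical triples; a real call to lookup(text, 0, text.length) would supply them.
        int[][] matches = {
            {100, 0, 6},
            {101, 6, 7},
            {102, 13, 6},
        };
        System.out.println(surfaces(text, matches)); // prints [kansai, kokusai, kuukou]
    }
}
```

In real use the first element of each triple would be passed to the wordId-based accessors on this class, such as getReading or getPartOfSpeech.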
-
getFST
public TokenInfoFST getFST()
-
lookupSegmentation
public int[] lookupSegmentation(int phraseID)
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word
Specified by:
getLeftId in interface Dictionary
Returns:
left id
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word
Specified by:
getRightId in interface Dictionary
Returns:
right id
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word
Specified by:
getWordCost in interface Dictionary
Returns:
word's cost
-
getReading
public String getReading(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get reading of tokens
Specified by:
getReading in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
Reading of the token
-
getPartOfSpeech
public String getPartOfSpeech(int wordId)
Description copied from interface: Dictionary
Get Part-Of-Speech of tokens
Specified by:
getPartOfSpeech in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
Part-Of-Speech of the token
-
getBaseForm
public String getBaseForm(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get base form of word
Specified by:
getBaseForm in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
Base form (only different for inflected words, otherwise null)
-
getPronunciation
public String getPronunciation(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get pronunciation of tokens
Specified by:
getPronunciation in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
Pronunciation of the token
-
getInflectionType
public String getInflectionType(int wordId)
Description copied from interface: Dictionary
Get inflection type of tokens
Specified by:
getInflectionType in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: Dictionary
Get inflection form of tokens
Specified by:
getInflectionForm in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
inflection form, or null
-