Class BinaryDictionary
java.lang.Object
org.apache.lucene.analysis.ja.dict.BinaryDictionary
- All Implemented Interfaces:
Dictionary
- Direct Known Subclasses:
TokenInfoDictionary, UnknownDictionary
Base class for a binary-encoded in-memory dictionary.
-
Nested Class Summary
Modifier and Type | Class | Description
static enum | BinaryDictionary.ResourceScheme | Deprecated, for removal: This API element is subject to removal in a future version.
-
Field Summary
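Modifier and Type | Field | Description
static final String | DICT_FILENAME_SUFFIX |
static final String | DICT_HEADER |
static final int | HAS_BASEFORM | Flag that the entry has base form data.
static final int | HAS_PRONUNCIATION | Flag that the entry has pronunciation data.
static final int | HAS_READING | Flag that the entry has reading data.
static final String | POSDICT_FILENAME_SUFFIX |
static final String | POSDICT_HEADER |
static final String | TARGETMAP_FILENAME_SUFFIX |
static final String | TARGETMAP_HEADER |
static final int | VERSION |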
Fields inherited from interface org.apache.lucene.analysis.ja.dict.Dictionary
INTERNAL_SEPARATOR
-
Constructor Summary
Modifier | Constructor | Description
protected | BinaryDictionary(IOSupplier<InputStream> targetMapResource, IOSupplier<InputStream> posResource, IOSupplier<InputStream> dictResource) |
-
Method Summary
Modifier and Type | Method | Description
String | getBaseForm(int wordId, char[] surfaceForm, int off, int len) | Get base form of word
String | getInflectionForm(int wordId) | Get inflection form of tokens
String | getInflectionType(int wordId) | Get inflection type of tokens
int | getLeftId(int wordId) | Get left id of specified word
String | getPartOfSpeech(int wordId) | Get Part-Of-Speech of tokens
String | getPronunciation(int wordId, char[] surface, int off, int len) | Get pronunciation of tokens
String | getReading(int wordId, char[] surface, int off, int len) | Get reading of tokens
static final InputStream | getResource(BinaryDictionary.ResourceScheme scheme, String path) | Deprecated, for removal: This API element is subject to removal in a future version.
int | getRightId(int wordId) | Get right id of specified word
int | getWordCost(int wordId) | Get word cost of specified word
void | lookupWordIds(int sourceId, IntsRef ref) |
-
Field Details
-
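DICT_FILENAME_SUFFIX
public static final String DICT_FILENAME_SUFFIX
-
TARGETMAP_FILENAME_SUFFIX
public static final String TARGETMAP_FILENAME_SUFFIX
-
POSDICT_FILENAME_SUFFIX
public static final String POSDICT_FILENAME_SUFFIX
-
DICT_HEADER
public static final String DICT_HEADER
-
TARGETMAP_HEADER
public static final String TARGETMAP_HEADER
-
POSDICT_HEADER
public static final String POSDICT_HEADER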
-
VERSION
public static final int VERSION
-
HAS_BASEFORM
public static final int HAS_BASEFORM
Flag indicating that the entry has base form data; otherwise the word is not inflected (its base form is the same as the surface form).
-
HAS_READING
public static final int HAS_READING
Flag indicating that the entry has reading data; otherwise the reading is the surface form converted to katakana.
-
HAS_PRONUNCIATION
public static final int HAS_PRONUNCIATION
Flag indicating that the entry has pronunciation data; otherwise the pronunciation is the same as the reading.
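The fallback rules described by HAS_READING and HAS_PRONUNCIATION can be illustrated with a small standalone sketch. It is not Lucene's implementation: the bit values, the class name FeatureFlagSketch, and the idea of passing the stored strings as arguments are all assumptions made for illustration, and the katakana conversion only handles plain hiragana.

```java
/** Illustrative only: flag values and method shapes are assumptions, not Lucene's encoding. */
final class FeatureFlagSketch {
  // Hypothetical bit values; the real constants live on BinaryDictionary.
  static final int HAS_READING = 2;
  static final int HAS_PRONUNCIATION = 4;

  /** Converts hiragana (U+3041..U+3096) to katakana by shifting each char by 0x60. */
  static String toKatakana(String surface) {
    StringBuilder sb = new StringBuilder(surface.length());
    for (int i = 0; i < surface.length(); i++) {
      char c = surface.charAt(i);
      sb.append(c >= 0x3041 && c <= 0x3096 ? (char) (c + 0x60) : c);
    }
    return sb.toString();
  }

  /** Stored reading if the flag is set, otherwise the surface form converted to katakana. */
  static String reading(int flags, String storedReading, String surface) {
    return (flags & HAS_READING) != 0 ? storedReading : toKatakana(surface);
  }

  /** Stored pronunciation if the flag is set, otherwise whatever the reading resolves to. */
  static String pronunciation(
      int flags, String storedPronunciation, String storedReading, String surface) {
    return (flags & HAS_PRONUNCIATION) != 0
        ? storedPronunciation
        : reading(flags, storedReading, surface);
  }
}
```

With no flags set, both accessors in this sketch fall through to the katakana form of the surface text, which mirrors the chain the three flag descriptions imply.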
-
-
Constructor Details
-
BinaryDictionary
protected BinaryDictionary(IOSupplier<InputStream> targetMapResource, IOSupplier<InputStream> posResource, IOSupplier<InputStream> dictResource) throws IOException
- Throws: IOException
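The constructor takes each resource as an IOSupplier<InputStream>, so a stream is only opened when the supplier is invoked. The sketch below shows one way such suppliers might be built. To stay self-contained it declares a local IOSupplier stand-in that mirrors the shape of org.apache.lucene.util.IOSupplier; the class name ResourceSupplierSketch and its helper methods are hypothetical and not part of Lucene.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class ResourceSupplierSketch {
  /** Local stand-in for Lucene's IOSupplier: a supplier whose get() may throw IOException. */
  @FunctionalInterface
  interface IOSupplier<T> {
    T get() throws IOException;
  }

  /** Supplier that opens the file lazily, only when get() is called. */
  static IOSupplier<InputStream> fromFile(Path path) {
    return () -> Files.newInputStream(path);
  }

  /** Supplier that looks up a classpath resource and fails with IOException if it is absent. */
  static IOSupplier<InputStream> fromClasspath(Class<?> owner, String name) {
    return () -> {
      InputStream in = owner.getResourceAsStream(name);
      if (in == null) {
        throw new IOException("resource not found: " + name);
      }
      return in;
    };
  }

  /** Opens the supplied stream, reads it fully, and closes it. */
  static byte[] readAll(IOSupplier<InputStream> supplier) throws IOException {
    try (InputStream in = supplier.get()) {
      return in.readAllBytes();
    }
  }
}
```

A subclass would pass three such suppliers (target map, part-of-speech data, dictionary data) to the protected constructor. Any failure to open or read them would plausibly surface as the IOException the constructor declares.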
-
-
Method Details
-
getResource
@Deprecated(forRemoval=true, since="9.1")
public static final InputStream getResource(BinaryDictionary.ResourceScheme scheme, String path) throws IOException
Deprecated, for removal: This API element is subject to removal in a future version.
- Throws: IOException
-
lookupWordIds
public void lookupWordIds(int sourceId, IntsRef ref)
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word
- Specified by: getLeftId in interface Dictionary
- Returns: left id
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word
- Specified by: getRightId in interface Dictionary
- Returns: right id
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word
- Specified by: getWordCost in interface Dictionary
- Returns: word's cost
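As a rough illustration of how getLeftId, getRightId and getWordCost are typically combined in lattice-based tokenization, the sketch below scores one step of a path. It assumes that a connection cost is looked up from the previous word's right id and the next word's left id, then added to the next word's own cost. The CostSource interface, the connection matrix, and the class name PathCostSketch are hypothetical stand-ins, not Lucene types.

```java
final class PathCostSketch {
  /** Hypothetical view of the three numeric accessors documented above. */
  interface CostSource {
    int getLeftId(int wordId);

    int getRightId(int wordId);

    int getWordCost(int wordId);
  }

  /**
   * Cost of appending word {@code next} after word {@code prev}: the connection cost between
   * prev's right id and next's left id, plus next's own word cost. The matrix is made up for
   * this sketch and indexed as connection[rightId][leftId].
   */
  static int stepCost(CostSource dict, int[][] connection, int prev, int next) {
    int connectionCost = connection[dict.getRightId(prev)][dict.getLeftId(next)];
    return connectionCost + dict.getWordCost(next);
  }
}
```

A tokenizer would usually sum such step costs along each candidate path and keep the cheapest one.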
-
getBaseForm
public String getBaseForm(int wordId, char[] surfaceForm, int off, int len)
Description copied from interface: Dictionary
Get base form of word
- Specified by: getBaseForm in interface Dictionary
- Parameters: wordId - word ID of token
- Returns: Base form (only different for inflected words, otherwise null)
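Because getBaseForm returns null for words that are not inflected, callers that always want a string need a fallback. The following is a minimal sketch of that pattern; BaseFormSource is a hypothetical single-method interface that copies the documented signature so the example compiles without Lucene, and the class and helper names are likewise illustrative.

```java
final class BaseFormFallbackSketch {
  /** Hypothetical single-method view of the accessor documented above. */
  @FunctionalInterface
  interface BaseFormSource {
    String getBaseForm(int wordId, char[] surfaceForm, int off, int len);
  }

  /** Returns the dictionary's base form, or the surface slice when the dictionary returns null. */
  static String baseFormOrSurface(
      BaseFormSource dict, int wordId, char[] surfaceForm, int off, int len) {
    String base = dict.getBaseForm(wordId, surfaceForm, off, len);
    return base != null ? base : new String(surfaceForm, off, len);
  }
}
```

Here off and len select the token's slice of the shared character buffer, so the fallback string is built from exactly the characters the token covers.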
-
getReading
public String getReading(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get reading of tokens
- Specified by: getReading in interface Dictionary
- Parameters: wordId - word ID of token
- Returns: Reading of the token
-
getPartOfSpeech
public String getPartOfSpeech(int wordId)
Description copied from interface: Dictionary
Get Part-Of-Speech of tokens
- Specified by: getPartOfSpeech in interface Dictionary
- Parameters: wordId - word ID of token
- Returns: Part-Of-Speech of the token
-
getPronunciation
public String getPronunciation(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get pronunciation of tokens
- Specified by: getPronunciation in interface Dictionary
- Parameters: wordId - word ID of token
- Returns: Pronunciation of the token
-
getInflectionType
public String getInflectionType(int wordId)
Description copied from interface: Dictionary
Get inflection type of tokens
- Specified by: getInflectionType in interface Dictionary
- Parameters: wordId - word ID of token
- Returns: inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: Dictionary
Get inflection form of tokens
- Specified by: getInflectionForm in interface Dictionary
- Parameters: wordId - word ID of token
- Returns: inflection form, or null
-