Class BinaryDictionary
java.lang.Object
org.apache.lucene.analysis.ja.dict.BinaryDictionary
- All Implemented Interfaces:
Dictionary
- Direct Known Subclasses:
TokenInfoDictionary, UnknownDictionary
Base class for a binary-encoded in-memory dictionary.
-
Nested Class Summary
Nested Classes (Modifier and Type, Class: Description)
static enum BinaryDictionary.ResourceScheme: Deprecated, for removal: This API element is subject to removal in a future version.
-
Field Summary
Fields (Modifier and Type, Field: Description)
static final String DICT_FILENAME_SUFFIX
static final String DICT_HEADER
static final int HAS_BASEFORM: flag that the entry has baseform data.
static final int HAS_PRONUNCIATION: flag that the entry has pronunciation data.
static final int HAS_READING: flag that the entry has reading data.
static final String POSDICT_FILENAME_SUFFIX
static final String POSDICT_HEADER
static final String TARGETMAP_FILENAME_SUFFIX
static final String TARGETMAP_HEADER
static final int VERSION
Fields inherited from interface org.apache.lucene.analysis.ja.dict.Dictionary
INTERNAL_SEPARATOR
-
Constructor Summary
Constructors (Modifier, Constructor)
protected BinaryDictionary(IOSupplier<InputStream> targetMapResource, IOSupplier<InputStream> posResource, IOSupplier<InputStream> dictResource)
-
Method Summary
Modifier and Type, Method: Description
String getBaseForm(int wordId, char[] surfaceForm, int off, int len): Get base form of word
String getInflectionForm(int wordId): Get inflection form of tokens
String getInflectionType(int wordId): Get inflection type of tokens
int getLeftId(int wordId): Get left id of specified word
String getPartOfSpeech(int wordId): Get Part-Of-Speech of tokens
String getPronunciation(int wordId, char[] surface, int off, int len): Get pronunciation of tokens
String getReading(int wordId, char[] surface, int off, int len): Get reading of tokens
static final InputStream getResource(BinaryDictionary.ResourceScheme scheme, String path): Deprecated, for removal: This API element is subject to removal in a future version.
int getRightId(int wordId): Get right id of specified word
int getWordCost(int wordId): Get word cost of specified word
void lookupWordIds(int sourceId, IntsRef ref)
-
Field Details
-
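DICT_FILENAME_SUFFIX
public static final String DICT_FILENAME_SUFFIX
-
TARGETMAP_FILENAME_SUFFIX
public static final String TARGETMAP_FILENAME_SUFFIX
-
POSDICT_FILENAME_SUFFIX
public static final String POSDICT_FILENAME_SUFFIX
-
DICT_HEADER
public static final String DICT_HEADER
-
TARGETMAP_HEADER
public static final String TARGETMAP_HEADER
-
POSDICT_HEADER
public static final String POSDICT_HEADER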
VERSION
public static final int VERSION
-
HAS_BASEFORM
public static final int HAS_BASEFORM
Flag that the entry has baseform data. Otherwise the entry is not inflected (its base form is the same as the surface form).
-
HAS_READING
public static final int HAS_READING
Flag that the entry has reading data. Otherwise the reading is the surface form converted to katakana.
-
HAS_PRONUNCIATION
public static final int HAS_PRONUNCIATION
Flag that the entry has pronunciation data. Otherwise the pronunciation is the reading.
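The fallback rules attached to the three flags can be illustrated with a small standalone sketch. It is not Lucene's implementation: the class name FlagFallbackSketch, the helper methods, and the bit values 1, 2 and 4 are assumptions made for illustration, since this page does not list the actual constant values. The katakana conversion relies on the Unicode layout, where hiragana U+3041..U+3096 sits 0x60 below the corresponding katakana.

```java
/** Sketch of the flag fallbacks described above; not Lucene's implementation. */
final class FlagFallbackSketch {
  // Hypothetical bit values for this sketch; the page does not list the real ones.
  static final int HAS_BASEFORM = 1;
  static final int HAS_READING = 2;
  static final int HAS_PRONUNCIATION = 4;

  /** Without HAS_BASEFORM the entry is not inflected, so no separate base form exists. */
  static String baseForm(int flags, String stored) {
    return (flags & HAS_BASEFORM) != 0 ? stored : null;
  }

  /** Without HAS_READING, fall back to the surface form converted to katakana. */
  static String reading(int flags, String stored, char[] surface, int off, int len) {
    if ((flags & HAS_READING) != 0) {
      return stored;
    }
    char[] out = new char[len];
    for (int i = 0; i < len; i++) {
      char ch = surface[off + i];
      // Shift hiragana (U+3041..U+3096) up by 0x60 to reach katakana; copy anything else.
      out[i] = (ch >= 0x3041 && ch <= 0x3096) ? (char) (ch + 0x60) : ch;
    }
    return new String(out);
  }

  /** Without HAS_PRONUNCIATION, the pronunciation is the reading. */
  static String pronunciation(int flags, String stored, String reading) {
    return (flags & HAS_PRONUNCIATION) != 0 ? stored : reading;
  }

  public static void main(String[] args) {
    // Hiragana "sushi", written with Unicode escapes to keep the source ASCII-only.
    char[] surface = "\u3059\u3057".toCharArray();
    String reading = reading(0, null, surface, 0, surface.length);
    System.out.println(reading);                          // katakana form of the surface
    System.out.println(pronunciation(0, null, reading));  // same as the reading
    System.out.println(baseForm(0, null));                // null: entry is not inflected
  }
}
```

The (surface, off, len) parameters mirror the shape of getReading and getPronunciation, which receive the token's surface characters so that these fallbacks can be computed without stored data.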
-
-
Constructor Details
-
BinaryDictionary
protected BinaryDictionary(IOSupplier<InputStream> targetMapResource, IOSupplier<InputStream> posResource, IOSupplier<InputStream> dictResource) throws IOException
- Throws:
IOException
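The constructor takes three stream suppliers rather than open streams, so each resource can be opened only when it is needed. The sketch below shows that pattern in isolation. To stay self-contained it declares a local stand-in interface named IOSupplier; the class SupplierSketch and the helpers sizeOf and loadSizes are hypothetical, and they only count bytes instead of parsing dictionary data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class SupplierSketch {
  /** Local stand-in for a supplier whose get() may throw IOException. */
  @FunctionalInterface
  interface IOSupplier<T> {
    T get() throws IOException;
  }

  /** Hypothetical helper: opens one resource, counts its bytes, and closes it. */
  static int sizeOf(IOSupplier<InputStream> resource) throws IOException {
    try (InputStream in = resource.get()) {
      return in.readAllBytes().length;
    }
  }

  /** Mirrors the constructor's parameter order: target map, POS, dictionary. */
  static int[] loadSizes(
      IOSupplier<InputStream> targetMapResource,
      IOSupplier<InputStream> posResource,
      IOSupplier<InputStream> dictResource)
      throws IOException {
    return new int[] {
      sizeOf(targetMapResource), sizeOf(posResource), sizeOf(dictResource)
    };
  }

  public static void main(String[] args) throws IOException {
    // Placeholder bytes held in memory; real data would come from files or the classpath.
    int[] sizes = loadSizes(
        () -> new ByteArrayInputStream(new byte[3]),
        () -> new ByteArrayInputStream(new byte[5]),
        () -> new ByteArrayInputStream(new byte[7]));
    System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
  }
}
```

A file-backed supplier could be written the same way, for example as a lambda that calls java.nio.file.Files.newInputStream on a path the caller chooses.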
-
-
Method Details
-
getResource
@Deprecated(forRemoval=true, since="9.1") public static final InputStream getResource(BinaryDictionary.ResourceScheme scheme, String path) throws IOException
Deprecated, for removal: This API element is subject to removal in a future version.
- Throws:
IOException
-
lookupWordIds
public void lookupWordIds(int sourceId, IntsRef ref)
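This page gives no description for lookupWordIds. One plausible reading, assumed here and not confirmed by this page, is that the method fills ref with the slice of word IDs that belong to sourceId. The sketch below shows such an offset-table lookup with a local stand-in for IntsRef; the class LookupSketch and the arrays targetMap and targetMapOffsets are hypothetical names chosen for illustration.

```java
/** Sketch of an offset-table lookup; an assumption about behavior, not Lucene's code. */
final class LookupSketch {
  /** Minimal stand-in for a slice of an int array: backing array, offset, length. */
  static final class IntsRef {
    int[] ints;
    int offset;
    int length;
  }

  private final int[] targetMap;        // word IDs, grouped by source ID
  private final int[] targetMapOffsets; // start of each group, plus one trailing end marker

  LookupSketch(int[] targetMap, int[] targetMapOffsets) {
    this.targetMap = targetMap;
    this.targetMapOffsets = targetMapOffsets;
  }

  /** Points ref at the word IDs for sourceId without copying the array. */
  void lookupWordIds(int sourceId, IntsRef ref) {
    ref.ints = targetMap;
    ref.offset = targetMapOffsets[sourceId];
    // The trailing end marker makes sourceId + 1 a valid index for the last group.
    ref.length = targetMapOffsets[sourceId + 1] - ref.offset;
  }

  public static void main(String[] args) {
    // Source 0 maps to word IDs 10, 11, 12; source 1 maps to word ID 20.
    LookupSketch dict = new LookupSketch(new int[] {10, 11, 12, 20}, new int[] {0, 3, 4});
    IntsRef ref = new IntsRef();
    dict.lookupWordIds(0, ref);
    for (int i = 0; i < ref.length; i++) {
      System.out.println(ref.ints[ref.offset + i]);
    }
  }
}
```

Passing a caller-owned ref lets the same object be reused across lookups, which avoids allocating a new array per call.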
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word
- Specified by:
getLeftId in interface Dictionary
- Returns:
left id
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word
- Specified by:
getRightId in interface Dictionary
- Returns:
right id
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word
- Specified by:
getWordCost in interface Dictionary
- Returns:
word's cost
-
getBaseForm
public String getBaseForm(int wordId, char[] surfaceForm, int off, int len)
Description copied from interface: Dictionary
Get base form of word
- Specified by:
getBaseForm in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Base form (only different for inflected words, otherwise null)
-
getReading
public String getReading(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get reading of tokens
- Specified by:
getReading in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Reading of the token
-
getPartOfSpeech
public String getPartOfSpeech(int wordId)
Description copied from interface: Dictionary
Get Part-Of-Speech of tokens
- Specified by:
getPartOfSpeech in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Part-Of-Speech of the token
-
getPronunciation
public String getPronunciation(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get pronunciation of tokens
- Specified by:
getPronunciation in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Pronunciation of the token
-
getInflectionType
public String getInflectionType(int wordId)
Description copied from interface: Dictionary
Get inflection type of tokens
- Specified by:
getInflectionType in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: Dictionary
Get inflection form of tokens
- Specified by:
getInflectionForm in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
inflection form, or null
-