public class DictionaryBuilder extends Object
java -cp [lucene classpath] org.apache.lucene.analysis.ja.util.DictionaryBuilder ${inputDir} ${outputDir} ${encoding}
The input directory is expected to include unk.def, matrix.def, plus any number of .csv files, roughly following the conventions of IPADIC. JapaneseTokenizer uses dictionaries built with this tool. Note that the input files required by this build generally must be generated from a corpus of real text using tools that are not part of Lucene.
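Because the tool expects specific files in the input directory, it can help to check the directory before starting a build. The following is a minimal sketch using only the Java standard library. The class `DictionaryInputCheck` and its method `missingInputs` are hypothetical names introduced for illustration and are not part of Lucene. The description above allows "any number" of .csv files, so requiring at least one is an assumption of this sketch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check for a dictionary input directory; not part of Lucene. */
class DictionaryInputCheck {

  /** Returns the name (or pattern) of each expected input that is absent from inputDir. */
  static List<String> missingInputs(Path inputDir) throws IOException {
    List<String> missing = new ArrayList<>();

    // The two definition files named in the class description.
    for (String name : new String[] {"unk.def", "matrix.def"}) {
      if (!Files.isRegularFile(inputDir.resolve(name))) {
        missing.add(name);
      }
    }

    // Assumption: a usable input directory has at least one .csv file.
    try (DirectoryStream<Path> csvFiles = Files.newDirectoryStream(inputDir, "*.csv")) {
      if (!csvFiles.iterator().hasNext()) {
        missing.add("*.csv");
      }
    }
    return missing;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: DictionaryInputCheck <inputDir>");
      return;
    }
    List<String> missing = missingInputs(Paths.get(args[0]));
    if (missing.isEmpty()) {
      System.out.println("Input directory looks complete.");
    } else {
      System.out.println("Missing inputs: " + missing);
    }
  }
}
```

A check like this only confirms that the files are present. It does not validate their contents or encoding, which the build itself would have to do.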
Modifier and Type | Class and Description |
---|---|
static class | DictionaryBuilder.DictionaryFormat: Format of the dictionary. |
Modifier and Type | Method and Description |
---|---|
static void | build(DictionaryBuilder.DictionaryFormat format, Path inputDir, Path outputDir, String encoding, boolean normalizeEntry) |
static void | main(String[] args) |
public static void build(DictionaryBuilder.DictionaryFormat format, Path inputDir, Path outputDir, String encoding, boolean normalizeEntry) throws IOException
Throws: IOException
public static void main(String[] args) throws IOException
Throws: IOException
Copyright © 2000-2020 Apache Software Foundation. All Rights Reserved.