Class JapaneseCompletionFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public final class JapaneseCompletionFilter
    extends TokenFilter
    A TokenFilter that adds romanized forms of Japanese tokens to the term attribute while keeping the original tokens (surface forms). It is mainly intended for query auto-completion.

    Supported romanization systems: (modified) Hepburn-shiki, Kunrei-shiki (Nihon-shiki) and Wāpuro-shiki.

    This filter does not strictly comply with the romanization systems listed above; instead, it tries to cover the keystrokes accepted by various input methods. For example, a circumflex or macron representing the chōonpu (長音符, the long vowel mark) is not supported.
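
    To illustrate why one surface form can yield several romanized tokens, here is a minimal, hypothetical sketch. It is not the filter's implementation: the class name RomanizationSketch, the method expand, and the six-entry table are assumptions made for this example, and a real table would cover the whole syllabary. Each katakana is mapped to every spelling a user might type (Hepburn "shi" next to Kunrei "si"), and the surface form is kept alongside all combinations.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RomanizationSketch {
    // Hypothetical, deliberately tiny table: katakana -> accepted keystrokes.
    private static final Map<Character, List<String>> TABLE = Map.of(
        '\u30B9', List.of("su"),          // ス
        '\u30B7', List.of("shi", "si"),   // シ: Hepburn "shi", Kunrei "si"
        '\u30C4', List.of("tsu", "tu"),   // ツ: Hepburn "tsu", Kunrei "tu"
        '\u30D5', List.of("fu", "hu"),    // フ: Hepburn "fu", Kunrei "hu"
        '\u30CA', List.of("na"),          // ナ
        '\u30DF', List.of("mi"));         // ミ

    /** Returns the surface form followed by every romanization the table allows. */
    static Set<String> expand(String katakana) {
        Set<String> out = new LinkedHashSet<>();
        out.add(katakana); // always keep the original token
        List<String> partials = new ArrayList<>(List.of(""));
        for (char c : katakana.toCharArray()) {
            List<String> options = TABLE.get(c);
            if (options == null) {
                return out; // character not in the toy table: surface form only
            }
            List<String> next = new ArrayList<>();
            for (String prefix : partials) {
                for (String option : options) {
                    next.add(prefix + option);
                }
            }
            partials = next;
        }
        out.addAll(partials);
        return out;
    }
}
```

    With this toy table, expand applied to スシ yields the surface form plus "sushi" and "susi", so a completion index built from these tokens can match either spelling.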

    The romanization behaviour depends on the filter's JapaneseCompletionFilter.Mode. The default mode is JapaneseCompletionFilter.Mode.INDEX.

    Note: This filter must be applied AFTER half-width and full-width normalization. Ensure that a width normalizer such as CJKWidthCharFilter or CJKWidthFilter is included in your analysis chain ahead of this filter. IF WIDTH NORMALIZATION IS NOT PERFORMED, THIS FILTER DOES NOT WORK AS EXPECTED. See also: JapaneseCompletionAnalyzer.
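
    The effect of width normalization can be sketched with the JDK alone. The example below uses java.text.Normalizer with NFKC as a stand-in; this is an assumption made for illustration, not what CJKWidthCharFilter or CJKWidthFilter do internally, and NFKC folds more than character width. The class name WidthNormalizationSketch and both helper methods are hypothetical. The point is that half-width katakana are different code points from their full-width counterparts, so a lookup keyed on full-width katakana misses them unless the input is normalized first.

```java
import java.text.Normalizer;

public class WidthNormalizationSketch {
    /** Folds half-width katakana and full-width Latin to their standard forms (NFKC stand-in). */
    static String normalizeWidth(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    /** True if every character lies in the full-width Katakana block (U+30A0..U+30FF). */
    static boolean isFullWidthKatakana(String text) {
        for (char c : text.toCharArray()) {
            if (c < '\u30A0' || c > '\u30FF') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String halfWidth = "\uFF7D\uFF7C"; // half-width ｽｼ
        // Before normalization the characters are outside the Katakana block.
        System.out.println(isFullWidthKatakana(halfWidth));
        // After normalization they are the full-width スシ (U+30B9 U+30B7).
        System.out.println(isFullWidthKatakana(normalizeWidth(halfWidth)));
    }
}
```

    In an actual analysis chain the Lucene width normalizers named above take this role, and JapaneseCompletionAnalyzer provides a ready-made chain.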