Class IndicNormalizer

java.lang.Object
org.apache.lucene.analysis.in.IndicNormalizer

public class IndicNormalizer extends Object
Normalizes the Unicode representation of text in Indian languages.

Follows the guidelines from Unicode 5.2, chapter 6, South Asian Scripts I, and the graphical decompositions from http://ldc.upenn.edu/myl/IndianScriptsUnicode.html

  • Constructor Details

    • IndicNormalizer

      public IndicNormalizer()
  • Method Details

    • normalize

      public int normalize(char[] text, int len)
      Normalizes the input text in place and returns the new length. The new length is always less than or equal to the existing length.
      Parameters:
      text - buffer holding the input text; its contents are overwritten with the normalized text
      len - number of valid characters in text
      Returns:
      length of the normalized text
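
The contract described above (the buffer is rewritten and only the returned length stays valid) can be sketched with a small stand-in. This is not the IndicNormalizer implementation: the class NormalizeContractSketch, its single composition rule, and the helper normalizeToString are hypothetical names chosen for illustration. Only the method signature mirrors the one documented here. The sketch runs without Lucene on the classpath.

```java
public class NormalizeContractSketch {

    // Hypothetical stand-in that mirrors the documented signature.
    // It applies one illustrative rule only: DEVANAGARI LETTER A (U+0905)
    // followed by DEVANAGARI VOWEL SIGN AA (U+093E) becomes
    // DEVANAGARI LETTER AA (U+0906). The real class's rules are not reproduced.
    public static int normalize(char[] text, int len) {
        int out = 0; // write position; it never overtakes the read position
        for (int in = 0; in < len; in++) {
            boolean pair = text[in] == '\u0905'
                    && in + 1 < len
                    && text[in + 1] == '\u093E';
            if (pair) {
                text[out++] = '\u0906';
                in++; // skip the vowel sign that was just consumed
            } else {
                text[out++] = text[in];
            }
        }
        return out; // always <= len, because the rule only shrinks the text
    }

    // Caller-side pattern: after the call, only the first newLen chars are valid.
    public static String normalizeToString(String s) {
        char[] buf = s.toCharArray();
        int newLen = normalize(buf, buf.length);
        return new String(buf, 0, newLen);
    }

    public static void main(String[] args) {
        String input = "\u0905\u093E\u092E";
        String output = normalizeToString(input);
        System.out.println(input.length() + " -> " + output.length()); // prints "3 -> 2"
    }
}
```

With the real class, the same caller-side pattern would presumably apply: pass the buffer and its valid length, then ignore any characters at or beyond the returned length.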