Package org.apache.lucene.analysis.in
Class IndicNormalizer
java.lang.Object
org.apache.lucene.analysis.in.IndicNormalizer
Normalizes the Unicode representation of text in Indian languages.
Follows guidelines from Unicode 5.2, chapter 6, South Asian Scripts I, and graphical decompositions from http://ldc.upenn.edu/myl/IndianScriptsUnicode.html
-
Constructor Summary
-
Method Summary
Modifier and Type    Method                             Description
int                  normalize(char[] text, int len)    Normalizes input text, and returns the new length.
-
Constructor Details
-
IndicNormalizer
public IndicNormalizer()
-
-
Method Details
-
normalize
public int normalize(char[] text, int len)
Normalizes input text, and returns the new length. The length will always be less than or equal to the existing length.
Parameters:
text - input text
len - valid length
Returns:
normalized length
-
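The in-place contract of normalize (the buffer is rewritten, and the returned length is never greater than the input length) can be illustrated with a minimal, self-contained sketch. The class InPlaceNormalizeSketch and its single composition rule (U+0905 followed by U+093E rewritten as U+0906) are hypothetical stand-ins chosen for illustration. They are not Lucene's implementation and do not reflect its actual rule set.

```java
public class InPlaceNormalizeSketch {

    /**
     * Hypothetical stand-in for a normalizer with the same shape as
     * normalize(char[] text, int len). It applies one illustrative rule:
     * DEVANAGARI LETTER A (U+0905) followed by VOWEL SIGN AA (U+093E)
     * is rewritten as DEVANAGARI LETTER AA (U+0906).
     * The buffer is modified in place and the new valid length is returned.
     */
    public static int normalize(char[] text, int len) {
        int out = 0; // write position; never runs ahead of the read position
        for (int i = 0; i < len; i++) {
            if (text[i] == '\u0905' && i + 1 < len && text[i + 1] == '\u093E') {
                text[out++] = '\u0906';
                i++; // skip the vowel sign that was just consumed
            } else {
                text[out++] = text[i];
            }
        }
        return out; // always <= len, because each step writes at most one char
    }

    public static void main(String[] args) {
        // U+0905 U+093E U+092E: the decomposed spelling used as sample input
        char[] buf = "\u0905\u093E\u092E".toCharArray();
        int newLen = normalize(buf, buf.length);

        // Only the first newLen chars of the buffer are valid after the call.
        String result = new String(buf, 0, newLen);
        System.out.println("old length = " + buf.length + ", new length = " + newLen);
        System.out.println(result.equals("\u0906\u092E"));
    }
}
```

With Lucene on the classpath, the same calling pattern should apply to the real class: construct it with new IndicNormalizer(), call normalize(buf, len), and read only the first returned-length characters of the buffer.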