Class ArabicNormalizer

java.lang.Object
org.apache.lucene.analysis.ar.ArabicNormalizer

public class ArabicNormalizer extends Object
Normalizer for Arabic.

Normalization is done in place for efficiency, operating directly on a term buffer.

Normalization is defined as:

  • Normalization of hamza with alef seat to a bare alef.
  • Normalization of teh marbuta to heh.
  • Normalization of dotless yeh (alef maksura) to yeh.
  • Removal of Arabic diacritics (the harakat).
  • Removal of tatweel (stretching character).
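
The rules above can be illustrated with a minimal standalone sketch. This is not the library's source: the class name ArabicNormalizationSketch is hypothetical, and the choice of code points is an assumption (alef with madda, hamza above, and hamza below are treated as "hamza with alef seat", and the harakat are taken to be the range U+064B through U+0652). It rewrites the buffer in place and returns the new logical length, so the caller must ignore any characters past that length.

```java
final class ArabicNormalizationSketch {
    // Assumed code points for the characters named in the rules above.
    private static final char ALEF = '\u0627';
    private static final char ALEF_MADDA = '\u0622';
    private static final char ALEF_HAMZA_ABOVE = '\u0623';
    private static final char ALEF_HAMZA_BELOW = '\u0625';
    private static final char YEH = '\u064A';
    private static final char DOTLESS_YEH = '\u0649';
    private static final char TEH_MARBUTA = '\u0629';
    private static final char HEH = '\u0647';
    private static final char TATWEEL = '\u0640';
    private static final char FATHATAN = '\u064B'; // first harakat (assumed range start)
    private static final char SUKUN = '\u0652';    // last harakat (assumed range end)

    /**
     * Normalizes the first len chars of s in place.
     * Returns the length of the normalized text; chars beyond it are stale.
     */
    static int normalize(char[] s, int len) {
        int out = 0; // write position; never runs ahead of the read position i
        for (int i = 0; i < len; i++) {
            char c = s[i];
            // Removal rules: drop tatweel and diacritics by not copying them.
            if (c == TATWEEL || (c >= FATHATAN && c <= SUKUN)) {
                continue;
            }
            // Substitution rules: map variant letters to their base form.
            switch (c) {
                case ALEF_MADDA:
                case ALEF_HAMZA_ABOVE:
                case ALEF_HAMZA_BELOW:
                    c = ALEF;
                    break;
                case DOTLESS_YEH:
                    c = YEH;
                    break;
                case TEH_MARBUTA:
                    c = HEH;
                    break;
                default:
                    break;
            }
            s[out++] = c;
        }
        return out;
    }

    public static void main(String[] args) {
        // meem, fatha, dal, sukun, reh, fatha, seen, fatha, teh marbuta
        char[] buf = "\u0645\u064E\u062F\u0652\u0631\u064E\u0633\u064E\u0629".toCharArray();
        int newLen = normalize(buf, buf.length);
        // Four diacritics are removed and teh marbuta becomes heh.
        System.out.println(buf.length + " -> " + newLen); // prints "9 -> 5"
    }
}
```

Because the array is reused rather than reallocated, the returned length is the only reliable indication of where the normalized term ends; a caller would typically build a string with new String(buf, 0, newLen) or pass the length on to the next processing step.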