org.apache.lucene.analysis.ar
Class ArabicNormalizer

java.lang.Object
  extended by org.apache.lucene.analysis.ar.ArabicNormalizer

public class ArabicNormalizer
extends Object

Normalizer for Arabic.

Normalization is done in-place for efficiency, operating on a term buffer.

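Normalization is defined as:

  Normalization of hamza with alef seat to a bare alef.
  Normalization of teh marbuta to heh.
  Normalization of dotless yeh (alef maksura) to yeh.
  Removal of Arabic diacritics (the harakat).
  Removal of tatweel (stretching character).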

Field Summary
static char ALEF
static char ALEF_HAMZA_ABOVE
static char ALEF_HAMZA_BELOW
static char ALEF_MADDA
static char DAMMA
static char DAMMATAN
static char DOTLESS_YEH
static char FATHA
static char FATHATAN
static char HEH
static char KASRA
static char KASRATAN
static char SHADDA
static char SUKUN
static char TATWEEL
static char TEH_MARBUTA
static char YEH
 
Constructor Summary
ArabicNormalizer()
 
Method Summary
protected  int delete(char[] s, int pos, int len)
          Delete a character in-place
 int normalize(char[] s, int len)
          Normalize an input buffer of Arabic text
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ALEF

public static final char ALEF
See Also:
Constant Field Values

ALEF_MADDA

public static final char ALEF_MADDA
See Also:
Constant Field Values

ALEF_HAMZA_ABOVE

public static final char ALEF_HAMZA_ABOVE
See Also:
Constant Field Values

ALEF_HAMZA_BELOW

public static final char ALEF_HAMZA_BELOW
See Also:
Constant Field Values

YEH

public static final char YEH
See Also:
Constant Field Values

DOTLESS_YEH

public static final char DOTLESS_YEH
See Also:
Constant Field Values

TEH_MARBUTA

public static final char TEH_MARBUTA
See Also:
Constant Field Values

HEH

public static final char HEH
See Also:
Constant Field Values

TATWEEL

public static final char TATWEEL
See Also:
Constant Field Values

FATHATAN

public static final char FATHATAN
See Also:
Constant Field Values

DAMMATAN

public static final char DAMMATAN
See Also:
Constant Field Values

KASRATAN

public static final char KASRATAN
See Also:
Constant Field Values

FATHA

public static final char FATHA
See Also:
Constant Field Values

DAMMA

public static final char DAMMA
See Also:
Constant Field Values

KASRA

public static final char KASRA
See Also:
Constant Field Values

SHADDA

public static final char SHADDA
See Also:
Constant Field Values

SUKUN

public static final char SUKUN
See Also:
Constant Field Values

Constructor Detail

ArabicNormalizer

public ArabicNormalizer()

Method Detail

normalize

public int normalize(char[] s,
                     int len)
Normalize an input buffer of Arabic text

Parameters:
s - input buffer; modified in place
len - number of valid characters in the input buffer
Returns:
length of the valid text in the buffer after normalization
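
The buffer-and-length contract above can be illustrated with a small standalone sketch. It does not use or reproduce this class: the class name ArabicNormalizeSketch is hypothetical, the constants are assumed to be the standard Unicode Arabic code points, and the sketch compacts the buffer with a single write index rather than calling delete. It folds the alef variants to bare alef, teh marbuta to heh and dotless yeh to yeh, and drops tatweel and the diacritics from FATHATAN through SUKUN.

```java
final class ArabicNormalizeSketch {
    // Assumed values: standard Unicode code points for the Arabic block.
    static final char ALEF = '\u0627';
    static final char ALEF_MADDA = '\u0622';
    static final char ALEF_HAMZA_ABOVE = '\u0623';
    static final char ALEF_HAMZA_BELOW = '\u0625';
    static final char YEH = '\u064A';
    static final char DOTLESS_YEH = '\u0649';
    static final char TEH_MARBUTA = '\u0629';
    static final char HEH = '\u0647';
    static final char TATWEEL = '\u0640';
    static final char FATHATAN = '\u064B'; // first diacritic in the range
    static final char SUKUN = '\u0652';    // last diacritic in the range

    /** Normalizes s[0..len) in place and returns the new logical length. */
    static int normalize(char[] s, int len) {
        int out = 0; // next write position; never runs ahead of the read position
        for (int in = 0; in < len; in++) {
            char c = s[in];
            if (c == ALEF_MADDA || c == ALEF_HAMZA_ABOVE || c == ALEF_HAMZA_BELOW) {
                s[out++] = ALEF;          // hamza or madda on alef becomes bare alef
            } else if (c == DOTLESS_YEH) {
                s[out++] = YEH;           // alef maksura becomes yeh
            } else if (c == TEH_MARBUTA) {
                s[out++] = HEH;           // teh marbuta becomes heh
            } else if (c == TATWEEL || (c >= FATHATAN && c <= SUKUN)) {
                // tatweel and diacritics are dropped: do not advance out
            } else {
                s[out++] = c;             // everything else is copied unchanged
            }
        }
        return out;
    }
}
```

As with the method documented here, only the first returned-length characters of the array are meaningful afterwards, so a caller would read the result with something like new String(s, 0, newLen) and ignore any stale characters past that point.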

delete

protected int delete(char[] s,
                     int pos,
                     int len)
Delete a character in-place

Parameters:
s - input buffer
pos - position of character to delete
len - length of input buffer
Returns:
length of input buffer after deletion
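
A minimal sketch of what an in-place delete with this signature might look like, assuming it shifts the tail of the buffer left by one slot and reports the shortened length. The class DeleteSketch and the helper removeAll are hypothetical names introduced for illustration; they are not part of this API.

```java
final class DeleteSketch {
    /** Removes s[pos] by shifting s[pos+1..len) left one slot; returns len - 1. */
    static int delete(char[] s, int pos, int len) {
        if (pos < len) {
            // System.arraycopy handles overlapping ranges within the same array.
            System.arraycopy(s, pos + 1, s, pos, len - pos - 1);
        }
        return len - 1;
    }

    /** Hypothetical helper: removes every occurrence of target from s[0..len). */
    static int removeAll(char[] s, int len, char target) {
        for (int i = 0; i < len; i++) {
            if (s[i] == target) {
                len = delete(s, i, len);
                i--; // re-check the character that just shifted into slot i
            }
        }
        return len;
    }
}
```

The array itself is never resized, so the slot just past the new length still holds a stale character. Callers are expected to track the returned length and, when deleting inside a loop, to step the index back so the shifted-in character is not skipped.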


Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.