org.apache.lucene.analysis.ar
Class ArabicStemmer

java.lang.Object
  extended by org.apache.lucene.analysis.ar.ArabicStemmer

public class ArabicStemmer
extends Object

Stemmer for Arabic.

Stemming is done in place for efficiency, operating directly on a term buffer (a char[] together with its valid length).

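Stemming is defined as the removal of attached definite articles, conjunctions, and prepositions (prefixes), and the stemming of common suffixes.
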

Field Summary
static char ALEF
           
static char BEH
           
static char FEH
           
static char HEH
           
static char KAF
           
static char LAM
           
static char NOON
           
static char[][] prefixes
           
static char[][] suffixes
           
static char TEH
           
static char TEH_MARBUTA
           
static char WAW
           
static char YEH
           
 
Constructor Summary
ArabicStemmer()
           
 
Method Summary
protected  int delete(char[] s, int pos, int len)
          Delete a character in place.
protected  int deleteN(char[] s, int pos, int len, int nChars)
          Delete n characters in place.
 int stem(char[] s, int len)
          Stem an input buffer of Arabic text.
 int stemPrefix(char[] s, int len)
          Stem a prefix off an Arabic word.
 int stemSuffix(char[] s, int len)
          Stem suffix(es) off an Arabic word.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ALEF

public static final char ALEF
See Also:
Constant Field Values

BEH

public static final char BEH
See Also:
Constant Field Values

TEH_MARBUTA

public static final char TEH_MARBUTA
See Also:
Constant Field Values

TEH

public static final char TEH
See Also:
Constant Field Values

FEH

public static final char FEH
See Also:
Constant Field Values

KAF

public static final char KAF
See Also:
Constant Field Values

LAM

public static final char LAM
See Also:
Constant Field Values

NOON

public static final char NOON
See Also:
Constant Field Values

HEH

public static final char HEH
See Also:
Constant Field Values

WAW

public static final char WAW
See Also:
Constant Field Values

YEH

public static final char YEH
See Also:
Constant Field Values

prefixes

public static final char[][] prefixes

suffixes

public static final char[][] suffixes

Constructor Detail

ArabicStemmer

public ArabicStemmer()

Method Detail

stem

public int stem(char[] s,
                int len)
Stem an input buffer of Arabic text.

Parameters:
s - input buffer
len - length of input buffer
Returns:
length of input buffer after stemming
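
The calling convention above (pass a buffer and its length, receive the new length) can be sketched as follows. This is a minimal illustration, not Lucene code: BufferStemmer and stemToString are hypothetical names introduced here so the example runs without the Lucene jar, and the toy stemmer in main is a placeholder with no linguistic meaning. With Lucene on the classpath, an ArabicStemmer instance's stem method has the same (char[], int) -> int shape and could presumably be passed in its place.

```java
final class StemCallSketch {
    /** Hypothetical stand-in for the (char[] s, int len) -> int contract described above. */
    interface BufferStemmer {
        int stem(char[] s, int len);
    }

    /** Copies the word into a scratch buffer, stems it in place, and keeps only the valid part. */
    static String stemToString(BufferStemmer stemmer, String word) {
        char[] buf = word.toCharArray();
        int newLen = stemmer.stem(buf, buf.length);
        // Characters at index >= newLen are stale leftovers and must be ignored.
        return new String(buf, 0, newLen);
    }

    public static void main(String[] args) {
        // Toy stemmer for demonstration only: shortens words longer than 3 characters by one.
        BufferStemmer toy = (s, len) -> len > 3 ? len - 1 : len;
        System.out.println(stemToString(toy, "words")); // prints "word"
    }
}
```

The point of the pattern is that the buffer itself is never reallocated: the caller treats only the first newLen characters as the result.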

stemPrefix

public int stemPrefix(char[] s,
                      int len)
Stem a prefix off an Arabic word.

Parameters:
s - input buffer
len - length of input buffer
Returns:
new length of input buffer after stemming
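
As a rough illustration of in-place prefix stemming, the sketch below strips one prefix from the front of a buffer and returns the new length. Everything in it is an assumption made for the example: the class name, the single prefix ALEF + LAM (the definite article), and the MIN_REMAINING guard are not taken from this page, and the actual contents of the prefixes field and the library's matching rules may differ.

```java
final class PrefixStripSketch {
    static final char ALEF = '\u0627'; // ARABIC LETTER ALEF
    static final char LAM = '\u0644';  // ARABIC LETTER LAM

    // Assumed for illustration: the definite article, ALEF followed by LAM.
    static final char[] DEFINITE_ARTICLE = {ALEF, LAM};

    // Assumed guard: never strip if fewer than this many characters would remain.
    static final int MIN_REMAINING = 2;

    /** Removes the prefix from the start of s if it matches; returns the new length. */
    static int stripPrefix(char[] s, int len, char[] prefix) {
        if (len < prefix.length + MIN_REMAINING) {
            return len; // word too short, leave it untouched
        }
        for (int i = 0; i < prefix.length; i++) {
            if (s[i] != prefix[i]) {
                return len; // prefix does not match
            }
        }
        // Shift the remainder of the word to the front of the buffer.
        System.arraycopy(s, prefix.length, s, 0, len - prefix.length);
        return len - prefix.length;
    }

    public static void main(String[] args) {
        // ALEF LAM KAF TEH ALEF BEH ("the book")
        char[] buf = "\u0627\u0644\u0643\u062A\u0627\u0628".toCharArray();
        int newLen = stripPrefix(buf, buf.length, DEFINITE_ARTICLE);
        System.out.println(newLen); // prints 4
    }
}
```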

stemSuffix

public int stemSuffix(char[] s,
                      int len)
Stem suffix(es) off an Arabic word.

Parameters:
s - input buffer
len - length of input buffer
Returns:
new length of input buffer after stemming
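
Suffix stemming on a term buffer needs no copying at all, since dropping trailing characters only shortens the valid length. The sketch below shows that idea and why more than one suffix can come off in a single pass. It is a sketch under assumptions: the class name, the two-entry suffix list, its order, and the MIN_REMAINING guard are chosen for illustration and are not taken from the suffixes field or the library's logic.

```java
final class SuffixStripSketch {
    static final char TEH_MARBUTA = '\u0629'; // ARABIC LETTER TEH MARBUTA
    static final char YEH = '\u064A';         // ARABIC LETTER YEH

    // Assumed for illustration: two single-character suffixes, checked in this order.
    static final char[][] SUFFIXES = {{TEH_MARBUTA}, {YEH}};

    // Assumed guard: never strip if fewer than this many characters would remain.
    static final int MIN_REMAINING = 2;

    /** Returns true if the first len characters of s end with the given suffix. */
    static boolean endsWith(char[] s, int len, char[] suffix) {
        if (len < suffix.length + MIN_REMAINING) {
            return false;
        }
        for (int i = 0; i < suffix.length; i++) {
            if (s[len - suffix.length + i] != suffix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Checks each suffix once, in order; each match shortens the valid length. */
    static int stripSuffixes(char[] s, int len) {
        for (char[] suffix : SUFFIXES) {
            if (endsWith(s, len, suffix)) {
                len -= suffix.length; // trailing characters simply fall out of range
            }
        }
        return len;
    }

    public static void main(String[] args) {
        // AIN REH BEH YEH TEH_MARBUTA: TEH_MARBUTA is dropped first, then YEH.
        char[] buf = "\u0639\u0631\u0628\u064A\u0629".toCharArray();
        System.out.println(stripSuffixes(buf, buf.length)); // prints 3
    }
}
```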

deleteN

protected int deleteN(char[] s,
                      int pos,
                      int len,
                      int nChars)
Delete n characters in place.

Parameters:
s - input buffer
pos - position of the first character to delete
len - length of input buffer
nChars - number of characters to delete
Returns:
length of input buffer after deletion

delete

protected int delete(char[] s,
                     int pos,
                     int len)
Delete a character in place.

Parameters:
s - input buffer
pos - position of character to delete
len - length of input buffer
Returns:
length of input buffer after deletion
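
One way in-place deletion with this signature could work is to shift the tail of the buffer left with System.arraycopy and return the reduced length. The following is a minimal sketch under that assumption, not the library's implementation. The class name and the bounds checks are choices made for this example.

```java
final class InPlaceDeleteSketch {
    /** Deletes the character at pos by shifting the tail left; returns the new length. */
    static int delete(char[] s, int pos, int len) {
        if (pos < 0 || pos >= len) {
            throw new IndexOutOfBoundsException("pos=" + pos + ", len=" + len);
        }
        System.arraycopy(s, pos + 1, s, pos, len - pos - 1);
        return len - 1;
    }

    /** Deletes nChars characters starting at pos with a single shift; returns the new length. */
    static int deleteN(char[] s, int pos, int len, int nChars) {
        if (nChars < 0 || pos < 0 || pos + nChars > len) {
            throw new IndexOutOfBoundsException("pos=" + pos + ", nChars=" + nChars + ", len=" + len);
        }
        System.arraycopy(s, pos + nChars, s, pos, len - pos - nChars);
        return len - nChars;
    }

    public static void main(String[] args) {
        char[] buf = "abcdef".toCharArray();
        int len = delete(buf, 1, buf.length);         // removes 'b'
        System.out.println(new String(buf, 0, len));  // prints "acdef"
        len = deleteN(buf, 1, len, 2);                // removes 'c' and 'd'
        System.out.println(new String(buf, 0, len));  // prints "aef"
    }
}
```

Note that the array itself keeps its original size. Only the returned length tells the caller how many leading characters are still valid.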


Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.