Package org.tartarus.snowball
Class SnowballProgram
java.lang.Object
  org.tartarus.snowball.SnowballProgram

Direct Known Subclasses:
ArabicStemmer, ArmenianStemmer, BasqueStemmer, CatalanStemmer, DanishStemmer, DutchStemmer, EnglishStemmer, EstonianStemmer, FinnishStemmer, FrenchStemmer, German2Stemmer, GermanStemmer, HungarianStemmer, IrishStemmer, ItalianStemmer, KpStemmer, LithuanianStemmer, LovinsStemmer, NorwegianStemmer, PorterStemmer, PortugueseStemmer, RomanianStemmer, RussianStemmer, SpanishStemmer, SwedishStemmer, TurkishStemmer
public abstract class SnowballProgram extends Object
This is rev 502 of the Snowball SVN trunk, now located at GitHub, but modified:
- made abstract and introduced abstract method stem to avoid expensive reflection in filter class.
- refactored StringBuffers to StringBuilder
- uses char[] as buffer instead of StringBuffer/StringBuilder
- eq_s, eq_s_b, insert, replace_s take CharSequence like eq_v and eq_v_b
- use MethodHandles and fix method visibility bug.
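The first modification above (an abstract stem method in place of a reflective lookup) can be illustrated with a minimal sketch. The classes MiniProgram, TrailingSStemmer and AbstractStemDemo are hypothetical stand-ins written for this example; they are not the Lucene source and only mirror the shape of setCurrent, getCurrent and stem as listed on this page.

```java
// Hypothetical stand-in for the pattern described above; not the Lucene source.
abstract class MiniProgram {
    protected char[] current = new char[0];
    protected int length;

    public void setCurrent(String value) {
        current = value.toCharArray();
        length = current.length;
    }

    public String getCurrent() {
        return new String(current, 0, length);
    }

    // Each concrete stemmer implements this; callers need no reflection.
    public abstract boolean stem();
}

// Toy "stemmer" that drops a single trailing 's'. Purely illustrative.
class TrailingSStemmer extends MiniProgram {
    @Override
    public boolean stem() {
        if (length > 1 && current[length - 1] == 's') {
            length--;
            return true;
        }
        return false;
    }
}

class AbstractStemDemo {
    static String stemWith(MiniProgram program, String word) {
        program.setCurrent(word);
        program.stem(); // ordinary virtual call through the abstract base type
        return program.getCurrent();
    }

    public static void main(String[] args) {
        System.out.println(stemWith(new TrailingSStemmer(), "cats")); // prints "cat"
    }
}
```

Because the caller only holds the abstract base type, any subclass can be swapped in without looking up a method by name at run time.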
Field Summary
Fields
Modifier and Type    Field
protected int        bra
protected int        cursor
protected int        ket
protected int        limit
protected int        limit_backward
-
Constructor Summary
Constructors
Modifier     Constructor
protected    SnowballProgram()
-
Method Summary
Modifier and Type        Method                               Description
protected StringBuilder  assign_to(StringBuilder s)
protected void           copy_from(SnowballProgram other)
protected boolean        eq_s(int s_size, CharSequence s)
protected boolean        eq_s_b(int s_size, CharSequence s)
protected boolean        eq_v(CharSequence s)
protected boolean        eq_v_b(CharSequence s)
protected int            find_among(Among[] v, int v_size)
protected int            find_among_b(Among[] v, int v_size)
String                   getCurrent()                         Get the current string.
char[]                   getCurrentBuffer()                   Get the current buffer containing the stem.
int                      getCurrentBufferLength()             Get the valid length of the character array in getCurrentBuffer().
protected boolean        in_grouping(char[] s, int min, int max)
protected boolean        in_grouping_b(char[] s, int min, int max)
protected boolean        in_range(int min, int max)
protected boolean        in_range_b(int min, int max)
protected void           insert(int c_bra, int c_ket, CharSequence s)
protected boolean        out_grouping(char[] s, int min, int max)
protected boolean        out_grouping_b(char[] s, int min, int max)
protected boolean        out_range(int min, int max)
protected boolean        out_range_b(int min, int max)
protected int            replace_s(int c_bra, int c_ket, CharSequence s)
void                     setCurrent(char[] text, int length)  Set the current string.
void                     setCurrent(String value)             Set the current string.
protected void           slice_check()
protected void           slice_del()
protected void           slice_from(CharSequence s)
protected StringBuilder  slice_to(StringBuilder s)
abstract boolean         stem()
-
-
-
Method Detail
-
stem
public abstract boolean stem()
-
setCurrent
public void setCurrent(String value)
Set the current string.
-
getCurrent
public String getCurrent()
Get the current string.
-
setCurrent
public void setCurrent(char[] text, int length)
Set the current string.
Parameters:
text - character array containing input
length - valid length of text.
-
getCurrentBuffer
public char[] getCurrentBuffer()
Get the current buffer containing the stem.
NOTE: this may be a reference to a different character array than the one originally provided with setCurrent, in the exceptional case that stemming produced a longer intermediate or result string.
It is necessary to use getCurrentBufferLength() to determine the valid length of the returned buffer. For example, many words are stemmed simply by subtracting from the length to remove suffixes.
See Also:
getCurrentBufferLength()
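The buffer and length contract described above can be sketched with a small stand-in. MiniBufferProgram, its dropSuffix and append helpers, and BufferLengthDemo are hypothetical names invented for this illustration and are not part of SnowballProgram. The sketch also assumes that the caller's array is reused rather than copied, which is what the note about "a different character array" suggests but does not spell out.

```java
// Hypothetical stand-in illustrating the buffer/length contract; not the Lucene source.
class MiniBufferProgram {
    private char[] current = new char[0];
    private int length;

    public void setCurrent(char[] text, int length) {
        this.current = text;   // assumption: the caller's array is reused, not copied
        this.length = length;
    }

    public char[] getCurrentBuffer() { return current; }

    public int getCurrentBufferLength() { return length; }

    // Toy suffix removal: only the valid length shrinks, the array is untouched.
    public boolean dropSuffix(String suffix) {
        int n = suffix.length();
        if (length <= n) return false;
        for (int i = 0; i < n; i++) {
            if (current[length - n + i] != suffix.charAt(i)) return false;
        }
        length -= n;
        return true;
    }

    // Toy append: may need a larger array, so the buffer reference can change.
    public void append(String s) {
        if (length + s.length() > current.length) {
            current = java.util.Arrays.copyOf(current, length + s.length());
        }
        s.getChars(0, s.length(), current, length);
        length += s.length();
    }
}

class BufferLengthDemo {
    // Always pair the buffer with its valid length when reading the result.
    static String read(MiniBufferProgram p) {
        return new String(p.getCurrentBuffer(), 0, p.getCurrentBufferLength());
    }

    public static void main(String[] args) {
        char[] input = "walking".toCharArray();
        MiniBufferProgram p = new MiniBufferProgram();
        p.setCurrent(input, input.length);

        p.dropSuffix("ing");
        System.out.println(read(p));                       // prints "walk"
        System.out.println(p.getCurrentBuffer() == input); // prints "true"

        p.append("ways");
        System.out.println(read(p));                       // prints "walkways"
        System.out.println(p.getCurrentBuffer() == input); // prints "false"
    }
}
```

After dropSuffix the array still holds all seven original characters, so reading it without the valid length would return the unstemmed word. After append the result no longer fits, so the stand-in allocates a new array and the original reference goes stale.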
-
getCurrentBufferLength
public int getCurrentBufferLength()
Get the valid length of the character array in getCurrentBuffer().
Returns:
valid length of the array.
-
copy_from
protected void copy_from(SnowballProgram other)
-
in_grouping
protected boolean in_grouping(char[] s, int min, int max)
-
in_grouping_b
protected boolean in_grouping_b(char[] s, int min, int max)
-
out_grouping
protected boolean out_grouping(char[] s, int min, int max)
-
out_grouping_b
protected boolean out_grouping_b(char[] s, int min, int max)
-
in_range
protected boolean in_range(int min, int max)
-
in_range_b
protected boolean in_range_b(int min, int max)
-
out_range
protected boolean out_range(int min, int max)
-
out_range_b
protected boolean out_range_b(int min, int max)
-
eq_s
protected boolean eq_s(int s_size, CharSequence s)
-
eq_s_b
protected boolean eq_s_b(int s_size, CharSequence s)
-
eq_v
protected boolean eq_v(CharSequence s)
-
eq_v_b
protected boolean eq_v_b(CharSequence s)
-
find_among
protected int find_among(Among[] v, int v_size)
-
find_among_b
protected int find_among_b(Among[] v, int v_size)
-
replace_s
protected int replace_s(int c_bra, int c_ket, CharSequence s)
-
slice_check
protected void slice_check()
-
slice_from
protected void slice_from(CharSequence s)
-
slice_del
protected void slice_del()
-
insert
protected void insert(int c_bra, int c_ket, CharSequence s)
-
slice_to
protected StringBuilder slice_to(StringBuilder s)
-
assign_to
protected StringBuilder assign_to(StringBuilder s)
-
-