org.apache.lucene.analysis.nl
Class DutchStemFilter

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by org.apache.lucene.analysis.TokenStream
          extended by org.apache.lucene.analysis.TokenFilter
              extended by org.apache.lucene.analysis.nl.DutchStemFilter
All Implemented Interfaces:
Closeable

public final class DutchStemFilter
extends org.apache.lucene.analysis.TokenFilter

A TokenFilter that stems Dutch words.

It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a DutchStemmer).

NOTE: This stemmer does not implement the Snowball algorithm correctly; in particular, it mishandles doubled consonants. Consider using the "Dutch" stemmer in the snowball package instead. This stemmer will likely be deprecated in a future release.


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
 
Field Summary
 
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
 
Constructor Summary
DutchStemFilter(org.apache.lucene.analysis.TokenStream _in)
           
DutchStemFilter(org.apache.lucene.analysis.TokenStream _in, Set exclusiontable)
          Builds a DutchStemFilter that uses an exclusion table.
DutchStemFilter(org.apache.lucene.analysis.TokenStream _in, Set exclusiontable, Map stemdictionary)
           
 
Method Summary
 boolean incrementToken()
          Advances the stream to the next token; returns false at end of stream (EOS).
 void setExclusionTable(HashSet exclusiontable)
          Set an alternative exclusion list for this filter.
 void setStemDictionary(HashMap dict)
          Set the dictionary used for stemming. This dictionary overrules the algorithm, so you can correct a particular unwanted word-stem pair.
 void setStemmer(DutchStemmer stemmer)
          Set an alternative/custom DutchStemmer for this filter.
 
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end, reset
 
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DutchStemFilter

public DutchStemFilter(org.apache.lucene.analysis.TokenStream _in)

DutchStemFilter

public DutchStemFilter(org.apache.lucene.analysis.TokenStream _in,
                       Set exclusiontable)
Builds a DutchStemFilter that uses an exclusion table.


DutchStemFilter

public DutchStemFilter(org.apache.lucene.analysis.TokenStream _in,
                       Set exclusiontable,
                       Map stemdictionary)
Parameters:
stemdictionary - Dictionary of word-stem pairs that overrule the algorithm

Method Detail

incrementToken

public boolean incrementToken()
                       throws IOException
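Advances the stream to the next token. Returns true if a token is available, or false at end of stream (EOS). The method returns a boolean, not the token itself; the current token is read through the stream's attributes.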

Specified by:
incrementToken in class org.apache.lucene.analysis.TokenStream
Throws:
IOException
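
The boolean contract of incrementToken can be illustrated with a minimal stand-alone sketch. It does not use any Lucene classes: ToyStream, term() and drain() are hypothetical names introduced only for this illustration, and the real method may additionally throw IOException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-alone model of the "advance, then read" consumption loop.
// Nothing here is part of the Lucene API.
class IncrementTokenLoopSketch {

    // Hypothetical stand-in for a token stream backed by a fixed list of terms.
    static class ToyStream {
        private final Iterator<String> source;
        private String term;

        ToyStream(List<String> terms) {
            this.source = terms.iterator();
        }

        // Advances to the next token; returns false once the input is exhausted.
        boolean incrementToken() {
            if (!source.hasNext()) {
                return false;
            }
            term = source.next();
            return true;
        }

        // Returns the term of the token the stream is currently positioned on.
        String term() {
            return term;
        }
    }

    // Consumes the stream until incrementToken() reports end of stream.
    static List<String> drain(ToyStream stream) {
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyStream stream = new ToyStream(Arrays.asList("de", "fietsen"));
        System.out.println(drain(stream));
    }
}
```

The loop never tests for null: the caller advances first and reads the current term only after the call has returned true.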

setStemmer

public void setStemmer(DutchStemmer stemmer)
Set an alternative/custom DutchStemmer for this filter.


setExclusionTable

public void setExclusionTable(HashSet exclusiontable)
Set an alternative exclusion list for this filter.


setStemDictionary

public void setStemDictionary(HashMap dict)
Set the dictionary used for stemming. This dictionary overrules the algorithm, so you can correct a particular unwanted word-stem pair.
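
How the exclusion table and the stem dictionary might interact can be sketched as follows. This is a stand-alone model, not the filter's implementation: it uses no Lucene classes, toyAlgorithm is a hypothetical placeholder for the real DutchStemmer rules, and checking the exclusion table before the dictionary is an assumption that this page does not state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-alone model of the lookup order; it does not use any Lucene classes.
class StemLookupSketch {

    // Hypothetical placeholder for the algorithmic stemmer: it only strips a
    // trailing "en" from words longer than four characters. The real
    // DutchStemmer applies many more rules.
    static String toyAlgorithm(String term) {
        if (term.length() > 4 && term.endsWith("en")) {
            return term.substring(0, term.length() - 2);
        }
        return term;
    }

    static String stem(String term, Set<String> exclusions, Map<String, String> dict) {
        // Assumed order, step 1: excluded words are not stemmed at all.
        if (exclusions != null && exclusions.contains(term)) {
            return term;
        }
        // Step 2: a dictionary entry overrules the algorithm.
        if (dict != null && dict.containsKey(term)) {
            return dict.get(term);
        }
        // Step 3: fall back to the algorithmic stemmer.
        return toyAlgorithm(term);
    }

    public static void main(String[] args) {
        Set<String> exclusions = new HashSet<>();
        exclusions.add("lopen");

        Map<String, String> dict = new HashMap<>();
        dict.put("kinderen", "kind");

        System.out.println(stem("lopen", exclusions, dict));    // lopen (excluded)
        System.out.println(stem("kinderen", exclusions, dict)); // kind (dictionary)
        System.out.println(stem("werken", exclusions, dict));   // werk (toy algorithm)
    }
}
```

In this model the dictionary entry corrects the toy algorithm, which would otherwise reduce "kinderen" to "kinder". That is the kind of unwanted word-stem pair the stem dictionary is meant to fix.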



Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.