java.lang.Object
- org.apache.lucene.util.AttributeSource
- - org.apache.lucene.analysis.TokenStream
  - - org.apache.lucene.analysis.TokenFilter
    - - org.apache.lucene.analysis.miscellaneous.HyphenatedWordsFilter

All Implemented Interfaces: Closeable, AutoCloseable

public final class HyphenatedWordsFilter
extends TokenFilter

When plain text is extracted from documents, many words are often hyphenated and broken across two lines. This is common in documents that use narrow text columns, such as newsletters. So that such words can be matched as whole terms at search time, this filter joins hyphenated words that were broken across two lines back together. This filter should be used at index time only. Example field definition in schema.xml:

 <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"/>
     <filter class="solr.HyphenatedWordsFilterFactory"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldtype>
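
The joining behavior described above can be illustrated with a small plain-Java sketch. This is not the Lucene implementation and uses no Lucene classes: the class name `HyphenJoinSketch` and the method `join` are hypothetical, tokens are modeled as plain strings, and the handling of a hyphenated fragment at the very end of the input (emitted with its hyphen restored) is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration only; this is not the Lucene implementation. */
final class HyphenJoinSketch {

    /**
     * Joins each token that ends with '-' to the token that follows it.
     * Hypothetical helper: the name and signature are invented for this sketch.
     */
    static List<String> join(List<String> tokens) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = new StringBuilder(); // fragments of a word split by hyphens
        boolean open = false;                        // true while a hyphenated word is unfinished

        for (String token : tokens) {
            if (!token.isEmpty() && token.charAt(token.length() - 1) == '-') {
                // First or middle fragment: keep the text, drop the trailing hyphen.
                pending.append(token, 0, token.length() - 1);
                open = true;
            } else if (open) {
                // Final fragment: emit the joined word and start over.
                pending.append(token);
                out.add(pending.toString());
                pending.setLength(0);
                open = false;
            } else {
                // Ordinary token: pass it through unchanged.
                out.add(token);
            }
        }
        if (open) {
            // Assumption of this sketch: the input ended on a hyphenated
            // fragment, so emit it with the hyphen restored.
            out.add(pending.append('-').toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("eco-", "logical", "develop-", "ment", "news");
        System.out.println(join(tokens)); // prints [ecological, development, news]
    }
}
```

The sketch assumes whitespace-tokenized input, so a line break inside a word leaves the first fragment with a trailing hyphen, which matches the tokenizer choice in the field definition above.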

- Nested Class Summary
  - Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
    AttributeSource.State
- Field Summary
  - Fields inherited from class org.apache.lucene.analysis.TokenFilter
    input
  - Fields inherited from class org.apache.lucene.analysis.TokenStream
    DEFAULT_TOKEN_ATTRIBUTE_FACTORY
- Constructor Summary
  
  Constructors
  Constructor Description
  
  HyphenatedWordsFilter(TokenStream in)
  Creates a new HyphenatedWordsFilter
- Method Summary
  
  All Methods Instance Methods Concrete Methods
  Modifier and Type Method Description
  
  boolean incrementToken()
  
  void reset()
  - Methods inherited from class org.apache.lucene.analysis.TokenFilter
    close, end
  - Methods inherited from class org.apache.lucene.util.AttributeSource
    addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
  - Methods inherited from class java.lang.Object
    clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - HyphenatedWordsFilter
```
public HyphenatedWordsFilter(TokenStream in)
```
    Creates a new HyphenatedWordsFilter
    
    Parameters:
    
    in - TokenStream that will be filtered
- Method Detail
  - incrementToken
```
public boolean incrementToken()
                       throws IOException
```
    Specified by:
    
    incrementToken in class TokenStream
    
    Throws:
    
    IOException
  - reset
```
public void reset()
           throws IOException
```
    Overrides:
    
    reset in class TokenFilter
    
    Throws:
    
    IOException

