LowerCaseTokenizer (Lucene 3.6.0 API)

java.lang.Object
- org.apache.lucene.util.AttributeSource
  - org.apache.lucene.analysis.TokenStream
    - org.apache.lucene.analysis.Tokenizer
      - org.apache.lucene.analysis.CharTokenizer
        - org.apache.lucene.analysis.LetterTokenizer
          - org.apache.lucene.analysis.LowerCaseTokenizer

All Implemented Interfaces:

Closeable
```
public final class LowerCaseTokenizer
extends LetterTokenizer
```
LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. It divides text at non-letters and converts the resulting tokens to lower case. Although it is functionally equivalent to chaining LetterTokenizer and LowerCaseFilter, doing both tasks in a single pass is faster, hence this (redundant) implementation.
Note: this works reasonably well for most European languages but poorly for some Asian languages, where words are not separated by spaces.
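
The behavior described above can be illustrated without Lucene on the classpath. The following is a minimal plain-Java sketch, not the library's implementation: the class name `LowerCaseSplitSketch` and its `tokenize` method are hypothetical, and the sketch ignores offsets, attributes, and buffering. It splits at non-letters and lower-cases in the same pass, and it also shows why text written without spaces ends up as a single token.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration only; hypothetical class, not Lucene's implementation. */
final class LowerCaseSplitSketch {

    /** Splits text at non-letters and lower-cases each letter in the same pass. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isLetter(cp)) {
                // Letter: normalize and keep building the current token.
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // Non-letter: close the token that was in progress, if any.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Digits and punctuation are non-letters, so they only act as separators.
        System.out.println(tokenize("Hello, World! It's 2012."));
        // Four CJK ideographs with no separators: every code point is a letter,
        // so the whole run comes back as one token.
        System.out.println(tokenize("\u6211\u7231\u5317\u4EAC").size());
    }
}
```

In this sketch the first call yields the tokens `hello`, `world`, `it`, and `s`, while the second call reports a single token, which is the limitation the note above describes.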

You must specify the required Version compatibility when creating LowerCaseTokenizer:
- As of 3.1, CharTokenizer uses an int based API to normalize and detect token characters. See CharTokenizer.isTokenChar(int) and CharTokenizer.normalize(int) for details.
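
To see why an int based (code point) API matters, consider a character outside the Basic Multilingual Plane, which Java stores as a surrogate pair of two `char` values. The sketch below uses only `java.lang.Character`; the class `CodePointNormalizeSketch` and both helper methods are hypothetical names introduced for illustration and are not part of Lucene.

```java
/** Hypothetical helper contrasting char-based and int (code point) based normalization. */
final class CodePointNormalizeSketch {

    /** Lower-cases one UTF-16 char at a time; surrogate halves pass through unchanged. */
    static String lowerByChar(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            out.append(Character.toLowerCase(s.charAt(i)));
        }
        return out.toString();
    }

    /** Lower-cases one code point at a time, so supplementary characters are handled. */
    static String lowerByCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+10400 DESERET CAPITAL LETTER LONG I, encoded as a surrogate pair.
        String deseret = "\uD801\uDC00";
        // The char-based loop leaves the pair untouched.
        System.out.println(lowerByChar(deseret).equals(deseret));
        // The code point loop maps it to U+10428, the lower-case form.
        System.out.println(lowerByCodePoint(deseret).equals("\uD801\uDC28"));
    }
}
```

Both comparisons evaluate to true: only the code point loop recognizes the pair as a single upper-case letter.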

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
  AttributeSource.AttributeFactory, AttributeSource.State

Field Summary
- Fields inherited from class org.apache.lucene.analysis.Tokenizer
  input

Constructor Summary

Constructors
Constructor and Description
`LowerCaseTokenizer(AttributeSource.AttributeFactory factory, Reader in)` Deprecated. Use `LowerCaseTokenizer(Version, AttributeSource.AttributeFactory, Reader)` instead. This will be removed in Lucene 4.0.
`LowerCaseTokenizer(AttributeSource source, Reader in)` Deprecated. Use `LowerCaseTokenizer(Version, AttributeSource, Reader)` instead. This will be removed in Lucene 4.0.
`LowerCaseTokenizer(Reader in)` Deprecated. Use `LowerCaseTokenizer(Version, Reader)` instead. This will be removed in Lucene 4.0.
`LowerCaseTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in)` Construct a new LowerCaseTokenizer using a given `AttributeSource.AttributeFactory`.
`LowerCaseTokenizer(Version matchVersion, AttributeSource source, Reader in)` Construct a new LowerCaseTokenizer using a given `AttributeSource`.
`LowerCaseTokenizer(Version matchVersion, Reader in)` Construct a new LowerCaseTokenizer.

Method Summary

Methods
Modifier and Type	Method and Description
`protected int`	`normalize(int c)` Converts a code point to lower case using `Character.toLowerCase(int)`.
- Methods inherited from class org.apache.lucene.analysis.LetterTokenizer
  isTokenChar
- Methods inherited from class org.apache.lucene.analysis.CharTokenizer
  end, incrementToken, isTokenChar, normalize, reset
- Methods inherited from class org.apache.lucene.analysis.Tokenizer
  close, correctOffset
- Methods inherited from class org.apache.lucene.analysis.TokenStream
  reset
- Methods inherited from class org.apache.lucene.util.AttributeSource
  addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
- Methods inherited from class java.lang.Object
  clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - LowerCaseTokenizer
```
public LowerCaseTokenizer(Version matchVersion,
                          Reader in)
```
    Construct a new LowerCaseTokenizer.
    
    Parameters:
    matchVersion - Lucene version to match; see above
    in - the input to split up into tokens
  - LowerCaseTokenizer
```
public LowerCaseTokenizer(Version matchVersion,
                          AttributeSource source,
                          Reader in)
```
    Construct a new LowerCaseTokenizer using a given AttributeSource.
    
    Parameters:
    matchVersion - Lucene version to match; see above
    source - the attribute source to use for this Tokenizer
    in - the input to split up into tokens
  - LowerCaseTokenizer
```
public LowerCaseTokenizer(Version matchVersion,
                          AttributeSource.AttributeFactory factory,
                          Reader in)
```
    Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory.
    
    Parameters:
    matchVersion - Lucene version to match; see above
    factory - the attribute factory to use for this Tokenizer
    in - the input to split up into tokens
  - LowerCaseTokenizer
```
@Deprecated
public LowerCaseTokenizer(Reader in)
```
    Deprecated. Use LowerCaseTokenizer(Version, Reader) instead. This will be removed in Lucene 4.0.
    
    Construct a new LowerCaseTokenizer.
  - LowerCaseTokenizer
```
@Deprecated
public LowerCaseTokenizer(AttributeSource source,
                          Reader in)
```
    Deprecated. Use LowerCaseTokenizer(Version, AttributeSource, Reader) instead. This will be removed in Lucene 4.0.
    
    Construct a new LowerCaseTokenizer using a given AttributeSource.
  - LowerCaseTokenizer
```
@Deprecated
public LowerCaseTokenizer(AttributeSource.AttributeFactory factory,
                          Reader in)
```
    Deprecated. Use LowerCaseTokenizer(Version, AttributeSource.AttributeFactory, Reader) instead. This will be removed in Lucene 4.0.
    
    Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory.
- Method Detail
  - normalize
```
protected int normalize(int c)
```
    Converts a code point to lower case using Character.toLowerCase(int).
    
    Overrides:
    
    normalize in class CharTokenizer
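
The override relationship can be pictured as a template-method arrangement: a base class drives the splitting loop and calls two hooks, one that decides which code points belong to a token and one that normalizes them. The sketch below is a simplified plain-Java model under that assumption. All three class names and the `split` method are hypothetical, the identity default for `normalize` is a choice made for this sketch, and none of it is the Lucene source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a CharTokenizer-style base class; not the Lucene source. */
abstract class CharSplitterSketch {

    /** Returns true if the code point belongs inside a token. */
    protected abstract boolean isTokenChar(int c);

    /** Hook applied to every token code point; identity in this sketch. */
    protected int normalize(int c) {
        return c;
    }

    /** Drives the loop and delegates both decisions to the hooks above. */
    final List<String> split(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isTokenChar(cp)) {
                current.appendCodePoint(normalize(cp));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}

/** Keeps runs of letters, leaving their case untouched. */
class LetterSplitterSketch extends CharSplitterSketch {
    @Override
    protected boolean isTokenChar(int c) {
        return Character.isLetter(c);
    }
}

/** Inherits the letter test and overrides only the normalization hook. */
final class LowerCaseSplitterSketch extends LetterSplitterSketch {
    @Override
    protected int normalize(int c) {
        return Character.toLowerCase(c);
    }
}
```

With these classes, `new LetterSplitterSketch().split("Foo-BAR")` returns `Foo` and `BAR`, while `new LowerCaseSplitterSketch().split("Foo-BAR")` returns `foo` and `bar`. Only the one hook differs between the two subclasses.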
