ICUFoldingFilter (Lucene 8.9.0 API)

java.lang.Object
- org.apache.lucene.util.AttributeSource
  - org.apache.lucene.analysis.TokenStream
    - org.apache.lucene.analysis.TokenFilter
      - org.apache.lucene.analysis.icu.ICUNormalizer2Filter
        - org.apache.lucene.analysis.icu.ICUFoldingFilter

All Implemented Interfaces:

Closeable, AutoCloseable
```
public final class ICUFoldingFilter
extends ICUNormalizer2Filter
```
A TokenFilter that applies search term folding to Unicode text, using the foldings from UTR#30 Character Foldings.
This filter applies the following foldings from the report:
- Accent removal
- Case folding
- Canonical duplicates folding
- Dashes folding
- Diacritic removal (including stroke, hook, descender)
- Greek letterforms folding
- Han Radical folding
- Hebrew Alternates folding
- Jamo folding
- Letterforms folding
- Math symbol folding
- Multigraph Expansions: All
- Native digit folding
- No-break folding
- Overline folding
- Positional forms folding
- Small forms folding
- Space folding
- Spacing Accents folding
- Subscript folding
- Superscript folding
- Suzhou Numeral folding
- Symbol folding
- Underline folding
- Vertical forms folding
- Width folding
Additionally, Default Ignorables are removed, and text is normalized to NFKC. All foldings, case folding, and normalization mappings are applied recursively to ensure a fully folded and normalized result.

A normalizer with additional settings, such as a filter that lists characters not to be normalized, can be passed to the constructor.
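
One way to build such a normalizer is sketched below, assuming ICU4J's `FilteredNormalizer2` and `UnicodeSet` are used to wrap the default `NORMALIZER`. The class name `FilteredFoldingExample`, the helper `foldingExcept`, and the Swedish character set are hypothetical choices for illustration only. Note that `FilteredNormalizer2` normalizes only the characters inside its set, so the characters to protect are expressed as the complement.

```java
import com.ibm.icu.text.FilteredNormalizer2;
import com.ibm.icu.text.Normalizer2;
import com.ibm.icu.text.UnicodeSet;
import org.apache.lucene.analysis.icu.ICUFoldingFilter;

// Hypothetical helper class for illustration; not part of Lucene.
final class FilteredFoldingExample {

  // Returns a normalizer that folds everything except the characters matched
  // by the given UnicodeSet pattern, for example "[åäöÅÄÖ]".
  static Normalizer2 foldingExcept(String protectedPattern) {
    // FilteredNormalizer2 only normalizes characters INSIDE its set, so the
    // protected characters are removed by taking the complement.
    UnicodeSet toFold = new UnicodeSet(protectedPattern).complement().freeze();
    return new FilteredNormalizer2(ICUFoldingFilter.NORMALIZER, toFold);
  }

  public static void main(String[] args) {
    // Assumption: precomposed Swedish letters should survive folding.
    Normalizer2 swedish = foldingExcept("[\u00E5\u00E4\u00F6\u00C5\u00C4\u00D6]");
    String input = "\u00C5ngstr\u00F6m Caf\u00E9";

    // Default folding compared with the filtered variant.
    System.out.println(ICUFoldingFilter.NORMALIZER.normalize(input));
    System.out.println(swedish.normalize(input));
  }
}
```

The resulting `Normalizer2` can then be handed to the two-argument constructor. The filter matches code points as they appear in the input, so decomposed forms (a base letter followed by a combining mark) would not be protected by a set of precomposed letters.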

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
  AttributeSource.State

Field Summary

Fields
Modifier and Type	Field and Description
`static com.ibm.icu.text.Normalizer2`	`NORMALIZER` A normalizer that applies search term folding to Unicode text, using the foldings from UTR#30 Character Foldings.

Fields inherited from class org.apache.lucene.analysis.TokenFilter
input

Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY

Constructor Summary

Constructors
Constructor and Description
`ICUFoldingFilter(TokenStream input)` Create a new ICUFoldingFilter on the specified input.
`ICUFoldingFilter(TokenStream input, com.ibm.icu.text.Normalizer2 normalizer)` Create a new ICUFoldingFilter on the specified input with the specified normalizer.

Method Summary
- Methods inherited from class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
  incrementToken
- Methods inherited from class org.apache.lucene.analysis.TokenFilter
  close, end, reset
- Methods inherited from class org.apache.lucene.util.AttributeSource
  addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
- Methods inherited from class java.lang.Object
  clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Field Detail
  - NORMALIZER
```
public static final com.ibm.icu.text.Normalizer2 NORMALIZER
```
    A normalizer that applies search term folding to Unicode text, using the foldings from UTR#30 Character Foldings.
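
Because `NORMALIZER` is an ordinary ICU `Normalizer2`, it can also be applied to plain strings outside an analysis chain, for example to fold a query string by hand. The following is a minimal sketch; the class name `NormalizerConstantExample` and the helper `fold` are hypothetical.

```java
import com.ibm.icu.text.Normalizer2;
import org.apache.lucene.analysis.icu.ICUFoldingFilter;

// Hypothetical helper class for illustration; not part of Lucene.
final class NormalizerConstantExample {

  // Folds a whole string with the same normalizer the filter uses by default.
  static String fold(String text) {
    Normalizer2 folding = ICUFoldingFilter.NORMALIZER;
    return folding.normalize(text);
  }

  public static void main(String[] args) {
    // Accent removal and case folding applied together.
    System.out.println(fold("R\u00E9sum\u00E9")); // prints "resume"
  }
}
```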
- Constructor Detail
  - ICUFoldingFilter
```
public ICUFoldingFilter(TokenStream input)
```
    Create a new ICUFoldingFilter on the specified input.
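
A minimal usage sketch for this constructor follows, assuming a `StandardTokenizer` as the upstream source (any `TokenStream` would do). The class name `FoldingFilterExample` and the helper `foldedTerms` are hypothetical. The sketch follows the usual `TokenStream` workflow: `reset()`, repeated `incrementToken()`, `end()`, then `close()`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.icu.ICUFoldingFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class for illustration; not part of Lucene.
final class FoldingFilterExample {

  // Tokenizes the text and returns each term after folding.
  static List<String> foldedTerms(String text) throws IOException {
    Tokenizer tokenizer = new StandardTokenizer();
    tokenizer.setReader(new StringReader(text));

    List<String> terms = new ArrayList<>();
    // Closing the filter also closes the wrapped tokenizer.
    try (TokenStream stream = new ICUFoldingFilter(tokenizer)) {
      CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
      stream.reset();
      while (stream.incrementToken()) {
        terms.add(term.toString());
      }
      stream.end();
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(foldedTerms("R\u00E9sum\u00E9 CAF\u00C9")); // prints "[resume, cafe]"
  }
}
```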
  - ICUFoldingFilter
```
public ICUFoldingFilter(TokenStream input,
                        com.ibm.icu.text.Normalizer2 normalizer)
```
    Create a new ICUFoldingFilter on the specified input with the specified normalizer.
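
In practice the filter is usually wired into an `Analyzer`. The sketch below shows one possible arrangement in which the normalizer is injected through the analyzer's constructor and forwarded to the two-argument `ICUFoldingFilter` constructor. The class name `CustomFoldingAnalyzer` is hypothetical, and the choice of `StandardTokenizer` is an assumption.

```java
import com.ibm.icu.text.Normalizer2;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.icu.ICUFoldingFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical analyzer for illustration; not part of Lucene.
final class CustomFoldingAnalyzer extends Analyzer {

  private final Normalizer2 normalizer;

  // The caller decides which normalizer to use, for example
  // ICUFoldingFilter.NORMALIZER or a filtered variant of it.
  CustomFoldingAnalyzer(Normalizer2 normalizer) {
    this.normalizer = normalizer;
  }

  @Override
  protected TokenStreamComponents createComponents(String fieldName) {
    Tokenizer source = new StandardTokenizer();
    // Forward the injected normalizer to the two-argument constructor.
    TokenStream result = new ICUFoldingFilter(source, normalizer);
    return new TokenStreamComponents(source, result);
  }
}
```

Passing `ICUFoldingFilter.NORMALIZER` here should behave the same as using the single-argument constructor. The same analyzer should be used at index time and at query time so that both sides fold terms identically.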

Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.