Class HTMLStripCharFilter
java.lang.Object
java.io.Reader
org.apache.lucene.analysis.CharFilter
org.apache.lucene.analysis.charfilter.BaseCharFilter
org.apache.lucene.analysis.charfilter.HTMLStripCharFilter
- All Implemented Interfaces:
Closeable, AutoCloseable, Readable
A CharFilter that wraps another Reader and attempts to strip out HTML constructs.
-
Field Summary
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
-
Constructor Summary
Constructor - Description
HTMLStripCharFilter(Reader in) - Creates a new scanner.
HTMLStripCharFilter(Reader in, Set<String> escapedTags) - Creates a new HTMLStripCharFilter over the provided Reader with the specified start and end tags.
-
Method Summary
Methods inherited from class org.apache.lucene.analysis.charfilter.BaseCharFilter
addOffCorrectMap, correct, getLastCumulativeDiff
Methods inherited from class org.apache.lucene.analysis.CharFilter
correctOffset
Methods inherited from class java.io.Reader
mark, markSupported, nullReader, read, read, ready, reset, skip, transferTo
-
Constructor Details
-
HTMLStripCharFilter
Creates a new HTMLStripCharFilter over the provided Reader with the specified start and end tags.
- Parameters:
in - Reader to strip HTML tags from.
escapedTags - Tags in this set (both start and end tags) will not be filtered out.
-
HTMLStripCharFilter
Creates a new scanner.
- Parameters:
in - the java.io.Reader to read input from.
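The two constructors above can be exercised with a short sketch like the one below. It assumes the Lucene analysis-common module is on the classpath. The class name `HtmlStripExample` and the helpers `drain`, `stripAll`, and `stripExcept` are hypothetical names introduced only for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Set;

import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;

// Hypothetical helper class for illustration; not part of Lucene.
public class HtmlStripExample {

    // Reads everything from the given Reader via read(char[], int, int)
    // and closes it when done (try-with-resources calls close()).
    public static String drain(Reader reader) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[256];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Uses the single-argument constructor: no tags are exempt from stripping.
    public static String stripAll(String html) {
        return drain(new HTMLStripCharFilter(new StringReader(html)));
    }

    // Uses the two-argument constructor: tags named in escapedTags are kept.
    public static String stripExcept(String html, Set<String> escapedTags) {
        return drain(new HTMLStripCharFilter(new StringReader(html), escapedTags));
    }

    public static void main(String[] args) {
        String html = "Hello <b>world</b>";
        System.out.println(stripAll(html));
        System.out.println(stripExcept(html, Set.of("b")));
    }
}
```

With this input, the first call should print the text without the `<b>` markup, while the second should leave `<b>world</b>` in place because `b` is in the escaped set.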
-
-
Method Details
-
read()
- Overrides:
read in class Reader
- Throws:
IOException
-
read(char[], int, int)
- Specified by:
read in class Reader
- Throws:
IOException
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Overrides:
close in class CharFilter
- Throws:
IOException
-