org.apache.lucene.benchmark.byTask.feeds
Class DemoHTMLParser
java.lang.Object
org.apache.lucene.benchmark.byTask.feeds.DemoHTMLParser
- All Implemented Interfaces:
- HTMLParser
public class DemoHTMLParser
- extends Object
- implements HTMLParser
HTML Parser that is based on Lucene's demo HTML parser.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DemoHTMLParser
public DemoHTMLParser()
parse
public DocData parse(DocData docData,
String name,
Date date,
String title,
Reader reader,
DateFormat dateFormat)
throws IOException,
InterruptedException
- Description copied from interface: HTMLParser
- Parse the input Reader and return DocData.
The provided name, title, and date are used for the result unless they are null,
in which case an attempt is made to set them from the parsed data.
- Specified by:
parse in interface HTMLParser
- Parameters:
docData - result reused
name - name of the result doc data.
date - date of the result doc data. If null, attempt to set by parsed data.
title - title of the result doc data. If null, attempt to set by parsed data.
reader - reader of html text to parse.
dateFormat - date formatter to use for extracting the date.
- Returns:
- Parsed doc data.
- Throws:
IOException
InterruptedException
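The null-fallback rule described for parse can be illustrated with a minimal, self-contained sketch. It is not the Lucene implementation: the class ParseFallbackSketch and the helpers preferProvided and resolveDate are hypothetical names introduced only for this example, and the assumption that an unparseable date simply leaves the date unset is an interpretation of "an attempt is made", not documented behavior.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical illustration of "use the provided value unless it is null".
// None of these names exist in Lucene; they only model the documented contract.
final class ParseFallbackSketch {

    // Returns the caller-supplied value when it is non-null, otherwise the parsed one.
    static <T> T preferProvided(T provided, T parsed) {
        return provided != null ? provided : parsed;
    }

    // Resolves the date: a caller-supplied date wins; otherwise the text found in
    // the document is parsed with dateFormat. Assumption: a failed parse yields null.
    static Date resolveDate(Date provided, String parsedText, DateFormat dateFormat) {
        if (provided != null) {
            return provided;
        }
        if (parsedText == null || dateFormat == null) {
            return null;
        }
        try {
            return dateFormat.parse(parsedText.trim());
        } catch (ParseException e) {
            return null; // the attempt failed, so the date stays unset
        }
    }

    public static void main(String[] args) {
        DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        // No title supplied: the value taken from the document is used.
        System.out.println(preferProvided(null, "Title from the HTML"));
        // Title supplied: it overrides whatever the document contains.
        System.out.println(preferProvided("Explicit title", "Title from the HTML"));
        // No date supplied and the document text is not a date.
        System.out.println(resolveDate(null, "not a date", fmt));
    }
}
```

In terms of the parameters above, passing null for title or date corresponds to the first branch of each helper being skipped, so the value extracted from reader (with dateFormat for the date) would be used instead.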
Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.