Class TrecFBISParser
- java.lang.Object
  - org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
    - org.apache.lucene.benchmark.byTask.feeds.TrecFBISParser
public class TrecFBISParser extends TrecDocParser
Parser for the FBIS documents in the TREC disks 4+5 collection format.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
Field Summary
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
Constructor Summary
- TrecFBISParser()
Method Summary
- DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
  Parses the text prepared in docBuf into a result DocData. No synchronization is required.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
Method Detail
parse
public DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType) throws IOException
Description copied from class: TrecDocParser
Parses the text prepared in docBuf into a result DocData. No synchronization is required.
Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name that should be set on the result
trecSrc - calling TREC content source
docBuf - text to parse
pathType - type of parsed file, or null if unknown; may be used by parsers to alter their behavior according to the file path type
Throws:
IOException
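The contract described above (fill a reusable result from the text already collected in docBuf, then return that same result) can be illustrated with a minimal, self-contained sketch. This is not the Lucene implementation: the Result class is a hypothetical stand-in for DocData, the trecSrc and pathType parameters are omitted, and the tag names `<HEADER>` and `<TI>` are assumptions about the document layout chosen for illustration only.

```java
public class FbisParseSketch {

    /** Hypothetical stand-in for DocData: a mutable result that callers reuse. */
    public static final class Result {
        public String name;
        public String title;
        public String body;

        /** Resets all fields so that no data leaks from a previous document. */
        public void clear() {
            name = null;
            title = null;
            body = null;
        }
    }

    // Tag names below are assumptions made for this sketch.
    private static final String HEADER_END = "</HEADER>";
    private static final String TITLE = "<TI>";
    private static final String TITLE_END = "</TI>";

    /** Returns the trimmed text between the two tags, or null if either is missing. */
    static String extract(StringBuilder buf, String startTag, String endTag) {
        int start = buf.indexOf(startTag);
        if (start < 0) {
            return null;
        }
        start += startTag.length();
        int end = buf.indexOf(endTag, start);
        if (end < 0) {
            return null;
        }
        return buf.substring(start, end).trim();
    }

    /** Replaces anything shaped like <...> with a space and collapses whitespace. */
    static String stripTags(String text) {
        return text.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    /**
     * Parses the text in docBuf into the reusable result and returns it.
     * Only local variables and the arguments are touched, so the method
     * itself needs no synchronization.
     */
    public static Result parse(Result result, String name, StringBuilder docBuf) {
        String title = extract(docBuf, TITLE, TITLE_END);
        int headerEnd = docBuf.indexOf(HEADER_END);
        // Skip the header block when present; otherwise treat everything as body.
        int bodyStart = headerEnd < 0 ? 0 : headerEnd + HEADER_END.length();

        result.clear();
        result.name = name;
        result.title = title;
        result.body = stripTags(docBuf.substring(bodyStart));
        return result;
    }

    public static void main(String[] args) {
        StringBuilder docBuf = new StringBuilder(
            "<HEADER><TI> Sample Title </TI></HEADER>\n<TEXT>First line.\nSecond line.</TEXT>");
        Result reusable = new Result();
        Result parsed = parse(reusable, "FBIS-SAMPLE-1", docBuf);
        System.out.println(parsed.name + " | " + parsed.title + " | " + parsed.body);
    }
}
```

In this sketch the caller owns both the result object and the buffer. That is one plausible reading of "reusable result" and "no synchronization is required": each calling thread can keep its own pair and pass it in on every call.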