Class TrecParserByPath
java.lang.Object
  org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
    org.apache.lucene.benchmark.byTask.feeds.TrecParserByPath
public class TrecParserByPath extends TrecDocParser
Parser for TREC docs that selects the parser to apply according to the source file's path, defaulting to TrecGov2Parser.
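The path-based selection described above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene classes: `PathDispatchSketch`, its `PathType` enum, the `pathType` and `parserNameFor` helpers, and the directory names are all hypothetical stand-ins. The upward walk over parent directories is an assumption made for illustration, not a description of the library's implementation. The only behavior taken from this page is that an unrecognized path falls back to the GOV2 parser.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class PathDispatchSketch {

    /** Hypothetical path types for this sketch; not the library's enum. */
    public enum PathType { GOV2, FBIS, FT, LATIMES }

    /** Assumed default, matching the documented fallback to the GOV2 parser. */
    public static final PathType DEFAULT_PATH_TYPE = PathType.GOV2;

    /** Walks up the parent directories and returns the first type whose name matches. */
    public static PathType pathType(Path file) {
        for (Path dir = file.getParent(); dir != null; dir = dir.getParent()) {
            Path last = dir.getFileName();
            if (last == null) {
                continue; // a filesystem root has no name element
            }
            String name = last.toString().toUpperCase(Locale.ROOT);
            for (PathType t : PathType.values()) {
                if (t.name().equals(name)) {
                    return t;
                }
            }
        }
        return DEFAULT_PATH_TYPE; // nothing matched: use the default
    }

    /** Maps a path type (or null, meaning unknown) to a hypothetical parser label. */
    public static String parserNameFor(PathType type) {
        PathType effective = (type == null) ? DEFAULT_PATH_TYPE : type;
        switch (effective) {
            case FBIS:
                return "fbis-parser";
            case FT:
                return "ft-parser";
            case LATIMES:
                return "latimes-parser";
            default:
                return "gov2-parser";
        }
    }

    public static void main(String[] args) {
        Path p = Paths.get("corpus", "fbis", "fb396001.txt");
        System.out.println(p + " -> " + parserNameFor(pathType(p)));
    }
}
```

In this sketch the directory nearest to the file wins, and a null or unmatched type resolves to the default, so callers never need to special-case unknown sources.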
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
Field Summary
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
Constructor Summary
Constructors
TrecParserByPath()
Method Summary
Modifier and Type: DocData
Method: parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
Description: Parse the text prepared in docBuf into a result DocData; no synchronization is required.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
Method Detail
parse
public DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType) throws IOException
Description copied from class: TrecDocParser
Parse the text prepared in docBuf into a result DocData; no synchronization is required.
Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name that should be set to the result
trecSrc - calling trec content source
docBuf - text to parse
pathType - type of parsed file, or null if unknown; may be used by parsers to alter their behavior according to the file path type.
Throws:
IOException