Class TrecParserByPath
java.lang.Object
org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
org.apache.lucene.benchmark.byTask.feeds.TrecParserByPath
Parser for TREC docs that selects the parser to apply according to the source file's path, defaulting to TrecGov2Parser.
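The idea of choosing a parser from the source file's path can be sketched as follows. This is a minimal, self-contained illustration and not the Lucene implementation: the class `PathDispatchSketch`, the enum `Kind`, the directory names in `BY_DIR`, and the method `kindFor` are all hypothetical names introduced here. The real class works with `TrecDocParser.ParsePathType` and delegates to concrete parsers. The sketch only mirrors the documented behavior of falling back to a default when the path gives no hint.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;

public class PathDispatchSketch {
    // Hypothetical collection kinds; the real enum is TrecDocParser.ParsePathType.
    enum Kind { GOV2, FBIS, FT, FR94, LATIMES }

    // Assumed mapping from a directory name in the path to a collection kind.
    static final Map<String, Kind> BY_DIR = Map.of(
            "gov2", Kind.GOV2,
            "fbis", Kind.FBIS,
            "ft", Kind.FT,
            "fr94", Kind.FR94,
            "latimes", Kind.LATIMES);

    // Fallback used when no path element matches (mirrors "defaulting to TrecGov2Parser").
    static final Kind DEFAULT_KIND = Kind.GOV2;

    // Returns the kind of the first path element that matches, ignoring case.
    static Kind kindFor(Path file) {
        for (Path part : file) {
            Kind kind = BY_DIR.get(part.toString().toLowerCase(Locale.ROOT));
            if (kind != null) {
                return kind;
            }
        }
        return DEFAULT_KIND;
    }

    public static void main(String[] args) {
        System.out.println(kindFor(Paths.get("data", "FBIS", "fb396001")));   // prints FBIS
        System.out.println(kindFor(Paths.get("data", "unknown", "doc.txt"))); // prints GOV2
    }
}
```

In a dispatcher of this shape, the resolved kind would then be used to look up the concrete parser and forward the `parse` call to it.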
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
Field Summary
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
Constructor Summary
Method Summary
Modifier and Type: DocData
Method: parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
Description: Parse the text prepared in docBuf into a result DocData; no synchronization is required.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
Constructor Details
TrecParserByPath
public TrecParserByPath()
Method Details
parse
public DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType) throws IOException
Description copied from class: TrecDocParser
Parse the text prepared in docBuf into a result DocData; no synchronization is required.
Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name that should be set to the result
trecSrc - calling TREC content source
docBuf - text to parse
pathType - type of parsed file, or null if unknown; may be used by parsers to alter their behavior according to the file path type.
Throws:
IOException