public class TrecGov2Parser extends TrecDocParser
Nested classes inherited from class TrecDocParser: TrecDocParser.ParsePathType

Fields inherited from class TrecDocParser: DEFAULT_PATH_TYPE

| Constructor and Description |
|---|
| TrecGov2Parser() |

| Modifier and Type | Method and Description |
|---|---|
| DocData | parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)<br>Parse the text prepared in docBuf into a result DocData; no synchronization is required. |

Methods inherited from class TrecDocParser: extract, pathType, stripTags, stripTags

public DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType) throws IOException, InterruptedException

Specified by: parse in class TrecDocParser

Parameters:
docData - reusable result
name - name that should be set to the result
trecSrc - calling trec content source
docBuf - text to parse
pathType - type of parsed file, or null if unknown; may be used by parsers to alter their behavior according to the file path type

Throws:
IOException
InterruptedException
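
The contract described above (a caller-supplied reusable result, a name set on that result, a possibly null pathType, and no synchronization) can be illustrated with a minimal, self-contained sketch. It does not use the real library types: `SimpleDocData`, `PathType`, the `GOV2` and `UNKNOWN` constants, and the trimming logic are all hypothetical stand-ins chosen for illustration, and the fallback to a default path type on null is an assumption rather than documented behavior.

```java
import java.util.Objects;

// Illustrative sketch only: every type here is a hypothetical stand-in,
// not the real DocData / TrecDocParser API.
class ParseContractSketch {

    /** Hypothetical stand-in for TrecDocParser.ParsePathType. */
    enum PathType { GOV2, UNKNOWN }

    /** Assumed fallback when the caller passes null for pathType. */
    static final PathType DEFAULT_PATH_TYPE = PathType.GOV2;

    /** Hypothetical stand-in for DocData: a mutable, reusable result holder. */
    static final class SimpleDocData {
        String name;
        String body;
        PathType pathType;

        /** Reset all fields so the instance can be reused for the next document. */
        void clear() {
            name = null;
            body = null;
            pathType = null;
        }
    }

    /**
     * Stateless parse: all working state lives in the arguments, so callers
     * that pass distinct docData and docBuf instances need no locking.
     */
    static SimpleDocData parse(SimpleDocData docData, String name,
                               StringBuilder docBuf, PathType pathType) {
        Objects.requireNonNull(docData, "docData");
        Objects.requireNonNull(docBuf, "docBuf");

        docData.clear();                       // reuse the caller's result object
        docData.name = name;                   // name is set on the result
        docData.pathType = (pathType == null)  // null means "unknown" to the caller
                ? DEFAULT_PATH_TYPE
                : pathType;
        docData.body = docBuf.toString().trim(); // placeholder for real parsing
        return docData;
    }

    public static void main(String[] args) {
        SimpleDocData reusable = new SimpleDocData();
        StringBuilder buf = new StringBuilder("  first document text  ");
        SimpleDocData result = parse(reusable, "doc-1", buf, null);
        System.out.println(result.name + ": " + result.body);
    }
}
```

Because the sketch keeps no fields on the parser itself, one result object per thread can be reused across documents without coordination, which is one plausible reading of the "no synchronization is required" note.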