org.apache.lucene.benchmark.byTask.feeds
Class TrecFR94Parser

java.lang.Object
  extended by org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
      extended by org.apache.lucene.benchmark.byTask.feeds.TrecFR94Parser

public class TrecFR94Parser
extends TrecDocParser

Parser for the FR94 documents in the TREC disks 4+5 collection format.


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
 
Field Summary
 
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
 
Constructor Summary
TrecFR94Parser()
           
 
Method Summary
 DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
          Parses the text prepared in docBuf into a result DocData; no synchronization is required.
 
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TrecFR94Parser

public TrecFR94Parser()

Method Detail

parse

public DocData parse(DocData docData,
                     String name,
                     TrecContentSource trecSrc,
                     StringBuilder docBuf,
                     TrecDocParser.ParsePathType pathType)
              throws IOException,
                     InterruptedException
Description copied from class: TrecDocParser
Parses the text prepared in docBuf into a result DocData; no synchronization is required.

Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name to set on the result
trecSrc - calling trec content source
docBuf - text to parse
pathType - type of the parsed file, or null if unknown; parsers may use it to alter their behavior according to the file path type.
Throws:
IOException
InterruptedException
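
Example:
The sketch below illustrates, under stated assumptions, the kind of work a parser for this format performs on docBuf: locate the <TEXT> element, then strip SGML comments and tags from its content. It is not the Lucene implementation and does not use any Lucene classes. The class Fr94ParseSketch, the method extractBody, and the sample document are all hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical, simplified stand-in for illustration; not part of Lucene. */
public class Fr94ParseSketch {
    private static final String TEXT = "<TEXT>";
    private static final String TEXT_END = "</TEXT>";

    // Matches SGML comments first, then any remaining tag.
    private static final Pattern MARKUP =
            Pattern.compile("<!--.*?-->|<[^>]*>", Pattern.DOTALL);

    /**
     * Returns the content between <TEXT> and </TEXT> with markup removed and
     * whitespace collapsed. If no <TEXT> element is present, the whole buffer
     * is treated as the body (an assumption of this sketch).
     */
    public static String extractBody(StringBuilder docBuf) {
        int start = docBuf.indexOf(TEXT);
        int from = start >= 0 ? start + TEXT.length() : 0;
        int end = docBuf.indexOf(TEXT_END, from);
        int to = end >= 0 ? end : docBuf.length();

        String raw = docBuf.substring(from, to);
        String withoutMarkup = MARKUP.matcher(raw).replaceAll(" ");
        return withoutMarkup.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Invented sample text, loosely shaped like an SGML-tagged record.
        StringBuilder docBuf = new StringBuilder(
                "<DOCNO> SAMPLE-0001 </DOCNO>\n"
              + "<TEXT>\n"
              + "<!-- sample comment -->\n"
              + "<AGENCY>Department of Agriculture</AGENCY>\n"
              + "Final rule.\n"
              + "</TEXT>\n");
        System.out.println(extractBody(docBuf));
    }
}
```

Because the sketch keeps all working state in local variables, concurrent calls on different buffers need no locking, which is the same property the method description states for parse.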


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.