org.apache.lucene.benchmark.byTask.feeds
Class ContentSource

java.lang.Object
  extended by org.apache.lucene.benchmark.byTask.feeds.ContentSource
Direct Known Subclasses:
DirContentSource, EnwikiContentSource, LineDocSource, LongToEnglishContentSource, ReutersContentSource, SingleDocSource, TrecContentSource

public abstract class ContentSource
extends Object

Represents content from a specified source, such as TREC or Reuters. A ContentSource is responsible for creating the DocData objects for its documents, which DocMaker then consumes. It also keeps track of statistics such as the number of documents generated and their size in bytes.

Supports the following configuration parameters:


Field Summary
protected  String encoding
           
protected  boolean forever
           
protected  int logStep
           
protected  boolean verbose
           
 
Constructor Summary
ContentSource()
           
 
Method Summary
protected  void addBytes(long numBytes)
          Updates the count of bytes generated by this source.
protected  void addDoc()
          Updates the count of documents generated by this source.
abstract  void close()
          Called when reading from this content source is no longer required.
protected  void collectFiles(File dir, ArrayList<File> files)
          A convenience method for collecting all the files of a content source from a given directory.
 long getBytesCount()
          Returns the number of bytes generated since the last reset.
 Config getConfig()
           
 int getDocsCount()
          Returns the number of documents generated since the last reset.
abstract  DocData getNextDocData(DocData docData)
          Returns the next DocData from the content source.
 long getTotalBytesCount()
          Returns the total number of bytes that were generated by this source.
 int getTotalDocsCount()
          Returns the total number of generated documents.
 void resetInputs()
          Resets the input for this content source, so that the test behaves as if it had just started, as far as input is concerned.
 void setConfig(Config config)
          Sets the Config for this content source.
protected  boolean shouldLog()
          Returns true if it is time to log a message (depending on verbose and the number of documents generated).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

forever

protected boolean forever

logStep

protected int logStep

verbose

protected boolean verbose

encoding

protected String encoding
Constructor Detail

ContentSource

public ContentSource()
Method Detail

addBytes

protected final void addBytes(long numBytes)
Updates the count of bytes generated by this source.


addDoc

protected final void addDoc()
Updates the count of documents generated by this source.


collectFiles

protected final void collectFiles(File dir,
                                  ArrayList<File> files)
A convenience method for collecting all the files of a content source from a given directory. The collected File instances are stored in the given files.
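
The traversal itself is not specified here. As a minimal sketch, assuming the helper recurses into subdirectories and adds each readable regular file in name-sorted order, a standalone equivalent might look like the following. FileCollector is a hypothetical name used only for illustration; it is not part of the benchmark package.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical standalone helper; it mirrors the documented signature only.
class FileCollector {
  // Adds every readable regular file under dir to files, depth first.
  static void collectFiles(File dir, ArrayList<File> files) {
    if (!dir.canRead()) {
      return;
    }
    File[] children = dir.listFiles();
    if (children == null) {
      return; // dir is not a directory, or an I/O error occurred
    }
    Arrays.sort(children); // assumption: deterministic, path-sorted order
    for (File child : children) {
      if (child.isDirectory()) {
        collectFiles(child, files);
      } else if (child.canRead()) {
        files.add(child);
      }
    }
  }
}
```

The caller supplies the list, so one list can accumulate files from several directories.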


shouldLog

protected final boolean shouldLog()
Returns true if it is time to log a message (depending on verbose and the number of documents generated).
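
The exact condition is not spelled out here. One plausible reading, assuming logStep is the number of documents between log messages and a non-positive logStep disables logging, is sketched below. LogGate is a hypothetical stand-in that models only the fields involved; it is not the actual implementation.

```java
// Hypothetical stand-in that models only the state involved in logging.
class LogGate {
  private final boolean verbose;
  private final int logStep;
  private int docsCount;

  LogGate(boolean verbose, int logStep) {
    this.verbose = verbose;
    this.logStep = logStep;
  }

  // Counts one generated document.
  void addDoc() {
    docsCount++;
  }

  // Assumption: log only when verbose is on, logStep is positive,
  // and the document count is a multiple of logStep.
  boolean shouldLog() {
    return verbose && logStep > 0 && docsCount % logStep == 0;
  }
}
```

Under this reading, a subclass would call addDoc() for each document and then print a progress message only when shouldLog() returns true.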


close

public abstract void close()
                    throws IOException
Called when reading from this content source is no longer required.

Throws:
IOException

getBytesCount

public final long getBytesCount()
Returns the number of bytes generated since the last reset.


getDocsCount

public final int getDocsCount()
Returns the number of documents generated since the last reset.


getConfig

public final Config getConfig()

getNextDocData

public abstract DocData getNextDocData(DocData docData)
                                throws NoMoreDataException,
                                       IOException
Returns the next DocData from the content source.

Throws:
NoMoreDataException
IOException
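
A caller typically pulls documents in a loop until the source is exhausted. The sketch below illustrates that pattern under two assumptions: that NoMoreDataException marks the normal end of a finite source, and that the DocData passed in may be reused for the returned document. So that it compiles without the benchmark jar, it uses hypothetical stand-ins (NoMoreData, Doc, ListSource, Drain) instead of the real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for NoMoreDataException.
class NoMoreData extends Exception {}

// Hypothetical stand-in for DocData, holding only a body.
class Doc {
  String body;
}

// Hypothetical finite source backed by an in-memory array.
class ListSource {
  private final String[] bodies;
  private int next;

  ListSource(String... bodies) {
    this.bodies = bodies;
  }

  // Fills and returns the caller's Doc, or signals exhaustion.
  Doc getNextDoc(Doc reuse) throws NoMoreData {
    if (next >= bodies.length) {
      throw new NoMoreData();
    }
    reuse.body = bodies[next++];
    return reuse;
  }
}

class Drain {
  // Reads every document body until the source reports no more data.
  static List<String> readAll(ListSource source) {
    List<String> out = new ArrayList<>();
    Doc doc = new Doc();
    try {
      while (true) {
        doc = source.getNextDoc(doc);
        out.add(doc.body);
      }
    } catch (NoMoreData e) {
      // Normal end of input for a finite source.
    }
    return out;
  }
}
```

Reusing one Doc instance avoids an allocation per document, which is why the body is copied out before the next call.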

getTotalBytesCount

public final long getTotalBytesCount()
Returns the total number of bytes that were generated by this source.


getTotalDocsCount

public final int getTotalDocsCount()
Returns the total number of generated documents.


resetInputs

public void resetInputs()
                 throws IOException
Resets the input for this content source, so that the test behaves as if it had just started, as far as input is concerned.

NOTE: the default implementation resets the number of bytes and documents generated since the last reset, so if you override this method it is important to call super.resetInputs().

Throws:
IOException
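
The relationship between the two sets of counters, and the reason an override of resetInputs should delegate to the superclass, can be illustrated with a small sketch. CountingSource and FileBackedSource are hypothetical stand-ins that model only the counting behaviour described on this page, assuming the total counters are never cleared by a reset.

```java
// Hypothetical stand-in for the counting behaviour of the base class.
class CountingSource {
  private long bytesCount;
  private long totalBytesCount;
  private int docsCount;
  private int totalDocsCount;

  protected final void addBytes(long numBytes) {
    bytesCount += numBytes;
    totalBytesCount += numBytes;
  }

  protected final void addDoc() {
    docsCount++;
    totalDocsCount++;
  }

  public final long getBytesCount() { return bytesCount; }
  public final int getDocsCount() { return docsCount; }
  public final long getTotalBytesCount() { return totalBytesCount; }
  public final int getTotalDocsCount() { return totalDocsCount; }

  // Clears only the since-last-reset counters; totals are kept.
  public void resetInputs() {
    bytesCount = 0;
    docsCount = 0;
  }
}

// Hypothetical subclass that has its own input state to reset.
class FileBackedSource extends CountingSource {
  int reopenCount;

  // Simulates producing one document of the given size.
  void emit(long numBytes) {
    addDoc();
    addBytes(numBytes);
  }

  @Override
  public void resetInputs() {
    super.resetInputs(); // keeps the since-last-reset counters accurate
    reopenCount++;       // placeholder for reopening the underlying input
  }
}
```

If the override skipped the super call, getDocsCount() and getBytesCount() would keep growing across resets and become indistinguishable from the totals.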

setConfig

public void setConfig(Config config)
Sets the Config for this content source. If you override this method, you must call super.setConfig.



Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.