org.apache.lucene.benchmark.byTask.feeds
Class EnwikiContentSource

java.lang.Object
  extended by org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
      extended by org.apache.lucene.benchmark.byTask.feeds.ContentSource
          extended by org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource
All Implemented Interfaces:
Closeable

public class EnwikiContentSource
extends ContentSource

A ContentSource that reads the English Wikipedia dump. The .bz2 file can be read directly; it is decompressed on the fly. Config properties:

keep.image.only.docs=false|true (default true)
docs.file=<path to the file>
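
As a rough illustration of how such properties might be assembled in code, the sketch below builds a plain java.util.Properties object. The key names docs.file, keep.image.only.docs and content.source are assumptions drawn from the Lucene benchmark conventions rather than from this page, and the class name EnwikiConfigSketch and the dump path are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper, not part of Lucene: collects the settings a benchmark
// run might hand to EnwikiContentSource.
class EnwikiConfigSketch {

    // The property keys below are assumptions; check them against the
    // benchmark documentation for the Lucene version in use.
    static Properties enwikiProperties(String dumpPath, boolean keepImageOnlyDocs) {
        Properties props = new Properties();
        props.setProperty("content.source",
                "org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource");
        props.setProperty("docs.file", dumpPath);
        props.setProperty("keep.image.only.docs", Boolean.toString(keepImageOnlyDocs));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder path; point it at a real dump file before running a benchmark.
        Properties props = enwikiProperties("/path/to/enwiki-dump.xml.bz2", false);
        props.list(System.out);
    }
}
```

In an actual run these values would presumably reach the content source through the Config object passed to setConfig, or through the equivalent lines of an algorithm file.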


Field Summary
 
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
encoding, forever, logStep, verbose
 
Constructor Summary
EnwikiContentSource()
           
 
Method Summary
 void close()
          Called when reading from this content source is no longer required.
 DocData getNextDocData(DocData docData)
          Returns the next DocData from the content source.
 void resetInputs()
          Resets the input for this content source, so that the test behaves as if it had just started, input-wise.
 void setConfig(Config config)
          Sets the Config for this content source.
 
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
addBytes, addItem, collectFiles, getBytesCount, getConfig, getItemsCount, getTotalBytesCount, getTotalItemsCount, printStatistics, shouldLog
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EnwikiContentSource

public EnwikiContentSource()

Method Detail

close

public void close()
           throws IOException
Description copied from class: ContentItemsSource
Called when reading from this content source is no longer required.

Specified by:
close in interface Closeable
Specified by:
close in class ContentItemsSource
Throws:
IOException

getNextDocData

public DocData getNextDocData(DocData docData)
                       throws NoMoreDataException,
                              IOException
Description copied from class: ContentSource
Returns the next DocData from the content source. Implementations must account for multi-threading, as multiple threads can call this method simultaneously.

Specified by:
getNextDocData in class ContentSource
Throws:
NoMoreDataException
IOException
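
The multi-threading requirement can be illustrated with a minimal sketch. It uses stand-in types rather than the real ContentSource, DocData and NoMoreDataException classes, and it is not how EnwikiContentSource itself is implemented. All names here (SynchronizedSourceSketch, SketchNoMoreDataException, getNext, drainConcurrently) are hypothetical. The simplest way to meet the requirement is to let only one thread at a time advance the shared input.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the benchmark's NoMoreDataException (hypothetical).
class SketchNoMoreDataException extends Exception {
}

// Stand-in for a content source whose items are plain strings.
class SynchronizedSourceSketch {
    private final Iterator<String> docs;

    SynchronizedSourceSketch(List<String> titles) {
        this.docs = titles.iterator();
    }

    // synchronized: only one thread at a time may advance the shared iterator,
    // so no item is handed out twice and none is skipped.
    synchronized String getNext() throws SketchNoMoreDataException {
        if (!docs.hasNext()) {
            throw new SketchNoMoreDataException();
        }
        return docs.next();
    }

    // Drains the source from several threads and returns how many items were served.
    static int drainConcurrently(List<String> titles, int threadCount) {
        final SynchronizedSourceSketch source = new SynchronizedSourceSketch(titles);
        final AtomicInteger served = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        source.getNext();
                        served.incrementAndGet();
                    }
                } catch (SketchNoMoreDataException e) {
                    // Source exhausted: this worker stops.
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return served.get();
    }
}
```

A single lock is the easiest design to reason about. A source that does expensive parsing may prefer to hand items over from a dedicated reader thread, at the cost of more coordination code.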

resetInputs

public void resetInputs()
                 throws IOException
Description copied from class: ContentItemsSource
Resets the input for this content source, so that the test behaves as if it had just started, input-wise.

NOTE: the default implementation resets the number of bytes and items generated since the last reset, so if you override this method you must call super.resetInputs.

Overrides:
resetInputs in class ContentItemsSource
Throws:
IOException
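
The reason for calling the superclass method can be shown with a small self-contained sketch. The classes BaseSourceSketch and FileSourceSketch are hypothetical stand-ins, not the real ContentItemsSource API. They only mimic the idea that the base class owns the per-round byte and item counters while the subclass owns its own read position.

```java
// Hypothetical stand-in for a base class that tracks per-round statistics.
class BaseSourceSketch {
    private long bytesCount;
    private int itemsCount;

    void addItem(long bytes) {
        itemsCount++;
        bytesCount += bytes;
    }

    int getItemsCount() {
        return itemsCount;
    }

    long getBytesCount() {
        return bytesCount;
    }

    // Default reset: clears the counters accumulated since the last reset.
    void resetInputs() {
        bytesCount = 0;
        itemsCount = 0;
    }
}

// Hypothetical subclass with extra state of its own.
class FileSourceSketch extends BaseSourceSketch {
    private int position;

    void read(long bytes) {
        position++;
        addItem(bytes);
    }

    int position() {
        return position;
    }

    @Override
    void resetInputs() {
        super.resetInputs(); // without this call the counters would keep growing across rounds
        position = 0;        // subclass-specific state is reset here
    }
}
```

If the override skipped the super call, position would return to zero but the item and byte counts would carry over into the next round and skew the reported statistics.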

setConfig

public void setConfig(Config config)
Description copied from class: ContentItemsSource
Sets the Config for this content source. If you override this method, you must call super.setConfig.

Overrides:
setConfig in class ContentItemsSource


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.