Class LineDocSource
- java.lang.Object
  - org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
    - org.apache.lucene.benchmark.byTask.feeds.ContentSource
      - org.apache.lucene.benchmark.byTask.feeds.LineDocSource
- All Implemented Interfaces: Closeable, AutoCloseable
public class LineDocSource extends ContentSource
A ContentSource that reads one line at a time as a Document from a single file. Compared with DirContentSource, this saves the IO cost of recursing through a directory and opening a new file for every document.
The expected format of each line is (fields are separated by <TAB>): title, date, body. If a line is read in a different format, a RuntimeException will be thrown. In general, you should use this content source for files that were created with WriteLineDocTask.
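The line format described above can be illustrated with a minimal sketch. This is not the Lucene implementation: the class name `TabLineSketch` and its `parse` method are hypothetical, and the sketch assumes that a line with too few or too many tab-separated fields counts as "a different format".

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: splits one line into title, date, body.
class TabLineSketch {

    /** Returns {title, date, body}; throws RuntimeException if the line is malformed. */
    static String[] parse(String line) {
        // limit -1 keeps trailing empty fields, so an empty body is still a field
        String[] fields = line.split("\t", -1);
        if (fields.length != 3) {
            // Assumption of this sketch: anything other than exactly three fields is rejected.
            throw new RuntimeException(
                "expected title<TAB>date<TAB>body but found " + fields.length + " field(s)");
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] doc = parse("My title\t2007-05-27\tSome body text");
        System.out.println(Arrays.toString(doc));
    }
}
```

Failing fast on a malformed line mirrors the documented behavior of throwing a RuntimeException rather than silently skipping the line.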
Config properties:
- docs.file=<path to the file>
- content.source.encoding - defaults to UTF-8.
- line.parser - defaults to LineDocSource.HeaderLineParser if a header line exists which differs from WriteLineDocTask.DEFAULT_FIELDS, and to LineDocSource.SimpleLineParser otherwise.
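As a rough illustration of how these properties and their defaults might be read, here is a sketch that uses plain `java.util.Properties` instead of the benchmark `Config` class. The class `LineDocConfigSketch`, the sample path, and the choice to reject a missing `docs.file` are assumptions made for the example.

```java
import java.util.Properties;

// Hypothetical sketch: reads the documented property names from java.util.Properties.
class LineDocConfigSketch {

    /** Builds a sample configuration; the path is a made-up placeholder. */
    static Properties sample() {
        Properties props = new Properties();
        props.setProperty("docs.file", "work/docs.lines.txt");
        return props;
    }

    /** Returns the configured encoding, falling back to the documented default UTF-8. */
    static String encodingOf(Properties props) {
        return props.getProperty("content.source.encoding", "UTF-8");
    }

    /** Returns docs.file; this sketch assumes the property is mandatory. */
    static String docsFileOf(Properties props) {
        String file = props.getProperty("docs.file");
        if (file == null) {
            throw new IllegalArgumentException("docs.file must be set");
        }
        return file;
    }

    public static void main(String[] args) {
        Properties props = sample();
        System.out.println(docsFileOf(props) + " read as " + encodingOf(props));
    }
}
```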
Nested Class Summary
- static class LineDocSource.HeaderLineParser: a LineDocSource.LineParser which sets field names and order by the header - any header - of the lines file.
- static class LineDocSource.LineParser: reader of a single input line into DocData.
- static class LineDocSource.SimpleLineParser: a LineDocSource.LineParser which ignores the header passed to its constructor and assumes simply that field names and their order are the same as in WriteLineDocTask.DEFAULT_FIELDS.
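The difference between the two parsers is where the field names and their order come from. The sketch below illustrates the header-driven idea only; it is not the HeaderLineParser code. The class `HeaderMappingSketch`, the returned `Map`, and the sample field names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: assigns each column of a line to the name given by a header.
class HeaderMappingSketch {

    /** Maps header[i] to the i-th tab-separated value of the line, preserving header order. */
    static Map<String, String> parseWithHeader(String[] header, String line) {
        String[] values = line.split("\t", -1);
        if (values.length != header.length) {
            throw new RuntimeException(
                "line has " + values.length + " field(s), header has " + header.length);
        }
        Map<String, String> doc = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            doc.put(header[i], values[i]);
        }
        return doc;
    }

    public static void main(String[] args) {
        // Made-up header: an order that differs from title, date, body.
        String[] header = {"date", "title", "body", "category"};
        System.out.println(parseWithHeader(header, "2007-05-27\tMy title\tSome text\tnews"));
    }
}
```

A parser in the style of SimpleLineParser would instead hard-code the field order and never consult the header.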
Field Summary
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
encoding, forever, logStep, verbose
Constructor Summary
- LineDocSource()
Method Summary
- void close(): Called when reading from this content source is no longer required.
- DocData getNextDocData(DocData docData): Returns the next DocData from the content source.
- void resetInputs(): Resets the input for this content source, so that the test would behave as if it was just started, input-wise.
- void setConfig(Config config): Sets the Config for this content source.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
addBytes, addItem, collectFiles, getBytesCount, getConfig, getItemsCount, getTotalBytesCount, getTotalItemsCount, printStatistics, shouldLog
Method Detail
close
public void close() throws IOException
Description copied from class: ContentItemsSource
Called when reading from this content source is no longer required.
- Specified by: close in interface AutoCloseable
- Specified by: close in interface Closeable
- Specified by: close in class ContentItemsSource
- Throws: IOException
getNextDocData
public DocData getNextDocData(DocData docData) throws NoMoreDataException, IOException
Description copied from class: ContentSource
Returns the next DocData from the content source. Implementations must account for multi-threading, as multiple threads can call this method simultaneously.
- Specified by: getNextDocData in class ContentSource
- Throws: NoMoreDataException, IOException
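One common way to meet the multi-threading requirement is to serialize access to the shared reader. The sketch below shows that pattern with standard library classes only; it does not reproduce how LineDocSource does it. The class `SharedLineReaderSketch` and its methods are hypothetical, and end of input is signaled with `null` here rather than with NoMoreDataException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: several threads pull lines from one shared reader.
class SharedLineReaderSketch {
    private final BufferedReader reader;
    private int linesRead = 0;

    SharedLineReaderSketch(BufferedReader reader) {
        this.reader = reader;
    }

    /** Convenience factory so the example needs no file on disk. */
    static SharedLineReaderSketch fromString(String text) {
        return new SharedLineReaderSketch(new BufferedReader(new StringReader(text)));
    }

    /** Returns the next line, or null when the input is exhausted. */
    String nextLine() {
        // Reading and counting happen under one lock, so each line is handed out once.
        synchronized (this) {
            try {
                String line = reader.readLine();
                if (line != null) {
                    linesRead++;
                }
                return line;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    synchronized int linesRead() {
        return linesRead;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedLineReaderSketch source = fromString("doc1\ndoc2\ndoc3\ndoc4\n");
        Runnable worker = () -> {
            while (source.nextLine() != null) {
                // a real consumer would parse the line outside the lock
            }
        };
        Thread first = new Thread(worker);
        Thread second = new Thread(worker);
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println("lines read: " + source.linesRead());
    }
}
```

Keeping only the read inside the lock and doing the parsing outside it limits contention between threads.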
resetInputs
public void resetInputs() throws IOException
Description copied from class: ContentItemsSource
Resets the input for this content source, so that the test would behave as if it was just started, input-wise.
NOTE: the default implementation resets the number of bytes and items generated since the last reset, so it is important to call super.resetInputs if you override this method.
- Overrides: resetInputs in class ContentItemsSource
- Throws: IOException
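The note about calling super.resetInputs can be illustrated with a small stand-in hierarchy. The classes below are hypothetical and only mimic the described behavior (a base class that tracks bytes and items since the last reset); they are not the Lucene classes.

```java
// Hypothetical sketch of the "call super.resetInputs" contract.
class ResetInputsSketch {

    /** Stand-in for a base class that counts bytes and items since the last reset. */
    static class CountingSource {
        protected long bytesCount;
        protected int itemsCount;

        void addItem(long numBytes) {
            itemsCount++;
            bytesCount += numBytes;
        }

        void resetInputs() {
            bytesCount = 0;
            itemsCount = 0;
        }
    }

    /** Stand-in subclass that also has to rewind its own read position. */
    static class LineSource extends CountingSource {
        private final String[] lines;
        private int position;

        LineSource(String... lines) {
            this.lines = lines;
        }

        String next() {
            String line = lines[position++];
            addItem(line.length());
            return line;
        }

        @Override
        void resetInputs() {
            super.resetInputs(); // without this call the counters would keep stale values
            position = 0;        // subclass-specific state: start again from the first line
        }
    }

    public static void main(String[] args) {
        LineSource source = new LineSource("first", "second");
        source.next();
        source.next();
        source.resetInputs();
        System.out.println(source.itemsCount + " items, next line: " + source.next());
    }
}
```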
setConfig
public void setConfig(Config config)
Description copied from class: ContentItemsSource
Sets the Config for this content source. If you override this method, you must call super.setConfig.
- Overrides: setConfig in class ContentItemsSource