Class WriteLineDocTask
- java.lang.Object
  - org.apache.lucene.benchmark.byTask.tasks.PerfTask
    - org.apache.lucene.benchmark.byTask.tasks.WriteLineDocTask
- All Implemented Interfaces: Cloneable
- Direct Known Subclasses: WriteEnwikiLineDocTask
public class WriteLineDocTask extends PerfTask
A task which writes documents, one line per document. Each line has the following format: title <TAB> date <TAB> body. The output of this task can be consumed by LineDocSource and is intended to save the I/O overhead of opening one file per document to be indexed.
The format of the output is determined by the output file extension. Compression is recommended when the output file is expected to be large. For details on file extensions, see StreamUtils.Type.
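The line layout described above can be illustrated with a minimal, JDK-only sketch. The class LineDocFormatSketch and its methods are hypothetical helpers and not part of Lucene. Replacing embedded tabs and line breaks with spaces is an assumption made here so that each document stays on one line; it is not a statement about how WriteLineDocTask itself sanitizes values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates the line layout only.
final class LineDocFormatSketch {
    // Assumed to correspond to the <TAB> separator in the documented format.
    static final char SEP = '\t';

    /** Joins title, date and body into one tab-separated line. */
    static String formatLine(String title, String date, String body) {
        return sanitize(title) + SEP + sanitize(date) + SEP + sanitize(body);
    }

    /** Assumption: tabs and line breaks inside a value are replaced by spaces. */
    static String sanitize(String value) {
        if (value == null) {
            return "";
        }
        return value.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ');
    }

    /** Splits a line back into its fields; limit -1 keeps trailing empty fields. */
    static List<String> parseLine(String line) {
        return Arrays.asList(line.split("\t", -1));
    }

    public static void main(String[] args) {
        String line = formatLine("My title", "2020-01-01", "Body\twith a tab");
        System.out.println(parseLine(line));
    }
}
```

Because every document occupies exactly one line, a reader can stream the file sequentially instead of opening one file per document.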
Supports the following parameters:
- line.file.out - the name of the file to write the output to. This parameter is mandatory. NOTE: the file is re-created.
- line.fields - which fields should be written in each line (optional, default: DEFAULT_FIELDS).
- sufficient.fields - a comma-separated list of field names; a document is skipped only if all of these fields are missing. For example, to require that at least one of f1,f2 is not empty, specify "f1,f2" for this parameter. To specify that no field is required, i.e. that even empty docs should be emitted, specify "," (optional, default: DEFAULT_SUFFICIENT_FIELDS).
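The sufficient.fields rule can be sketched in plain Java as follows. SufficientFieldsSketch, parse and shouldSkip are hypothetical names introduced for illustration, and a document is modelled as a simple Map of field name to value. This is an interpretation of the parameter description above, not the Lucene implementation.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: models the sufficient.fields rule.
final class SufficientFieldsSketch {

    /** Parses the parameter value; "," yields an empty set, meaning no field is required. */
    static Set<String> parse(String param) {
        Set<String> names = new HashSet<>();
        if (",".equals(param)) {
            return names;
        }
        for (String name : param.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    /** Returns true when fields are required and every one of them is missing or empty. */
    static boolean shouldSkip(Map<String, String> doc, Set<String> sufficient) {
        if (sufficient.isEmpty()) {
            return false; // no requirement: even empty docs are emitted
        }
        for (String name : sufficient) {
            String value = doc.get(name);
            if (value != null && !value.isEmpty()) {
                return false; // at least one sufficient field is present
            }
        }
        return true;
    }
}
```

With "f1,f2", a document that has a non-empty f2 but no f1 is kept, while a document with neither is skipped.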
Field Summary
Fields (modifier and type, field, description):
- static String[] DEFAULT_FIELDS: Fields to be written by default.
- static String DEFAULT_SUFFICIENT_FIELDS: Default fields, at least one of which is required for the doc not to be skipped.
- static String FIELDS_HEADER_INDICATOR
- protected String fname
- static char SEP
Constructor Summary
Constructors:
- WriteLineDocTask(PerfRunData runData)
Method Summary
Methods (modifier and type, method, description):
- void close()
- int doLogic(): Performs the task once (ignoring the repetitions specification). Returns the number of work items done by this task.
- protected String getLogMessage(int recsCount)
- protected PrintWriter lineFileOut(Document doc): Selects the output line file by the written doc.
- void setParams(String params): Set the params (docSize only).
- boolean supportsParams(): Subclasses that support parameters must override this method to return true.
- protected void writeHeader(PrintWriter out): Writes a header to the lines file, indicating how to read the file later.
Methods inherited from class org.apache.lucene.benchmark.byTask.tasks.PerfTask
clone, getAlgLineNum, getBackgroundDeltaPriority, getDepth, getName, getParams, getRunData, getRunInBackground, isDisableCounting, runAndMaybeStats, setAlgLineNum, setDepth, setDisableCounting, setName, setRunInBackground, setup, shouldNeverLogAtStart, shouldNotRecordStats, stopNow, tearDown, toString
Field Detail
FIELDS_HEADER_INDICATOR
public static final String FIELDS_HEADER_INDICATOR
- See Also: Constant Field Values

SEP
public static final char SEP
- See Also: Constant Field Values

DEFAULT_FIELDS
public static final String[] DEFAULT_FIELDS
Fields to be written by default.

DEFAULT_SUFFICIENT_FIELDS
public static final String DEFAULT_SUFFICIENT_FIELDS
Default fields, at least one of which is required for the doc not to be skipped.
- See Also: Constant Field Values

fname
protected final String fname
Constructor Detail
WriteLineDocTask
public WriteLineDocTask(PerfRunData runData) throws Exception
- Throws: Exception
Method Detail
writeHeader
protected void writeHeader(PrintWriter out)
Writes a header to the lines file, indicating how to read the file later.

getLogMessage
protected String getLogMessage(int recsCount)
- Overrides: getLogMessage in class PerfTask

doLogic
public int doLogic() throws Exception
Description copied from class: PerfTask
Performs the task once (ignoring the repetitions specification). Returns the number of work items done by this task. For indexing, that can be the number of docs added; for warming, the number of scanned items, etc.

lineFileOut
protected PrintWriter lineFileOut(Document doc)
Selects the output line file by the written doc. Default: the original output line file.

setParams
public void setParams(String params)
Set the params (docSize only).

supportsParams
public boolean supportsParams()
Description copied from class: PerfTask
Subclasses that support parameters must override this method to return true.
- Overrides: supportsParams in class PerfTask
- Returns: true if and only if this task supports command-line params.