Package org.apache.lucene.benchmark.byTask.feeds
Sources for benchmark inputs: documents and queries.

Interface Summary
Interface: Description
HTMLParser: HTML parsing interface for test purposes.
QueryMaker: Create queries for the test.
SpatialDocMaker.ShapeConverter: Converts one shape to another.

Class Summary
Class: Description
AbstractQueryMaker: Abstract base query maker.
ContentItemsSource: Base class for source of data for benchmarking.
ContentSource: Represents content from a specified source, such as TREC, Reuters etc.
DemoHTMLParser: Simple HTML Parser extracting title, meta tags, and body text that is based on NekoHTML.
DemoHTMLParser.Parser: The actual parser to read HTML documents.
DirContentSource: A ContentSource using the Dir collection for its input.
DirContentSource.Iterator: Iterator over the files in the directory.
DocData: Output of parsing (e.g.
DocMaker: Creates Document objects.
DocMaker.DocState: Document state, supports reuse of field instances across documents (see reuseFields parameter).
EnwikiContentSource: A ContentSource which reads the English Wikipedia dump.
EnwikiQueryMaker: A QueryMaker that uses common and uncommon actual Wikipedia queries for searching the English Wikipedia collection.
FacetSource: Source items for facets.
FileBasedQueryMaker: Create queries from a FileReader.
GeonamesLineParser: A line parser for Geonames.org data.
LineDocSource: A ContentSource reading one line at a time as a Document from a single file.
LineDocSource.HeaderLineParser: LineDocSource.LineParser which sets field names and order by the header - any header - of the lines file.
LineDocSource.LineParser: Reader of a single input line into DocData.
LineDocSource.SimpleLineParser: LineDocSource.LineParser which ignores the header passed to its constructor and assumes simply that field names and their order are the same as in WriteLineDocTask.DEFAULT_FIELDS.
LongToEnglishContentSource: Creates documents whose content is a long number starting from Long.MIN_VALUE + 10.
LongToEnglishQueryMaker: Creates queries whose content is a spelled-out long number starting from Long.MIN_VALUE + 10.
RandomFacetSource: Simple implementation of a random facet source.
ReutersContentSource: A ContentSource reading from the Reuters collection.
ReutersQueryMaker: A QueryMaker that makes queries devised manually (by Grant Ingersoll) for searching in the Reuters collection.
SimpleQueryMaker: A QueryMaker that makes queries for a collection created using SingleDocSource.
SimpleSloppyPhraseQueryMaker: Create sloppy phrase queries for performance test, in an index created using simple doc maker.
SingleDocSource: Creates the same document each time SingleDocSource.getNextDocData(DocData) is called.
SortableSingleDocSource: Adds fields appropriate for sorting: country, random_string and sort_field (int).
SpatialDocMaker: Indexes spatial data according to a configured SpatialStrategy with optional shape transformation via a configured SpatialDocMaker.ShapeConverter.
SpatialFileQueryMaker: Reads spatial data from the body field docs from an internally created LineDocSource.
TrecContentSource: Implements a ContentSource over the TREC collection.
TrecDocParser: Parser for trec doc content, invoked on doc text excluding <DOC> and <DOCNO> which are handled in TrecContentSource.
TrecFBISParser: Parser for the FBIS docs in trec disks 4+5 collection format.
TrecFR94Parser: Parser for the FR94 docs in trec disks 4+5 collection format.
TrecFTParser: Parser for the FT docs in trec disks 4+5 collection format.
TrecGov2Parser: Parser for the GOV2 collection format.
TrecLATimesParser: Parser for the LA Times docs in trec disks 4+5 collection format.
TrecParserByPath: Parser for trec docs which selects the parser to apply according to the source file's path, defaulting to TrecGov2Parser.
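
The header-driven parsing described for LineDocSource.HeaderLineParser can be illustrated with a small stand-alone sketch. This is not the Lucene class: HeaderLineSketch, its methods, and the tab separator are all assumptions made for illustration. It only shows the idea that a header line fixes the field names and their order, and each later line is split into values for those fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; NOT Lucene's LineDocSource.HeaderLineParser.
final class HeaderLineSketch {
    // Assumption: fields on a line are separated by a tab character.
    private static final String SEP = "\t";

    private final String[] fieldNames;

    // The header line fixes both the names and the order of the fields.
    HeaderLineSketch(String headerLine) {
        this.fieldNames = headerLine.split(SEP, -1);
    }

    // Maps each value on the line to the field name at the same position.
    Map<String, String> parseLine(String line) {
        String[] values = line.split(SEP, -1);
        if (values.length != fieldNames.length) {
            throw new IllegalArgumentException(
                "expected " + fieldNames.length + " fields but found " + values.length);
        }
        Map<String, String> doc = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.length; i++) {
            doc.put(fieldNames[i], values[i]);
        }
        return doc;
    }
}
```

A parser that ignores the header, as SimpleLineParser is described to do, would behave like this sketch constructed with a fixed, built-in list of field names.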

Enum Summary
Enum: Description
TrecDocParser.ParsePathType: Types of trec parse paths.
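
As a rough illustration of selecting a parse type from a source file's path with a default, the sketch below walks the path segments from the innermost outward and returns the first one that names a known type. PathTypeSketch, PathType, its constants, and fromPath are hypothetical names chosen for this example; they are not the members of TrecDocParser.ParsePathType, and the matching rule is an assumption.

```java
import java.util.Locale;

// Hypothetical sketch of path-based selection; NOT Lucene's TrecParserByPath.
final class PathTypeSketch {
    // Assumed set of collection types, named after the parsers listed above.
    enum PathType { GOV2, FBIS, FT, FR94, LATIMES }

    // Returns the type whose lowercase name equals a path segment,
    // checking the innermost segment first; otherwise returns the fallback.
    static PathType fromPath(String path, PathType fallback) {
        String[] segments = path.toLowerCase(Locale.ROOT).split("[/\\\\]");
        for (int i = segments.length - 1; i >= 0; i--) {
            for (PathType type : PathType.values()) {
                if (segments[i].equals(type.name().toLowerCase(Locale.ROOT))) {
                    return type;
                }
            }
        }
        return fallback;
    }
}
```

Passing PathType.GOV2 as the fallback mirrors the documented default of TrecGov2Parser.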

Exception Summary
Exception: Description
NoMoreDataException: Exception indicating there is no more data.
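
One common way to use an "out of data" exception is as the normal termination signal of a read loop. The following is a minimal sketch of that pattern under stated assumptions: OutOfDataException, ListSource, and drain are hypothetical stand-ins defined here, not the Lucene NoMoreDataException or ContentSource types.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of exhaustion signalling; not Lucene code.
final class ExhaustionSketch {
    // Stand-in for an exception indicating there is no more data.
    static final class OutOfDataException extends Exception {
        OutOfDataException() {
            super("no more data");
        }
    }

    // A toy source that hands out items from a fixed list, then signals exhaustion.
    static final class ListSource {
        private final Iterator<String> items;

        ListSource(List<String> docs) {
            this.items = docs.iterator();
        }

        String next() throws OutOfDataException {
            if (!items.hasNext()) {
                throw new OutOfDataException();
            }
            return items.next();
        }
    }

    // Reads until the source reports exhaustion and returns everything read.
    static List<String> drain(ListSource source) {
        List<String> seen = new ArrayList<>();
        try {
            while (true) {
                seen.add(source.next());
            }
        } catch (OutOfDataException e) {
            // Exhaustion is the expected way for this loop to end.
        }
        return seen;
    }
}
```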