Class XPathRecordReader
- java.lang.Object
  - org.apache.solr.handler.dataimport.XPathRecordReader
public class XPathRecordReader extends Object
A streaming XPath parser that uses StAX for XML parsing. It supports only a subset of XPath syntax:
/a/b/subject[@qualifier='fullTitle']
/a/b/subject[@qualifier=]/subtag
/a/b/subject/@qualifier
//a
//a/b...
/a//b
/a//b...
/a/b/c
A record is a Map<String,Object>. The key is the provided field name and the value is a String or a List<String>.
This class is thread-safe for parsing XML, but adding fields is not thread-safe. The recommended usage is to call addField() in one thread and then share the instance across threads.
This API is experimental and may change in the future.
- Since:
- solr 1.3
Nested Class Summary
Nested Classes
- static interface XPathRecordReader.Handler
  Implement this interface to stream records as and when one is found.
Field Summary
Fields
- static int FLATTEN
  The FLATTEN flag indicates that all text and CDATA under a specific tag should be recursively fetched and appended to the current Node's value.
Constructor Summary
Constructors
- XPathRecordReader(String forEachXpath)
  A constructor called with a '|'-separated list of XPath expressions that define the subsections of the XML stream to be emitted as separate records.
Method Summary
Methods
- XPathRecordReader addField(String name, String xpath, boolean multiValued)
  A wrapper around addField0 to create a series of Nodes based on the supplied XPath and a given fieldName.
- XPathRecordReader addField(String name, String xpath, boolean multiValued, int flags)
  A wrapper around addField0 to create a series of Nodes based on the supplied XPath and a given fieldName.
- List<Map<String,Object>> getAllRecords(Reader r)
  Uses streamRecords to parse the XML source, but with a handler that collects all the emitted records into a single List, which is returned upon completion.
- void streamRecords(Reader r, XPathRecordReader.Handler handler)
  Creates an XML stream reader on top of whatever reader has been configured.
Field Detail
FLATTEN
public static final int FLATTEN
The FLATTEN flag indicates that all text and CDATA under a specific tag should be recursively fetched and appended to the current Node's value.
- See Also:
- Constant Field Values
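The recursive collection that FLATTEN describes can be pictured with plain StAX. The following is a minimal sketch and not the code that XPathRecordReader runs: the class FlattenSketch and its methods are hypothetical names, and it assumes the text pieces are concatenated in document order with no separator.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of Solr.
class FlattenSketch {

    // Returns all text and CDATA found beneath the first element named `tag`,
    // or null if no such element exists.
    static String flatten(String xml, String tag) {
        try {
            XMLStreamReader in = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (in.hasNext()) {
                if (in.next() == XMLStreamConstants.START_ELEMENT
                        && in.getLocalName().equals(tag)) {
                    StringBuilder value = new StringBuilder();
                    collect(in, value);
                    return value.toString();
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    // Expects the reader to sit on a START_ELEMENT and consumes events up to
    // the matching END_ELEMENT, descending into every child element.
    private static void collect(XMLStreamReader in, StringBuilder value)
            throws XMLStreamException {
        while (in.hasNext()) {
            int event = in.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                collect(in, value);
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA) {
                value.append(in.getText());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                return;
            }
        }
    }
}
```

Under these assumptions, flattening the p element of <p>Hello <b>big <i>wide</i></b> world</p> gives the single string "Hello big wide world".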
Constructor Detail
XPathRecordReader
public XPathRecordReader(String forEachXpath)
A constructor called with a '|'-separated list of XPath expressions that define the subsections of the XML stream to be emitted as separate records.
- Parameters:
forEachXpath - The XPath for which a record is emitted. Once the tag matching the XPath is encountered, the Node.parse method starts collecting the wanted fields, and at the close of the tag a record is emitted containing all fields collected since the tag start. Once a record is emitted, the collected fields are cleared. Any fields collected in the parent tag or above are also included in the record, but these are not cleared after the record is emitted. Multiple XPaths can be passed in using the ' | ' syntax of XPath.
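The emit-and-clear behaviour described for forEachXpath can be approximated with a small StAX loop. This is a sketch under stated assumptions, not the Node.parse implementation: ForEachSketch is a hypothetical class, it matches plain absolute element paths only (no attributes, predicates or // forms), it handles a single forEach path, and it reads text from leaf elements only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of Solr.
class ForEachSketch {

    // Emits one record per element whose absolute path equals forEachPath.
    // fieldPaths maps an absolute element path to the field name to record.
    static List<Map<String, Object>> read(String xml, String forEachPath,
                                          Map<String, String> fieldPaths) {
        List<Map<String, Object>> records = new ArrayList<>();
        Map<String, Object> outer = new HashMap<>(); // parent-level fields, never cleared
        Map<String, Object> inner = new HashMap<>(); // fields of the record being built
        StringBuilder path = new StringBuilder();
        StringBuilder text = new StringBuilder();
        boolean inRecord = false;
        try {
            XMLStreamReader in = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (in.hasNext()) {
                switch (in.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        path.append('/').append(in.getLocalName());
                        text.setLength(0);
                        if (path.toString().equals(forEachPath)) {
                            inRecord = true; // start collecting for a new record
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        text.append(in.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        String current = path.toString();
                        String field = fieldPaths.get(current);
                        if (field != null) {
                            (inRecord ? inner : outer).put(field, text.toString().trim());
                        }
                        if (current.equals(forEachPath)) {
                            // Emit at the close of the tag, then clear only the
                            // fields collected since the tag start.
                            Map<String, Object> record = new HashMap<>(outer);
                            record.putAll(inner);
                            records.add(record);
                            inner.clear();
                            inRecord = false;
                        }
                        path.setLength(path.lastIndexOf("/")); // pop the closed element
                        text.setLength(0);
                        break;
                    default:
                        break;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return records;
    }
}
```

With forEachPath set to /root/item and a field mapped at /root/source, every emitted record carries the source value because it was collected above the record tag, while fields collected inside each item are dropped once that item's record has been emitted.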
Method Detail
addField
public XPathRecordReader addField(String name, String xpath, boolean multiValued)
A wrapper around addField0 to create a series of Nodes based on the supplied XPath and a given fieldName. The created nodes are inserted into a Node tree.
- Parameters:
name - The name for this field in the emitted record
xpath - The XPath expression for this field
multiValued - If 'true' then the emitted record will have values in a List<String>
addField
public XPathRecordReader addField(String name, String xpath, boolean multiValued, int flags)
A wrapper around addField0 to create a series of Nodes based on the supplied XPath and a given fieldName. The created nodes are inserted into a Node tree.
- Parameters:
name - The name for this field in the emitted record
xpath - The XPath expression for this field
multiValued - If 'true' then the emitted record will have values in a List<String>
flags - FLATTEN: Recursively combine text from all child XML elements
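Because a record value is either a String or a List<String> depending on multiValued, code that reads records has to handle both shapes. The sketch below illustrates that rule with hypothetical helpers (RecordValueSketch, put and values are not Solr names), and it assumes that a single-valued field simply keeps the last value seen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr.
class RecordValueSketch {

    // Stores one extracted value under `name`, following the multiValued rule.
    @SuppressWarnings("unchecked")
    static void put(Map<String, Object> record, String name, String value,
                    boolean multiValued) {
        if (multiValued) {
            // Every occurrence is appended to a List<String>.
            ((List<String>) record.computeIfAbsent(name, k -> new ArrayList<String>()))
                    .add(value);
        } else {
            // Assumption: a single-valued field keeps the last value seen.
            record.put(name, value);
        }
    }

    // Reads a field back as a list regardless of how it was stored.
    @SuppressWarnings("unchecked")
    static List<String> values(Map<String, Object> record, String name) {
        Object v = record.get(name);
        if (v == null) {
            return new ArrayList<>();
        }
        if (v instanceof List) {
            return (List<String>) v;
        }
        List<String> single = new ArrayList<>();
        single.add((String) v);
        return single;
    }
}
```

Normalising through a helper such as values() keeps calling code free of instanceof checks when the same field name is configured as single-valued in one place and multi-valued in another.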
getAllRecords
public List<Map<String,Object>> getAllRecords(Reader r)
Uses streamRecords to parse the XML source, but with a handler that collects all the emitted records into a single List, which is returned upon completion.
- Parameters:
r - the stream reader
- Returns:
a List of the emitted records
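The relationship between the two entry points, a collect-all call layered on a streaming call, follows a common callback pattern. The sketch below shows that pattern only: RecordCallback, CollectSketch, stream and getAll are hypothetical names, the Iterable stands in for a real XML parse, and the actual XPathRecordReader.Handler interface may declare a different method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-record callback; the real
// XPathRecordReader.Handler interface may declare a different method.
interface RecordCallback {
    void onRecord(Map<String, Object> record);
}

// Hypothetical helper for illustration; not part of Solr.
class CollectSketch {

    // Stand-in for a streaming parse: hands each record to the callback in turn.
    static void stream(Iterable<Map<String, Object>> source, RecordCallback callback) {
        for (Map<String, Object> record : source) {
            callback.onRecord(record);
        }
    }

    // Collect-all built on top of the streaming call: the callback just
    // appends every record to a list that is returned at the end.
    static List<Map<String, Object>> getAll(Iterable<Map<String, Object>> source) {
        List<Map<String, Object>> all = new ArrayList<>();
        stream(source, all::add);
        return all;
    }
}
```

Collecting is convenient for small inputs, but it holds every record in memory at once; a streaming callback lets each record be processed and discarded as it arrives.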
streamRecords
public void streamRecords(Reader r, XPathRecordReader.Handler handler)
Creates an XML stream reader on top of whatever reader has been configured. It then calls parse() with a handler that is invoked for each record emitted.
- Parameters:
r - the stream reader
handler - The callback instance