org.apache.solr.handler.dataimport
Class XPathEntityProcessor

java.lang.Object
  extended by org.apache.solr.handler.dataimport.EntityProcessor
      extended by org.apache.solr.handler.dataimport.EntityProcessorBase
          extended by org.apache.solr.handler.dataimport.XPathEntityProcessor

public class XPathEntityProcessor
extends EntityProcessorBase

An implementation of EntityProcessor that uses a streaming XPath parser to extract values from XML documents. It is typically used in conjunction with URLDataSource or FileDataSource.
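
The streaming idea can be illustrated with the JDK's StAX API. This is only a sketch under assumptions: it is not XPathRecordReader and does not evaluate XPath expressions. The class name StreamingRowSketch, the method readRows and the forEachElement parameter are hypothetical, and child elements are assumed to contain text only.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration; not part of Solr.
public class StreamingRowSketch {

    /**
     * Walks the XML once, front to back, and builds one row per element
     * named forEachElement. Each text-only child element becomes a field.
     */
    public static List<Map<String, Object>> readRows(Reader xml, String forEachElement)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        List<Map<String, Object>> rows = new ArrayList<>();
        Map<String, Object> current = null; // row being built, or null outside a record
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals(forEachElement)) {
                    current = new LinkedHashMap<>();
                } else if (current != null) {
                    // Assumption: the child holds text only; nested markup would throw.
                    current.put(name, reader.getElementText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && reader.getLocalName().equals(forEachElement)) {
                rows.add(current);
                current = null;
            }
        }
        reader.close();
        return rows;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<feed><item><title>A</title><link>u1</link></item>"
                + "<item><title>B</title><link>u2</link></item></feed>";
        for (Map<String, Object> row : readRows(new StringReader(xml), "item")) {
            System.out.println(row);
        }
    }
}
```

Because the document is read as a stream of events, only the row currently being built has to be held while parsing; the sketch collects rows into a list purely to keep the example short.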

Refer to http://wiki.apache.org/solr/DataImportHandler for more details.

This API is experimental and may change in the future.

Since:
solr 1.3
See Also:
XPathRecordReader

Field Summary
protected  int blockingQueueSize
           
protected  int blockingQueueTimeOut
           
protected  TimeUnit blockingQueueTimeOutUnits
           
static String COMMON_FIELD
           
protected  List<String> commonFields
           
protected  DataSource<Reader> dataSource
           
static String FOR_EACH
           
static String HAS_MORE
           
static String NEXT_URL
           
protected  List<String> placeHolderVariables
           
protected  Thread publisherThread
           
static String STREAM
           
protected  boolean streamRows
           
static String URL
           
static String USE_SOLR_ADD_SCHEMA
           
protected  boolean useSolrAddXml
           
static String XPATH
           
static String XPATH_FIELD_NAME
           
static String XSL
           
protected  Transformer xslTransformer
           
 
Fields inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
ABORT, cacheSupport, context, CONTINUE, entityName, isFirstInit, ON_ERROR, onError, query, rowIterator, SKIP, SKIP_DOC, TRANSFORM_ROW, TRANSFORMER
 
Constructor Summary
XPathEntityProcessor()
           
 
Method Summary
 void init(Context context)
          This method is called when it starts processing an entity.
 Map<String,Object> nextRow()
          For a simple implementation, this is the only method that the sub-class should implement.
 void postTransform(Map<String,Object> r)
          Invoked after the transformers are invoked.
protected  Map<String,Object> readRow(Map<String,Object> record, String xpath)
           
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
destroy, firstInit, getNext, initCache, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessor
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

placeHolderVariables

protected List<String> placeHolderVariables

commonFields

protected List<String> commonFields

dataSource

protected DataSource<Reader> dataSource

xslTransformer

protected Transformer xslTransformer

useSolrAddXml

protected boolean useSolrAddXml

streamRows

protected boolean streamRows

blockingQueueTimeOut

protected int blockingQueueTimeOut

blockingQueueTimeOutUnits

protected TimeUnit blockingQueueTimeOutUnits

blockingQueueSize

protected int blockingQueueSize

publisherThread

protected Thread publisherThread

URL

public static final String URL
See Also:
Constant Field Values

HAS_MORE

public static final String HAS_MORE
See Also:
Constant Field Values

NEXT_URL

public static final String NEXT_URL
See Also:
Constant Field Values

XPATH_FIELD_NAME

public static final String XPATH_FIELD_NAME
See Also:
Constant Field Values

FOR_EACH

public static final String FOR_EACH
See Also:
Constant Field Values

XPATH

public static final String XPATH
See Also:
Constant Field Values

COMMON_FIELD

public static final String COMMON_FIELD
See Also:
Constant Field Values

USE_SOLR_ADD_SCHEMA

public static final String USE_SOLR_ADD_SCHEMA
See Also:
Constant Field Values

XSL

public static final String XSL
See Also:
Constant Field Values

STREAM

public static final String STREAM
See Also:
Constant Field Values
Constructor Detail

XPathEntityProcessor

public XPathEntityProcessor()
Method Detail

init

public void init(Context context)
Description copied from class: EntityProcessor
This method is called when processing of an entity starts. It is called again each time processing returns to the entity, so the implementation can reset any state at that point. For the rootmost entity it is called only once per ingestion. For sub-entities, it is called once for each row from the parent entity.

Overrides:
init in class EntityProcessorBase
Parameters:
context - The current context

nextRow

public Map<String,Object> nextRow()
Description copied from class: EntityProcessorBase
For a simple implementation, this is the only method that the sub-class should implement. It is intended to stream rows one by one. Return null to signal the end of rows.

Overrides:
nextRow in class EntityProcessorBase
Returns:
a row where the key is the name of the field and the value can be any Object or a Collection of objects. Return null to signal the end of rows.
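
The null-terminated row contract can be sketched without any Solr types. Everything here is hypothetical (the RowSource interface, fromIterator and drain); it only illustrates how a caller might pull rows until null is returned, and that a field value may be a single Object or a Collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of Solr.
public class NextRowContractSketch {

    /** Stand-in for the nextRow() contract: one row per call, null at the end. */
    public interface RowSource {
        Map<String, Object> nextRow();
    }

    /** Adapts an iterator to the contract by returning null once it is exhausted. */
    public static RowSource fromIterator(Iterator<Map<String, Object>> rows) {
        return () -> rows.hasNext() ? rows.next() : null;
    }

    /** Pulls rows one at a time until the source signals the end with null. */
    public static List<Map<String, Object>> drain(RowSource source) {
        List<Map<String, Object>> result = new ArrayList<>();
        Map<String, Object> row;
        while ((row = source.nextRow()) != null) {
            result.add(row);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new HashMap<>();
        first.put("title", "A");                       // single-valued field
        first.put("tags", Arrays.asList("x", "y"));    // multi-valued field
        Map<String, Object> second = new HashMap<>();
        second.put("title", "B");

        RowSource source = fromIterator(Arrays.asList(first, second).iterator());
        System.out.println(drain(source).size() + " rows read");
    }
}
```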

postTransform

public void postTransform(Map<String,Object> r)
Description copied from class: EntityProcessor
Invoked after the transformers are invoked. EntityProcessors can add, remove or modify values added by Transformers in this method.

Overrides:
postTransform in class EntityProcessor
Parameters:
r - The transformed row
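
As a rough illustration of adding, removing and modifying values in a transformed row, the sketch below works on a plain Map. The class PostTransformSketch, the "$control" key and the "title" and "processed" fields are hypothetical and chosen only for the example; this is not what XPathEntityProcessor itself does in this method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of Solr.
public class PostTransformSketch {

    // Assumed name of a transient key that should not stay in the row.
    static final String CONTROL_KEY = "$control";

    /** Mutates the transformed row in place, mirroring a void postTransform hook. */
    public static void postTransform(Map<String, Object> row) {
        // Remove: drop a transient value a transformer may have left behind.
        row.remove(CONTROL_KEY);

        // Modify: normalize an existing value if it is a String.
        Object title = row.get("title");
        if (title instanceof String) {
            row.put("title", ((String) title).trim());
        }

        // Add: record that the row has passed through this step.
        row.put("processed", Boolean.TRUE);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("title", "  Hello  ");
        row.put(CONTROL_KEY, "true");
        postTransform(row);
        System.out.println(row);
    }
}
```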

readRow

protected Map<String,Object> readRow(Map<String,Object> record,
                                     String xpath)


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.