org.apache.solr.handler.dataimport
Class LineEntityProcessor

java.lang.Object
  extended by org.apache.solr.handler.dataimport.EntityProcessor
      extended by org.apache.solr.handler.dataimport.EntityProcessorBase
          extended by org.apache.solr.handler.dataimport.LineEntityProcessor

public class LineEntityProcessor
extends EntityProcessorBase

An EntityProcessor instance which can stream lines of text read from a datasource. Options allow lines to be explicitly skipped or included in the index.

Attribute summary

Although envisioned for reading lines from a file or url, LineEntityProcessor may also be useful for dealing with change lists, where each line contains filenames which can be used by subsequent entities to parse content from those files.
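
The change-list use described above can be pictured with a small plain-Java sketch that does not touch the Solr API. The loop in readChangeList stands in for LineEntityProcessor emitting one row per line, and the loop in main stands in for a subsequent entity that would open the file named on each line. The class and method names (ChangeListSketch, readChangeList) and the sample paths are hypothetical; only the "rawLine" field name comes from this page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChangeListSketch {

    // Hypothetical helper: turns each line of a change list into a row
    // holding that line under "rawLine", as the outer entity would.
    public static List<Map<String, Object>> readChangeList(Reader source) throws IOException {
        List<Map<String, Object>> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Map<String, Object> row = new HashMap<>();
                row.put("rawLine", line);
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up change list: one filename per line.
        String changeList = "docs/a.xml\ndocs/b.xml\n";
        for (Map<String, Object> row : readChangeList(new StringReader(changeList))) {
            // A subsequent entity would open and parse the file named here.
            System.out.println("would parse: " + row.get("rawLine"));
        }
    }
}
```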

Refer to http://wiki.apache.org/solr/DataImportHandler for more details.

This API is experimental and may change in the future.

Since:
solr 1.4
See Also:
Pattern

Field Summary
static String ACCEPT_LINE_REGEX
          Holds the name of the entity attribute that will be parsed to obtain the pattern used to check whether a line should be returned.
static String SKIP_LINE_REGEX
          Holds the name of the entity attribute that will be parsed to obtain the pattern used to check whether a line should be ignored.
static String URL
          Holds the name of the entity attribute that will be parsed to obtain the filename containing the changelist.
 
Fields inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
ABORT, cacheSupport, context, CONTINUE, entityName, isFirstInit, ON_ERROR, onError, query, rowIterator, SKIP, SKIP_DOC, TRANSFORM_ROW, TRANSFORMER
 
Constructor Summary
LineEntityProcessor()
           
 
Method Summary
 void closeResources()
           
 void destroy()
          Invoked for each entity at the very end of the import to do any needed cleanup tasks.
 void init(Context context)
          Parses each of the entity attributes.
 Map<String,Object> nextRow()
          Reads lines from the url until it finds a line that matches the optional acceptLineRegex and does not match the optional skipLineRegex.
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
firstInit, getNext, initCache, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessor
close, postTransform
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

URL

public static final String URL
Holds the name of the entity attribute that will be parsed to obtain the filename containing the changelist.

See Also:
Constant Field Values

ACCEPT_LINE_REGEX

public static final String ACCEPT_LINE_REGEX
Holds the name of the entity attribute that will be parsed to obtain the pattern used to check whether a line should be returned.

See Also:
Constant Field Values

SKIP_LINE_REGEX

public static final String SKIP_LINE_REGEX
Holds the name of the entity attribute that will be parsed to obtain the pattern used to check whether a line should be ignored.

See Also:
Constant Field Values
Constructor Detail

LineEntityProcessor

public LineEntityProcessor()
Method Detail

init

public void init(Context context)
Parses each of the entity attributes.

Overrides:
init in class EntityProcessorBase
Parameters:
context - The current context
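
As a rough illustration of what parsing the entity attributes might involve, the sketch below reads three attributes from a plain Map and compiles the two optional regular expressions. It is not the Solr implementation: the class LineAttributesSketch is hypothetical, the Map stands in for the attributes reachable through Context, the attribute names "url", "acceptLineRegex" and "skipLineRegex" are assumed to be the values of the constants on this page, and treating "url" as required is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class LineAttributesSketch {
    public final String url;
    public final Pattern acceptLineRegex; // null when the attribute is absent
    public final Pattern skipLineRegex;   // null when the attribute is absent

    // attrs is a stand-in for the entity attributes; keys are assumed names.
    public LineAttributesSketch(Map<String, String> attrs) {
        url = attrs.get("url");
        if (url == null) {
            // Assumption: the input location must be supplied.
            throw new IllegalArgumentException("'url' attribute is missing");
        }
        acceptLineRegex = compileOrNull(attrs.get("acceptLineRegex"));
        skipLineRegex = compileOrNull(attrs.get("skipLineRegex"));
    }

    // Compiles the expression, or returns null if the attribute was not set.
    private static Pattern compileOrNull(String regex) {
        return regex == null ? null : Pattern.compile(regex);
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("url", "changes.txt");   // made-up location
        attrs.put("skipLineRegex", "^#");  // made-up pattern: ignore comment lines
        LineAttributesSketch parsed = new LineAttributesSketch(attrs);
        System.out.println(parsed.url + " accept=" + parsed.acceptLineRegex
                + " skip=" + parsed.skipLineRegex);
    }
}
```

Compiling each pattern once up front means the per-line check only has to create a matcher.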

nextRow

public Map<String,Object> nextRow()
Reads lines from the url until it finds a line that matches the optional acceptLineRegex and does not match the optional skipLineRegex.

Overrides:
nextRow in class EntityProcessorBase
Returns:
A row containing at least one field, "rawLine", or null to signal end of file. The rawLine is the line exactly as returned by readLine() from the url. Transformers can be used to create as many other fields as required.
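
The accept/skip behaviour described above can be sketched in plain Java as follows. This is an illustration, not the Solr source: the class NextRowSketch and its static nextRow signature are hypothetical, and the use of Matcher.find() (a match anywhere in the line) rather than matches() (the whole line) is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class NextRowSketch {

    // Returns the next accepted line as a row, or null at end of input.
    // Either pattern may be null, meaning that check is not applied.
    public static Map<String, Object> nextRow(BufferedReader reader,
                                              Pattern accept,
                                              Pattern skip) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            // Drop lines that fail the optional accept pattern.
            if (accept != null && !accept.matcher(line).find()) continue;
            // Drop lines that hit the optional skip pattern.
            if (skip != null && skip.matcher(line).find()) continue;
            Map<String, Object> row = new HashMap<>();
            row.put("rawLine", line);
            return row;
        }
        return null; // end of input
    }

    public static void main(String[] args) throws IOException {
        // Made-up input and patterns for illustration.
        String input = "# comment\na.xml\nb.txt\nc.xml\n";
        Pattern accept = Pattern.compile("\\.xml$");
        Pattern skip = Pattern.compile("^#");
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            Map<String, Object> row;
            while ((row = nextRow(reader, accept, skip)) != null) {
                System.out.println(row.get("rawLine"));
            }
        }
    }
}
```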

closeResources

public void closeResources()

destroy

public void destroy()
Description copied from class: EntityProcessor
Invoked for each entity at the very end of the import to do any needed cleanup tasks.

Overrides:
destroy in class EntityProcessorBase


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.