org.apache.solr.handler.dataimport
Class FileListEntityProcessor

java.lang.Object
  extended by org.apache.solr.handler.dataimport.EntityProcessor
      extended by org.apache.solr.handler.dataimport.EntityProcessorBase
          extended by org.apache.solr.handler.dataimport.FileListEntityProcessor

public class FileListEntityProcessor
extends EntityProcessorBase

An EntityProcessor that streams the names of files found under a given base directory, keeps those that match the configured patterns, and returns rows containing file information.

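It supports querying a given base directory by file name pattern (fileName), exclusion pattern (excludes), last-modified date (newerThan, olderThan), and file size (biggerThan, smallerThan), optionally descending into subdirectories (recursive).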

Its output can be used along with FileDataSource to read from files in file systems.

Refer to http://wiki.apache.org/solr/DataImportHandler for more details.
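
The filtering criteria exposed as fields on this page (name pattern, excludes, size bounds, modification dates) can be pictured as a single predicate applied to each candidate file. The following is a minimal standalone sketch, not the Solr implementation: the class name FileCriteriaSketch, the use of -1 as the "unset" marker, the use of find() for regex matching, and the exclusive treatment of the size bounds are all assumptions made for illustration.

```java
import java.util.Date;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: combines the criteria into one test.
class FileCriteriaSketch {
    final Pattern fileName;   // null means "accept any name"
    final Pattern excludes;   // null means "exclude nothing"
    long biggerThan = -1;     // -1 means "unset" (assumption)
    long smallerThan = -1;    // -1 means "unset" (assumption)
    Date newerThan;           // null means "unset"
    Date olderThan;           // null means "unset"

    FileCriteriaSketch(String fileNameRegex, String excludesRegex) {
        this.fileName = fileNameRegex == null ? null : Pattern.compile(fileNameRegex);
        this.excludes = excludesRegex == null ? null : Pattern.compile(excludesRegex);
    }

    /** Returns true if a file with these attributes passes every configured criterion. */
    boolean accept(String name, long size, Date lastModified) {
        if (fileName != null && !fileName.matcher(name).find()) return false;
        if (excludes != null && excludes.matcher(name).find()) return false;
        if (biggerThan != -1 && size <= biggerThan) return false;   // bound treated as exclusive (assumption)
        if (smallerThan != -1 && size >= smallerThan) return false; // bound treated as exclusive (assumption)
        if (newerThan != null && lastModified.before(newerThan)) return false;
        if (olderThan != null && lastModified.after(olderThan)) return false;
        return true;
    }
}
```

A caller would build one such object from the entity's attributes and apply accept() to every file it visits; criteria left unset simply do not filter.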

This API is experimental and may change in the future.

Since:
solr 1.3
See Also:
Pattern

Field Summary
static String ABSOLUTE_FILE
           
static String BASE_DIR
           
protected  String baseDir
          The baseDir given in data-config.xml after resolving any variables
static String BIGGER_THAN
           
protected  long biggerThan
          The biggerThan given in data-config as a long value
static String DIR
           
protected  String excludes
          A Regex pattern of excluded file names as given in data-config.xml after resolving any variables
static String EXCLUDES
           
static String FILE
           
static String FILE_NAME
           
protected  String fileName
          A regex pattern to identify files given in data-config.xml after resolving any variables
static String LAST_MODIFIED
           
static String NEWER_THAN
           
protected  Date newerThan
          The newerThan given in data-config as a Date
static String OLDER_THAN
           
protected  Date olderThan
          The olderThan given in data-config as a Date
static Pattern PLACE_HOLDER_PATTERN
           
protected  boolean recursive
          The recursive given in data-config.
static String RECURSIVE
           
static String SIZE
           
static String SMALLER_THAN
           
protected  long smallerThan
          The smallerThan given in data-config as a long value
 
Fields inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
ABORT, cacheSupport, context, CONTINUE, entityName, isFirstInit, ON_ERROR, onError, query, rowIterator, SKIP, SKIP_DOC, TRANSFORM_ROW, TRANSFORMER
 
Constructor Summary
FileListEntityProcessor()
           
 
Method Summary
 void init(Context context)
          This method is called when it starts processing an entity.
 Map<String,Object> nextRow()
          For a simple implementation, this is the only method that the sub-class should implement.
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
destroy, firstInit, getNext, initCache, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessor
close, postTransform
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fileName

protected String fileName
A regex pattern to identify files given in data-config.xml after resolving any variables


baseDir

protected String baseDir
The baseDir given in data-config.xml after resolving any variables


excludes

protected String excludes
A Regex pattern of excluded file names as given in data-config.xml after resolving any variables


newerThan

protected Date newerThan
The newerThan given in data-config as a Date

Note: This variable is resolved just-in-time in the nextRow() method.
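
Just-in-time resolution means the attribute text is turned into a concrete value each time it is needed, so it can depend on variables that change between calls. The sketch below illustrates the idea only and is not the Solr code: the ${name} placeholder syntax, the yyyy-MM-dd HH:mm:ss date format, and the class LazyDateSketch are assumptions, since this page shows neither the value of PLACE_HOLDER_PATTERN nor the accepted date formats.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr.
class LazyDateSketch {
    // Assumed placeholder syntax: ${variableName}
    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.*?)\\}");

    /** Replaces each ${name} with its current value; unknown names become empty strings. */
    static String resolve(String raw, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(raw);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            m.appendReplacement(sb, Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Parses an already-resolved string using the assumed date format. */
    static Date parse(String resolved) {
        try {
            return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT).parse(resolved);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date: " + resolved, e);
        }
    }
}
```

Calling resolve() and then parse() at the point of use, rather than once at start-up, is what lets the threshold follow the current variable values.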


olderThan

protected Date olderThan
The olderThan given in data-config as a Date


biggerThan

protected long biggerThan
The biggerThan given in data-config as a long value

Note: This variable is resolved just-in-time in the nextRow() method.


smallerThan

protected long smallerThan
The smallerThan given in data-config as a long value

Note: This variable is resolved just-in-time in the nextRow() method.


recursive

protected boolean recursive
The recursive given in data-config. Default value is false.
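
The effect of the recursive flag can be pictured with a plain directory walk. This is a standalone sketch using only java.io.File, not the Solr implementation; the class and method names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr.
class DirectoryWalkSketch {
    /** Lists regular files under baseDir; descends into subdirectories only when recursive is true. */
    static List<File> listFiles(File baseDir, boolean recursive) {
        List<File> out = new ArrayList<>();
        walk(baseDir, recursive, out);
        return out;
    }

    private static void walk(File dir, boolean recursive, List<File> out) {
        File[] children = dir.listFiles();
        if (children == null) return; // not a directory, or not readable
        for (File child : children) {
            if (child.isDirectory()) {
                if (recursive) walk(child, true, out);
            } else {
                out.add(child);
            }
        }
    }
}
```

With recursive left at false, only files directly inside the base directory are listed; subdirectories are skipped entirely.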


PLACE_HOLDER_PATTERN

public static final Pattern PLACE_HOLDER_PATTERN

DIR

public static final String DIR
See Also:
Constant Field Values

FILE

public static final String FILE
See Also:
Constant Field Values

ABSOLUTE_FILE

public static final String ABSOLUTE_FILE
See Also:
Constant Field Values

SIZE

public static final String SIZE
See Also:
Constant Field Values

LAST_MODIFIED

public static final String LAST_MODIFIED
See Also:
Constant Field Values

FILE_NAME

public static final String FILE_NAME
See Also:
Constant Field Values

BASE_DIR

public static final String BASE_DIR
See Also:
Constant Field Values

EXCLUDES

public static final String EXCLUDES
See Also:
Constant Field Values

NEWER_THAN

public static final String NEWER_THAN
See Also:
Constant Field Values

OLDER_THAN

public static final String OLDER_THAN
See Also:
Constant Field Values

BIGGER_THAN

public static final String BIGGER_THAN
See Also:
Constant Field Values

SMALLER_THAN

public static final String SMALLER_THAN
See Also:
Constant Field Values

RECURSIVE

public static final String RECURSIVE
See Also:
Constant Field Values
Constructor Detail

FileListEntityProcessor

public FileListEntityProcessor()
Method Detail

init

public void init(Context context)
Description copied from class: EntityProcessor
This method is called when the processor starts processing an entity. When processing returns to the entity, it is called again, so any state can be reset at that point. For a rootmost entity this is called only once per ingestion. For sub-entities, it is called once for each row from the parent entity.

Overrides:
init in class EntityProcessorBase
Parameters:
context - The current context

nextRow

public Map<String,Object> nextRow()
Description copied from class: EntityProcessorBase
For a simple implementation, this is the only method that the sub-class should implement. This is intended to stream rows one-by-one. Return null to signal end of rows

Overrides:
nextRow in class EntityProcessorBase
Returns:
a row where the key is the name of the field and the value can be any Object or a Collection of objects, or null to signal the end of rows
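
The init/nextRow contract described above (init resets state, nextRow streams one row per call and returns null when exhausted) can be sketched without any Solr classes. The class RowStreamSketch and the row key "file" are assumptions for illustration; this page does not show the values of the row-key constants.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an entity processor, not part of Solr.
class RowStreamSketch {
    private final List<String> fileNames;
    private Iterator<String> it;

    RowStreamSketch(List<String> fileNames) {
        this.fileNames = fileNames;
    }

    /** Resets the stream, as would happen each time processing (re)enters the entity. */
    void init() {
        it = fileNames.iterator();
    }

    /** Returns the next row, or null once all rows have been delivered. */
    Map<String, Object> nextRow() {
        if (it == null || !it.hasNext()) return null;
        Map<String, Object> row = new HashMap<>();
        row.put("file", it.next()); // key name is an assumption
        return row;
    }

    /** Pulls rows until null is returned and reports how many were seen. */
    static int drain(RowStreamSketch p) {
        int n = 0;
        while (p.nextRow() != null) n++;
        return n;
    }
}
```

A driver calls init() once, loops on nextRow() until it returns null, and calls init() again if the same entity is processed for another parent row.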


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.