Class FileListEntityProcessor


  • public class FileListEntityProcessor
    extends EntityProcessorBase

    An EntityProcessor that streams the names of files found under a given base directory that match the configured patterns, returning one row of file information per matching file.

    It supports querying a given base directory by:

    • matching file names against a regular expression
    • excluding files whose names match a regular expression
    • last modification date (newer or older than a given date or time)
    • size (bigger or smaller than a given size in bytes)
    • recursively iterating through sub-directories
    Its output can be used along with FileDataSource to read from files in file systems.
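
The filters listed above can be pictured with a minimal sketch that uses only plain JDK calls. It is not the Solr implementation: the class `FileListSketch`, the method `listFiles`, its parameter names, and the row keys (`name`, `path`, `size`, `lastModified`) are all hypothetical, and the use of `Matcher.find()` for name matching is an assumption. Date filtering is left out for brevity and would follow the same pattern as the size check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: mimics the documented filters with plain JDK calls. */
final class FileListSketch {

    /**
     * Lists regular files under baseDir whose names match include, do not
     * match exclude (which may be null), and whose size lies strictly between
     * biggerThan and smallerThan bytes. Row keys are hypothetical.
     */
    static List<Map<String, Object>> listFiles(Path baseDir, Pattern include, Pattern exclude,
                                               long biggerThan, long smallerThan,
                                               boolean recursive) throws IOException {
        // Depth 1 visits only the direct children of baseDir.
        int depth = recursive ? Integer.MAX_VALUE : 1;
        List<Map<String, Object>> rows = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(baseDir, depth)) {
            List<Path> files = paths.filter(Files::isRegularFile)
                                    .sorted()
                                    .collect(Collectors.toList());
            for (Path p : files) {
                String name = p.getFileName().toString();
                if (!include.matcher(name).find()) continue;                    // name must match
                if (exclude != null && exclude.matcher(name).find()) continue;  // and not be excluded
                long size = Files.size(p);
                if (size <= biggerThan || size >= smallerThan) continue;        // size window
                Map<String, Object> row = new LinkedHashMap<>();
                row.put("name", name);
                row.put("path", p.toAbsolutePath().toString());
                row.put("size", size);
                row.put("lastModified", Files.getLastModifiedTime(p).toMillis());
                rows.add(row);
            }
        }
        return rows;
    }
}
```

Each returned map plays the role of one row. In this sketch the whole list is built up front, whereas the processor itself is described as streaming rows one at a time.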

    Refer to http://wiki.apache.org/solr/DataImportHandler for more details.

    This API is experimental and may change in the future.

    Since:
    solr 1.3
    See Also:
    Pattern
    • Constructor Detail

      • FileListEntityProcessor

        public FileListEntityProcessor()
    • Method Detail

      • init

        public void init(Context context)
        Description copied from class: EntityProcessor
        This method is called when processing of an entity starts. It is called again each time processing returns to the entity, so any state can be reset at that point. For a root-most entity it is called only once per ingestion. For sub-entities, it is called once for each row from the parent entity.
        Overrides:
        init in class EntityProcessorBase
        Parameters:
        context - The current context
      • nextRow

        public Map<String,Object> nextRow()
        Description copied from class: EntityProcessorBase
        For a simple implementation, this is the only method that the sub-class should implement. It is intended to stream rows one by one. Return null to signal the end of rows.
        Overrides:
        nextRow in class EntityProcessorBase
        Returns:
        a row where the key is the name of the field and the value can be any Object or a Collection of objects. Return null to signal the end of rows.
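
The init and nextRow contract described above (reset state on init, return one row per call, return null at the end) can be sketched without any Solr dependency. The class `RowStreamSketch`, its no-argument `init()`, and the `drain` helper are hypothetical stand-ins used only to illustrate the calling pattern. The real `init` takes a `Context`, which is omitted here.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an entity processor; not the Solr class. */
final class RowStreamSketch {
    private final List<String> names;
    private Iterator<String> it;

    RowStreamSketch(List<String> names) {
        this.names = names;
    }

    /** Resets per-entity state, so the rows can be streamed again from the start. */
    void init() {
        it = names.iterator();
    }

    /** Returns the next row, or null once the rows are exhausted. */
    Map<String, Object> nextRow() {
        if (it == null || !it.hasNext()) {
            return null;
        }
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", it.next());
        return row;
    }

    /** Drains the processor the way a caller might: init, then loop until null. */
    static int drain(RowStreamSketch p) {
        p.init();
        int count = 0;
        while (p.nextRow() != null) {
            count++;
        }
        return count;
    }
}
```

Calling `drain` twice on the same instance yields the same count both times, which mirrors the description of init being called again for each parent row of a sub-entity.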