Class HdfsTransactionLog

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class HdfsTransactionLog
    extends TransactionLog
    Log Format: List{Operation, Version, ...}

      ADD, VERSION, DOC
      DELETE, VERSION, ID_BYTES
      DELETE_BY_QUERY, VERSION, String

    TODO: keep two files, one for [operation, version, id] and the other for the actual document data. That way we could throw away document log files more readily while retaining the smaller operation log files longer (and we can retrieve the stored fields of the latest documents from the index). This would require keeping all source fields stored, of course.

    This would also allow us to skip logging document data for requests with commit=true (since we know that if the request succeeds, all docs will be committed).
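
    The three record layouts named in the log format (ADD, DELETE, DELETE_BY_QUERY) can be illustrated with a minimal sketch. It models each record as an in-memory list of [operation, version, payload] and does not touch HDFS or the real on-disk encoding. The class LogRecordSketch, the Op enum, and the helper methods are hypothetical names introduced only for this illustration; they are not part of HdfsTransactionLog or TransactionLog, and the actual numeric operation codes and serialization are not described on this page.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Illustrative only: models the logical shape List{Operation, Version, ...}.
// None of these names exist in the real class; they are assumptions.
public class LogRecordSketch {

    // Hypothetical operation tags standing in for the real operation codes.
    public enum Op { ADD, DELETE, DELETE_BY_QUERY }

    // ADD, VERSION, DOC
    public static List<Object> add(long version, Map<String, Object> doc) {
        return List.of(Op.ADD, version, doc);
    }

    // DELETE, VERSION, ID_BYTES
    public static List<Object> delete(long version, String id) {
        return List.of(Op.DELETE, version, id.getBytes(StandardCharsets.UTF_8));
    }

    // DELETE_BY_QUERY, VERSION, String
    public static List<Object> deleteByQuery(long version, String query) {
        return List.of(Op.DELETE_BY_QUERY, version, query);
    }

    // Reads a record back by position: operation first, version second,
    // then an operation-specific payload.
    public static String describe(List<Object> record) {
        Op op = (Op) record.get(0);
        long version = (Long) record.get(1);
        Object payload = record.get(2);
        switch (op) {
            case ADD:
                return "ADD v" + version + " doc=" + payload;
            case DELETE:
                return "DELETE v" + version + " id="
                        + new String((byte[]) payload, StandardCharsets.UTF_8);
            case DELETE_BY_QUERY:
                return "DELETE_BY_QUERY v" + version + " q=" + payload;
            default:
                throw new IllegalArgumentException("Unknown operation: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(add(1L, Map.of("id", "doc1"))));
        System.out.println(describe(delete(2L, "doc1")));
        System.out.println(describe(deleteByQuery(3L, "*:*")));
    }
}
```

    Only the ADD record carries the full document, which is why the TODO proposes splitting document data into a separate file: the DELETE and DELETE_BY_QUERY payloads stay small regardless of document size.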