public class NRTCachingDirectory extends Directory
Wraps a RAMDirectory around any provided delegate directory, to
be used during NRT (near-real-time) search. Make sure you pull the merge
scheduler using getMergeScheduler() and pass that to your
IndexWriter; this class uses it to keep track of which
merges are being done by which threads, and so to decide when to
cache each written file.
This class is likely only useful in a near-real-time context, where the indexing rate is lowish but the reopen rate is highish, resulting in many tiny files being written. This directory keeps such segments (as well as the segments produced by merging them, as long as they are small enough) in RAM.
This is safe to use: when your app calls IndexWriter.commit(), all cached files will be flushed from the cache and sync'd.
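The flush-on-commit behaviour described above can be pictured with a small toy model. This is a minimal sketch and not Lucene code: the class CacheFlushSketch and its two maps are hypothetical stand-ins for the RAM cache and the delegate directory, and it assumes the flush is triggered through the sync(Collection<String>) method listed in the method summary.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (not Lucene code): files stay in a RAM map until sync() moves them to the delegate. */
final class CacheFlushSketch {
    // Hypothetical stand-ins for the RAM cache and the delegate directory.
    private final Map<String, byte[]> cache = new HashMap<>();
    private final Map<String, byte[]> delegate = new HashMap<>();
    // Names of files that have been handed to the delegate and marked durable.
    private final Set<String> synced = new HashSet<>();

    /** Writes a file into the RAM cache only. */
    void writeCached(String name, byte[] data) {
        cache.put(name, data);
    }

    /** Moves each named file out of the cache (if it is there), then marks it as synced. */
    void sync(Collection<String> fileNames) {
        for (String name : fileNames) {
            byte[] data = cache.remove(name);
            if (data != null) {
                delegate.put(name, data);
            }
            if (delegate.containsKey(name)) {
                synced.add(name);
            }
        }
    }

    boolean isCached(String name) { return cache.containsKey(name); }
    boolean isInDelegate(String name) { return delegate.containsKey(name); }
    boolean isSynced(String name) { return synced.contains(name); }
}
```

In this model a file is never both cached and in the delegate, so once sync returns, nothing named in the call depends on RAM alone.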
NOTE: this class is somewhat sneaky in how it spies on merges to determine their size: it records which threads are running which merges by watching ConcurrentMergeScheduler's doMerge method. While this works correctly, future versions of this class will likely take a more general approach.
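The thread-tracking idea in the note can be sketched in plain Java. This is an illustration under assumptions, not the class's actual implementation: MergeTrackerSketch, its doMerge signature, and the -1 "no merge" sentinel are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model (not Lucene code): remembers the estimated size of the merge each thread is running. */
final class MergeTrackerSketch {
    private final Map<Thread, Long> mergeBytesByThread = new ConcurrentHashMap<>();

    /** Runs the merge body while its estimated size is recorded against the current thread. */
    void doMerge(long estimatedBytes, Runnable mergeBody) {
        Thread current = Thread.currentThread();
        mergeBytesByThread.put(current, estimatedBytes);
        try {
            mergeBody.run();
        } finally {
            // Always forget the merge, even if the body throws.
            mergeBytesByThread.remove(current);
        }
    }

    /** Returns the estimated size of the merge on the calling thread, or -1 if it is not merging. */
    long currentMergeBytes() {
        Long bytes = mergeBytesByThread.get(Thread.currentThread());
        return bytes == null ? -1L : bytes;
    }
}
```

A file-creation call made from inside the merge body can then ask currentMergeBytes() to learn whether it is part of a merge and, if so, how large that merge is expected to be.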
Here's a simple example usage:
Directory fsDir = FSDirectory.open(new File("/path/to/index"));
NRTCachingDirectory cachedFSDir = new NRTCachingDirectory(fsDir, 5.0, 60.0);
IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_32, analyzer);
conf.setMergeScheduler(cachedFSDir.getMergeScheduler());
IndexWriter writer = new IndexWriter(cachedFSDir, conf);
This will cache all newly flushed segments and all merges whose expected segment size is <= 5 MB, unless the net cached bytes exceed 60 MB, at which point no writes will be cached (until the net bytes fall below 60 MB).
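The caching rule for the 5 MB / 60 MB example can be written out as a small decision function. This is a sketch only: CachePolicySketch and shouldCache are hypothetical names, a flush is represented by a merge size of -1, and 1 MB is assumed to mean 1024 * 1024 bytes.

```java
/** Toy model (not Lucene code) of the "cache this write?" decision. */
final class CachePolicySketch {
    // Assumption: 1 MB = 1024 * 1024 bytes.
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    private final long maxMergeSizeBytes;
    private final long maxCachedBytes;

    CachePolicySketch(double maxMergeSizeMB, double maxCachedMB) {
        this.maxMergeSizeBytes = (long) (maxMergeSizeMB * BYTES_PER_MB);
        this.maxCachedBytes = (long) (maxCachedMB * BYTES_PER_MB);
    }

    /**
     * @param mergeBytes  estimated size of the merged segment, or -1 if the write is a flush
     * @param cachedBytes bytes currently held in the cache
     * @return true if the new output should go to the RAM cache
     */
    boolean shouldCache(long mergeBytes, long cachedBytes) {
        boolean flushOrSmallMerge = mergeBytes < 0 || mergeBytes <= maxMergeSizeBytes;
        boolean cacheHasRoom = cachedBytes <= maxCachedBytes;
        return flushOrSmallMerge && cacheHasRoom;
    }
}
```

With new CachePolicySketch(5.0, 60.0), a flush or a 5 MB merge is cached while the cache holds 60 MB or less, whereas a 6 MB merge, or any write made while more than 60 MB is cached, goes straight to the delegate.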
Fields inherited from class Directory: isOpen, lockFactory

| Constructor and Description |
|---|
| NRTCachingDirectory(Directory delegate, double maxMergeSizeMB, double maxCachedMB): We will cache a newly created output if 1) it's a flush, or a merge whose estimated merged segment size is <= maxMergeSizeMB, and 2) the total cached bytes is <= maxCachedMB. |

| Modifier and Type | Method and Description |
|---|---|
| void | clearLock(String name): Attempt to clear (forcefully unlock and remove) the specified lock. |
| void | close(): Close this directory, which flushes any cached files to the delegate and then closes the delegate. |
| IndexOutput | createOutput(String name): Creates a new, empty file in the directory with the given name. |
| void | deleteFile(String name): Removes an existing file in the directory. |
| protected boolean | doCacheWrite(String name): Subclass can override this to customize logic; return true if this file should be written to the RAMDirectory. |
| boolean | fileExists(String name): Returns true iff a file with the given name exists. |
| long | fileLength(String name): Returns the length of a file in the directory. |
| long | fileModified(String name): Returns the time the named file was last modified. |
| LockFactory | getLockFactory(): Get the LockFactory that this Directory instance is using for its locking implementation. |
| String | getLockID(): Return a string identifier that uniquely differentiates this Directory instance from other Directory instances. |
| MergeScheduler | getMergeScheduler() |
| String[] | listAll(): Returns an array of strings, one for each file in the directory. |
| String[] | listCachedFiles() |
| Lock | makeLock(String name): Construct a Lock. |
| IndexInput | openInput(String name): Returns a stream reading an existing file. |
| IndexInput | openInput(String name, int bufferSize): Returns a stream reading an existing file, with the specified read buffer size. |
| void | setLockFactory(LockFactory lf): Set the LockFactory that this Directory instance should use for its locking implementation. |
| long | sizeInBytes(): Returns how many bytes are being used by the RAMDirectory cache. |
| void | sync(Collection<String> fileNames): Ensure that any writes to these files are moved to stable storage. |
| String | toString() |
| void | touchFile(String name): Deprecated. |
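
Methods inherited from class Directory: copy, copy, ensureOpen, sync

public NRTCachingDirectory(Directory delegate, double maxMergeSizeMB, double maxCachedMB)

public LockFactory getLockFactory()
Description copied from class Directory: Get the LockFactory that this Directory instance is using for its locking implementation.
getLockFactory in class Directory

public void setLockFactory(LockFactory lf) throws IOException
Description copied from class Directory: Set the LockFactory that this Directory instance should use for its locking implementation.
setLockFactory in class Directory
Parameters: lf - instance of LockFactory.
Throws: IOException

public String getLockID()
Description copied from class Directory: Return a string identifier that uniquely differentiates this Directory instance from other Directory instances.

public Lock makeLock(String name)
Description copied from class Directory: Construct a Lock.

public void clearLock(String name) throws IOException
Description copied from class Directory: Attempt to clear (forcefully unlock and remove) the specified lock.
clearLock in class Directory
Parameters: name - name of the lock to be cleared.
Throws: IOException

public String[] listAll() throws IOException
Description copied from class Directory: Returns an array of strings, one for each file in the directory.
listAll in class Directory
Throws: NoSuchDirectoryException - if the directory is not prepared for any write operations (such as Directory.createOutput(String)).
Throws: IOException - in case of other IO errors

public long sizeInBytes()
Returns how many bytes are being used by the RAMDirectory cache.

public boolean fileExists(String name) throws IOException
Description copied from class Directory: Returns true iff a file with the given name exists.
fileExists in class Directory
Throws: IOException

public long fileModified(String name) throws IOException
Description copied from class Directory: Returns the time the named file was last modified.
fileModified in class Directory
Throws: IOException

@Deprecated public void touchFile(String name) throws IOException
Deprecated.
touchFile in class Directory
Throws: IOException

public void deleteFile(String name) throws IOException
Description copied from class Directory: Removes an existing file in the directory.
deleteFile in class Directory
Throws: IOException

public long fileLength(String name) throws IOException
Description copied from class Directory: Returns the length of a file in the directory. Throws FileNotFoundException if the file does not exist.
fileLength in class Directory
Parameters: name - the name of the file for which to return the length.
Throws: FileNotFoundException - if the file does not exist.
Throws: IOException - if there was an IO error while retrieving the file's length.

public String[] listCachedFiles()

public IndexOutput createOutput(String name) throws IOException
Description copied from class Directory: Creates a new, empty file in the directory with the given name.
createOutput in class Directory
Throws: IOException

public void sync(Collection<String> fileNames) throws IOException
Description copied from class Directory: Ensure that any writes to these files are moved to stable storage.
sync in class Directory
Throws: IOException

public IndexInput openInput(String name) throws IOException
Description copied from class Directory: Returns a stream reading an existing file.
openInput in class Directory
Throws: IOException

public IndexInput openInput(String name, int bufferSize) throws IOException
Description copied from class Directory: Returns a stream reading an existing file, with the specified read buffer size. The particular Directory implementation may ignore the buffer size. Currently the only Directory implementations that respect this parameter are FSDirectory and CompoundFileReader.
openInput in class Directory
Throws: IOException

public void close() throws IOException
Close this directory, which flushes any cached files to the delegate and then closes the delegate.
close in interface Closeable
close in class Directory
Throws: IOException

public MergeScheduler getMergeScheduler()

protected boolean doCacheWrite(String name)
Subclass can override this to customize logic; return true if this file should be written to the RAMDirectory.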