Class MMapDirectory
- All Implemented Interfaces: Closeable, AutoCloseable
Directory implementation that uses mmap for reading, and FSDirectory.FSIndexOutput for writing.
NOTE: memory mapping uses up a portion of the virtual memory address space in your process equal to the size of the file being mapped. Before using this class, be sure you have plenty of virtual address space, e.g. by using a 64 bit JRE, or a 32 bit JRE with indexes that are guaranteed to fit within the address space. On 32 bit platforms also consult MMapDirectory(Path, LockFactory, long) if you have problems with mmap failing because of fragmented address space. If you get an IOException saying that mapping failed, reduce the chunk size until it works.
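To illustrate the address-space cost described in the note, here is a minimal sketch that uses only the JDK's FileChannel.map. It is not Lucene's implementation: the class ChunkedMapSketch, the method mapInChunks and the 4096-byte chunk size are hypothetical names and values chosen for illustration. It shows a file being mapped as a series of read-only regions, each no larger than the chunk size, whose sizes add up to the file length.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: maps a file in fixed-size chunks.
final class ChunkedMapSketch {

  /** Maps the whole file read-only, using regions of at most chunkSize bytes. */
  static List<MappedByteBuffer> mapInChunks(Path file, long chunkSize) {
    if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("chunkSize must be in 1..Integer.MAX_VALUE");
    }
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      long length = channel.size();
      List<MappedByteBuffer> chunks = new ArrayList<>();
      for (long offset = 0; offset < length; offset += chunkSize) {
        long size = Math.min(chunkSize, length - offset);
        // Each mapping reserves `size` bytes of virtual address space.
        chunks.add(channel.map(FileChannel.MapMode.READ_ONLY, offset, size));
      }
      return chunks; // mappings remain valid after the channel is closed
    } catch (IOException e) {
      // A failed mapping surfaces as an IOException; a smaller chunkSize may help.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("mmap-sketch", ".bin");
    Files.write(file, new byte[10_000]);
    List<MappedByteBuffer> chunks = mapInChunks(file, 4096);
    System.out.println(chunks.size() + " chunks"); // prints "3 chunks"
  }
}
```

A smaller chunk size produces more, smaller mappings, which is why reducing it can help when a single large contiguous region cannot be found in a fragmented address space.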
This class supports preloading files into physical memory upon opening. This can help improve performance of searches on a cold page cache at the expense of slowing down opening an index. See setPreload(BiPredicate) for more details.
This class supports grouping of files that are part of the same logical group. This is a hint
that allows for better handling of resources. For example, individual files that are part of the
same segment can be considered part of the same logical group. See setGroupingFunction(Function)
for more details.
This class uses the modern MemorySegment preview API, available since Java 21, which allows previously mmapped files to be safely unmapped after closing the IndexInputs. There is no need to enable the "preview feature" of your Java version; it works out of the box with some compilation tricks. For more information about the foreign memory API, read the documentation of the java.lang.foreign package.
On some platforms like Linux and MacOS X, this class will invoke the syscall madvise() to advise how the OS kernel should handle paging after opening a file. For this to work, Java code must be able to call native code. If this is not allowed, a warning is logged. To enable native access for Lucene in a modularized application, pass --enable-native-access=org.apache.lucene.core to the Java command line. If Lucene is running in a classpath-based application, use --enable-native-access=ALL-UNNAMED.
NOTE: Accessing this class either directly or indirectly from a thread while it's interrupted can close the underlying channel immediately if at the same time the thread is blocked on IO. The channel will remain closed and subsequent access to MMapDirectory will throw a ClosedChannelException. If your application uses either Thread.interrupt() or Future.cancel(boolean), you should use the legacy RAFDirectory from the Lucene misc module instead of MMapDirectory.
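The interrupt behavior described in this note comes from the JDK's interruptible channels and can be reproduced without Lucene. The following is a minimal sketch under that assumption; the class InterruptClosesChannelSketch and the method readWhileInterrupted are hypothetical names. It sets the interrupt flag before a FileChannel read, which closes the channel, and then shows that a later read fails even though no interrupt is pending.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// JDK-only illustration (no Lucene classes) of interrupt-triggered channel closing.
final class InterruptClosesChannelSketch {

  /** Returns the simple names of the exceptions thrown by two consecutive reads. */
  static String[] readWhileInterrupted() {
    try {
      Path file = Files.createTempFile("interrupt-sketch", ".bin");
      try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
        String first = "none";
        String second = "none";
        Thread.currentThread().interrupt(); // simulate an interrupt arriving before the I/O call
        try {
          channel.read(ByteBuffer.allocate(16));
        } catch (ClosedByInterruptException e) {
          first = e.getClass().getSimpleName(); // the channel is now closed
        } finally {
          Thread.interrupted(); // clear the flag so later code is unaffected
        }
        try {
          channel.read(ByteBuffer.allocate(16));
        } catch (ClosedChannelException e) {
          second = e.getClass().getSimpleName(); // no interrupt pending, but still closed
        }
        return new String[] {first, second};
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String[] result = readWhileInterrupted();
    System.out.println(result[0] + ", then " + result[1]);
  }
}
```

Because the channel stays closed after the first failure, recovering typically requires reopening the underlying resource, which is why applications that rely on interruption are pointed to a different Directory implementation.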
NOTE: If your application requires external synchronization, you should not synchronize on the MMapDirectory instance as this may cause deadlock; use your own (non-Lucene) objects instead.
-
Field Summary
Fields

| Modifier and Type | Field | Description |
|---|---|---|
| static final BiPredicate<String, IOContext> | ALL_FILES | Argument for setPreload(BiPredicate) that configures all files to be preloaded upon opening them. |
| static final BiPredicate<String, IOContext> | BASED_ON_LOAD_IO_CONTEXT | Argument for setPreload(BiPredicate) that configures files to be preloaded upon opening them if they use the ReadAdvice.RANDOM_PRELOAD advice. |
| static final long | DEFAULT_MAX_CHUNK_SIZE | Default max chunk size: 16 GiBytes for 64 bit JVMs, 256 MiBytes for 32 bit JVMs. |
| static final Function<String, Optional<String>> | GROUP_BY_SEGMENT | Argument for setGroupingFunction(Function) that configures grouping by segment. |
| static final BiPredicate<String, IOContext> | NO_FILES | Argument for setPreload(BiPredicate) that configures no files to be preloaded upon opening them. |
| static final Function<String, Optional<String>> | NO_GROUPING | Argument for setGroupingFunction(Function) that configures no grouping. |
| static final String | SHARED_ARENA_MAX_PERMITS_SYSPROP | This sysprop controls the total maximum number of mmapped files that can be associated with a single shared foreign Arena (PREVIEW). |

Fields inherited from class org.apache.lucene.store.FSDirectory
directory
Fields inherited from class org.apache.lucene.store.BaseDirectory
isOpen, lockFactory
-
Constructor Summary
Constructors

| Constructor | Description |
|---|---|
| MMapDirectory(Path path) | Create a new MMapDirectory for the named location and FSLockFactory.getDefault(). |
| MMapDirectory(Path path, long maxChunkSize) | Create a new MMapDirectory for the named location and FSLockFactory.getDefault(). |
| MMapDirectory(Path path, LockFactory lockFactory) | Create a new MMapDirectory for the named location. |
| MMapDirectory(Path path, LockFactory lockFactory, long maxChunkSize) | Create a new MMapDirectory for the named location, specifying the maximum chunk size used for memory mapping. |

-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| final long | getMaxChunkSize() | Returns the current mmap chunk size. |
| IndexInput | openInput(String name, IOContext context) | Creates an IndexInput for the file with the given name. |
| void | setGroupingFunction(Function<String, Optional<String>> groupingFunction) | Configures a grouping function for files that are part of the same logical group. |
| void | setPreload(BiPredicate<String, IOContext> preload) | Configure which files to preload in physical memory upon opening. |
| static boolean | supportsMadvise() | Returns true, if MMapDirectory uses the platform's madvise() syscall to advise how the OS kernel should handle paging after opening a file. |

Methods inherited from class org.apache.lucene.store.FSDirectory
close, createOutput, createTempOutput, deleteFile, deletePendingFiles, ensureCanRead, fileLength, fsync, getDirectory, getPendingDeletions, listAll, listAll, open, open, rename, sync, syncMetaData, toString
Methods inherited from class org.apache.lucene.store.BaseDirectory
ensureOpen, obtainLock
Methods inherited from class org.apache.lucene.store.Directory
copyFrom, getTempFileName, openChecksumInput
-
Field Details
-
ALL_FILES
Argument for setPreload(BiPredicate) that configures all files to be preloaded upon opening them.
-
NO_FILES
Argument for setPreload(BiPredicate) that configures no files to be preloaded upon opening them.
-
SHARED_ARENA_MAX_PERMITS_SYSPROP
This sysprop controls the total maximum number of mmapped files that can be associated with a single shared foreign Arena (PREVIEW). For example, to set the max number of permits to 256, pass -Dorg.apache.lucene.store.MMapDirectory.sharedArenaMaxPermits=256 on the command line. Setting a value of 1 associates one file to one shared arena.
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
-
NO_GROUPING
Argument for setGroupingFunction(Function) that configures no grouping.
-
GROUP_BY_SEGMENT
Argument for setGroupingFunction(Function) that configures grouping by segment.
-
BASED_ON_LOAD_IO_CONTEXT
Argument for setPreload(BiPredicate) that configures files to be preloaded upon opening them if they use the ReadAdvice.RANDOM_PRELOAD advice.
-
DEFAULT_MAX_CHUNK_SIZE
public static final long DEFAULT_MAX_CHUNK_SIZE
Default max chunk size:
- 16 GiBytes for 64 bit JVMs
- 256 MiBytes for 32 bit JVMs
-
-
Constructor Details
-
MMapDirectory
Create a new MMapDirectory for the named location. The directory is created at the named location if it does not yet exist.
- Parameters:
path - the path of the directory
lockFactory - the lock factory to use
- Throws:
IOException - if there is a low-level I/O error
-
MMapDirectory
Create a new MMapDirectory for the named location and FSLockFactory.getDefault(). The directory is created at the named location if it does not yet exist.
- Parameters:
path - the path of the directory
- Throws:
IOException - if there is a low-level I/O error
-
MMapDirectory
Create a new MMapDirectory for the named location and FSLockFactory.getDefault(). The directory is created at the named location if it does not yet exist.
- Parameters:
path - the path of the directory
maxChunkSize - maximum chunk size (for default see DEFAULT_MAX_CHUNK_SIZE) used for memory mapping.
- Throws:
IOException - if there is a low-level I/O error
-
MMapDirectory
Create a new MMapDirectory for the named location, specifying the maximum chunk size used for memory mapping. The directory is created at the named location if it does not yet exist.
Especially on 32 bit platforms, the address space can be very fragmented, so large index files cannot be mapped. Using a lower chunk size makes the directory implementation a little bit slower (as the correct chunk may have to be resolved on many seeks), but the chance is higher that mmap does not fail. On 64 bit Java platforms, this parameter should always be large (like 1 GiBytes, or even larger with recent Java versions), as the address space is big enough. If it is larger, fragmentation of address space increases, but the number of file handles and mappings is lower for huge installations with many open indexes.
Please note: The chunk size is always rounded down to a power of 2.
- Parameters:
path - the path of the directory
lockFactory - the lock factory to use, or null for the default (NativeFSLockFactory);
maxChunkSize - maximum chunk size (for default see DEFAULT_MAX_CHUNK_SIZE) used for memory mapping.
- Throws:
IOException - if there is a low-level I/O error
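The rounding rule and its effect on the number of mappings can be worked through with a few lines of arithmetic. This is a hedged sketch, not Lucene's code: ChunkSizeSketch and both method names are hypothetical, and the ceiling division used to count chunks is an assumption made for illustration.

```java
// Hypothetical helper, not Lucene code: arithmetic behind power-of-two chunk sizes.
final class ChunkSizeSketch {

  /** Rounds a positive size down to the nearest power of two. */
  static long roundDownToPowerOfTwo(long maxChunkSize) {
    if (maxChunkSize <= 0) {
      throw new IllegalArgumentException("maxChunkSize must be positive");
    }
    return Long.highestOneBit(maxChunkSize);
  }

  /** Number of mappings needed to cover fileLength bytes (ceiling division). */
  static long chunksNeeded(long fileLength, long chunkSize) {
    return (fileLength + chunkSize - 1) / chunkSize;
  }

  public static void main(String[] args) {
    long requested = 1_000_000_000L; // slightly below 1 GiB
    long effective = roundDownToPowerOfTwo(requested);
    System.out.println(effective); // 536870912, i.e. 512 MiB
    System.out.println(chunksNeeded(3L << 30, effective)); // 6 mappings for a 3 GiB file
  }
}
```

Note that a requested size just below a power of two is nearly halved by the rounding, so passing an exact power of two avoids surprises.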
-
-
Method Details
-
setPreload
Configure which files to preload in physical memory upon opening. The default implementation does not preload anything. The behavior is best effort and operating system-dependent.
- Parameters:
preload - a BiPredicate whose first argument is the file name, and second argument is the IOContext used to open the file
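As an illustration of the kind of predicate this method accepts, the sketch below builds a BiPredicate that selects files by extension. It uses only the JDK so that it compiles without Lucene: PreloadByExtensionSketch and byExtension are hypothetical names, the generic parameter C stands in for IOContext, and the extensions shown are example values only.

```java
import java.util.Set;
import java.util.function.BiPredicate;

// Hypothetical helper, not part of Lucene: selects files to preload by extension.
final class PreloadByExtensionSketch {

  /**
   * Builds a predicate over (file name, context) that matches files whose
   * extension is in the given set. The context argument is ignored, so the
   * type parameter C can stand in for the IOContext type.
   */
  static <C> BiPredicate<String, C> byExtension(Set<String> extensions) {
    Set<String> wanted = Set.copyOf(extensions); // defensive, immutable copy
    return (fileName, context) -> {
      int dot = fileName.lastIndexOf('.');
      return dot >= 0 && wanted.contains(fileName.substring(dot + 1));
    };
  }

  public static void main(String[] args) {
    BiPredicate<String, Object> preload = byExtension(Set.of("dvd"));
    System.out.println(preload.test("_0.dvd", null)); // true
    System.out.println(preload.test("segments_1", null)); // false
  }
}
```

With Lucene on the classpath, a predicate built this way could presumably be passed to setPreload directly, since type inference would bind C to IOContext. Preloading only selected files limits the extra cost of opening an index.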
-
setGroupingFunction
Configures a grouping function for files that are part of the same logical group. The gathering of files into a logical group is a hint that allows for better handling of resources.
By default, grouping is GROUP_BY_SEGMENT. To disable, invoke this method with NO_GROUPING.
- Parameters:
groupingFunction - a function that accepts a file name and returns an optional group key. If the optional is present, then its value is the logical group to which the file belongs. Otherwise, the file name is not associated with any logical group.
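The shape of a grouping function can be sketched with plain JDK types. The example below is a simplified, hypothetical function and not the implementation behind GROUP_BY_SEGMENT: it assumes segment files are named with a leading underscore, a segment name, and then a '.' or '_' followed by a suffix (for example "_3.cfs"), and it returns an empty Optional for anything else.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, simplified grouping function; the built-in GROUP_BY_SEGMENT may behave differently.
final class GroupingSketch {

  /**
   * Assumes segment files look like "_<segment>" followed by '.' or '_' and a suffix,
   * e.g. "_3.cfs" or "_3_1.liv". Any other name gets no group.
   */
  static final Function<String, Optional<String>> BY_SEGMENT_PREFIX =
      fileName -> {
        if (fileName.length() < 2 || fileName.charAt(0) != '_') {
          return Optional.empty();
        }
        int end = 1;
        while (end < fileName.length()
            && fileName.charAt(end) != '.'
            && fileName.charAt(end) != '_') {
          end++;
        }
        // Require a non-empty segment name between the underscore and the suffix.
        return end > 1 ? Optional.of(fileName.substring(1, end)) : Optional.empty();
      };

  public static void main(String[] args) {
    System.out.println(BY_SEGMENT_PREFIX.apply("_3.cfs")); // Optional[3]
    System.out.println(BY_SEGMENT_PREFIX.apply("write.lock")); // Optional.empty
  }
}
```

Files that map to the same key are treated as one logical group, while an empty Optional leaves the file ungrouped.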
-
getMaxChunkSize
public final long getMaxChunkSize()
Returns the current mmap chunk size.
-
openInput
Creates an IndexInput for the file with the given name.
- Specified by:
openInput in class Directory
- Parameters:
name - the name of an existing file.
- Throws:
IOException - in case of I/O error
-
supportsMadvise
public static boolean supportsMadvise()
Returns true, if MMapDirectory uses the platform's madvise() syscall to advise how the OS kernel should handle paging after opening a file.
-