Class MMapDirectory

All Implemented Interfaces:
Closeable, AutoCloseable

public class MMapDirectory extends FSDirectory
File-based Directory implementation that uses mmap for reading, and FSDirectory.FSIndexOutput for writing.

NOTE: memory mapping uses up a portion of the virtual memory address space in your process equal to the size of the file being mapped. Before using this class, be sure you have plenty of virtual address space, e.g. by using a 64-bit JRE, or a 32-bit JRE with indexes that are guaranteed to fit within the address space. On 32-bit platforms, also consult MMapDirectory(Path, LockFactory, long) if you have problems with mmap failing because of fragmented address space. If you get an IOException reporting that mapping failed, reduce the chunk size until it works.

This class supports preloading files into physical memory upon opening. This can help improve performance of searches on a cold page cache at the expense of slowing down opening an index. See setPreload(BiPredicate) for more details.
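As a rough illustration of what a preload predicate might look like, the sketch below builds one that selects files by extension. The class `PreloadPredicates`, the method `byExtension`, and the chosen extensions are hypothetical and not part of Lucene; the second type parameter is left generic so the example compiles without Lucene on the classpath.

```java
import java.util.Set;
import java.util.function.BiPredicate;

/** Hypothetical helper (not part of Lucene): builds a file-name based preload predicate. */
final class PreloadPredicates {
  private PreloadPredicates() {}

  /**
   * Returns a predicate that accepts files whose extension is in {@code extensions}.
   * The second argument (the IOContext when used with setPreload) is ignored here,
   * so its type is left generic as an assumption of this sketch.
   */
  static <C> BiPredicate<String, C> byExtension(Set<String> extensions) {
    Set<String> accepted = Set.copyOf(extensions);
    return (fileName, context) -> {
      int dot = fileName.lastIndexOf('.');
      // Files without an extension (for example "segments_1") are never preloaded.
      return dot >= 0 && accepted.contains(fileName.substring(dot + 1));
    };
  }

  public static void main(String[] args) {
    // The extensions below are only an illustrative choice.
    BiPredicate<String, Object> preload = byExtension(Set.of("dvd", "tim"));
    System.out.println(preload.test("_0.dvd", null));
    System.out.println(preload.test("_0.fdt", null));
  }
}
```

With Lucene on the classpath, a predicate built this way could presumably be passed to setPreload(BiPredicate), with the generic parameter inferred as IOContext. Since preloading slows down opening, restricting it to a few file types is one way to limit that cost.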

This class supports grouping of files that are part of the same logical group. This is a hint that allows for better handling of resources. For example, individual files that are part of the same segment can be considered part of the same logical group. See setGroupingFunction(Function) for more details.

This class uses the modern MemorySegment API, available since Java 21, which allows previously mmapped files to be safely unmapped after the IndexInputs are closed. There is no need to enable the "preview feature" of your Java version; it works out of the box thanks to some compilation tricks. For more information about the foreign memory API, read the documentation of the java.lang.foreign package.

On some platforms, such as Linux and MacOS X, this class invokes the madvise() syscall to advise the OS kernel how to handle paging after a file is opened. For this to work, Java code must be able to call native code. If this is not allowed, a warning is logged. To enable native access for Lucene in a modularized application, pass --enable-native-access=org.apache.lucene.core on the Java command line. If Lucene is running in a classpath-based application, use --enable-native-access=ALL-UNNAMED.

NOTE: Accessing this class, directly or indirectly, from a thread while it is interrupted can close the underlying channel immediately if the thread is blocked on IO at the same time. The channel will remain closed, and subsequent access to MMapDirectory will throw a ClosedChannelException. If your application uses either Thread.interrupt() or Future.cancel(boolean), you should use the legacy RAFDirectory from the Lucene misc module instead of MMapDirectory.
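The JDK behavior behind this note can be reproduced without Lucene. The sketch below is an illustration only (the class and method names are hypothetical): it sets the interrupt flag before a read on a plain FileChannel, which is enough for the JDK to close the channel and throw ClosedByInterruptException. It does not exercise MMapDirectory itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo class: shows how an interrupt closes an interruptible channel. */
final class InterruptClosesChannel {
  private InterruptClosesChannel() {}

  /** Returns true if a read attempted with the interrupt flag set closed the channel. */
  static boolean readWhileInterrupted() {
    try {
      Path file = Files.createTempFile("interrupt-demo", ".bin");
      try {
        Files.write(file, new byte[] {1, 2, 3, 4});
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
          // Simulate an interrupt arriving on the thread that performs the IO.
          Thread.currentThread().interrupt();
          try {
            channel.read(ByteBuffer.allocate(4));
            return false; // not expected: the read should have been rejected
          } catch (ClosedByInterruptException expected) {
            // The channel is now closed for every user, not just this thread.
            return !channel.isOpen();
          } finally {
            Thread.interrupted(); // clear the flag so later code is unaffected
          }
        }
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("channel closed by interrupt: " + readWhileInterrupted());
  }
}
```

Because the channel stays closed after the interrupt, every later reader of the same channel fails as well, which is why the note recommends a different Directory implementation for applications that interrupt threads.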

NOTE: If your application requires external synchronization, you should not synchronize on the MMapDirectory instance as this may cause deadlock; use your own (non-Lucene) objects instead.

  • Field Details

  • Constructor Details

    • MMapDirectory

      public MMapDirectory(Path path, LockFactory lockFactory) throws IOException
      Create a new MMapDirectory for the named location. The directory is created at the named location if it does not yet exist.
      Parameters:
      path - the path of the directory
      lockFactory - the lock factory to use
      Throws:
      IOException - if there is a low-level I/O error
    • MMapDirectory

      public MMapDirectory(Path path) throws IOException
      Create a new MMapDirectory for the named location and FSLockFactory.getDefault(). The directory is created at the named location if it does not yet exist.
      Parameters:
      path - the path of the directory
      Throws:
      IOException - if there is a low-level I/O error
    • MMapDirectory

      public MMapDirectory(Path path, long maxChunkSize) throws IOException
      Create a new MMapDirectory for the named location and FSLockFactory.getDefault(). The directory is created at the named location if it does not yet exist.
      Parameters:
      path - the path of the directory
      maxChunkSize - maximum chunk size (for default see DEFAULT_MAX_CHUNK_SIZE) used for memory mapping.
      Throws:
      IOException - if there is a low-level I/O error
    • MMapDirectory

      public MMapDirectory(Path path, LockFactory lockFactory, long maxChunkSize) throws IOException
      Create a new MMapDirectory for the named location, specifying the maximum chunk size used for memory mapping. The directory is created at the named location if it does not yet exist.

      Especially on 32-bit platforms, the address space can be very fragmented, so large index files cannot be mapped. Using a lower chunk size makes the directory implementation slightly slower (because the correct chunk may need to be resolved on many seeks), but the chance that mmap succeeds is higher. On 64-bit Java platforms, this parameter should always be large (such as 1 GiB, or even larger with recent Java versions), as the address space is big enough. A larger chunk size increases fragmentation of the address space, but reduces the number of file handles and mappings for huge installations with many open indexes.

      Please note: The chunk size is always rounded down to a power of 2.

      Parameters:
      path - the path of the directory
      lockFactory - the lock factory to use, or null for the default (NativeFSLockFactory)
      maxChunkSize - maximum chunk size (for default see DEFAULT_MAX_CHUNK_SIZE) used for memory mapping.
      Throws:
      IOException - if there is a low-level I/O error
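To make the chunk-size arithmetic concrete, here is a minimal sketch of how a requested maximum chunk size could be rounded down to a power of 2, and how a file position could then be resolved to a chunk index and an offset within that chunk. The class `ChunkMath` and its methods are hypothetical and are not taken from the Lucene source.

```java
/** Hypothetical helper (not Lucene code): arithmetic for power-of-2 sized chunks. */
final class ChunkMath {
  private ChunkMath() {}

  /** Largest power of 2 that is less than or equal to maxChunkSize. */
  static long roundDownToPowerOfTwo(long maxChunkSize) {
    if (maxChunkSize <= 0) {
      throw new IllegalArgumentException("maxChunkSize must be positive: " + maxChunkSize);
    }
    return Long.highestOneBit(maxChunkSize);
  }

  /** Index of the chunk containing position pos; chunkSize must be a power of 2. */
  static long chunkIndex(long pos, long chunkSize) {
    return pos >>> Long.numberOfTrailingZeros(chunkSize);
  }

  /** Offset of position pos inside its chunk; chunkSize must be a power of 2. */
  static long chunkOffset(long pos, long chunkSize) {
    return pos & (chunkSize - 1);
  }

  /** Number of chunks needed to cover a file of the given length. */
  static long chunkCount(long fileLength, long chunkSize) {
    return (fileLength + chunkSize - 1) >>> Long.numberOfTrailingZeros(chunkSize);
  }

  public static void main(String[] args) {
    long chunkSize = roundDownToPowerOfTwo(1000); // 512
    System.out.println("chunk size:  " + chunkSize);
    System.out.println("chunk index: " + chunkIndex(1500, chunkSize));  // 2
    System.out.println("offset:      " + chunkOffset(1500, chunkSize)); // 476
    System.out.println("chunks:      " + chunkCount(1500, chunkSize));  // 3
  }
}
```

A power-of-2 chunk size lets the index and offset be computed with a shift and a mask instead of a division, which keeps the per-seek cost low even when a file spans many chunks.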
  • Method Details

    • setPreload

      public void setPreload(BiPredicate<String,IOContext> preload)
      Configure which files to preload in physical memory upon opening. The default implementation does not preload anything. The behavior is best effort and operating system-dependent.
      Parameters:
      preload - a BiPredicate whose first argument is the file name, and second argument is the IOContext used to open the file
    • setGroupingFunction

      public void setGroupingFunction(Function<String,Optional<String>> groupingFunction)
      Configures a grouping function for files that are part of the same logical group. The gathering of files into a logical group is a hint that allows for better handling of resources.

      By default, grouping is GROUP_BY_SEGMENT. To disable, invoke this method with NO_GROUPING.

      Parameters:
      groupingFunction - a function that accepts a file name and returns an optional group key. If the optional is present, then its value is the logical group to which the file belongs. Otherwise, the file name is not associated with any logical group.
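A grouping function has the plain JDK type Function<String,Optional<String>>, so one can be sketched without any Lucene imports. The example below is a hypothetical grouping that uses the leading segment prefix of a file name as the group key. It is not the implementation of GROUP_BY_SEGMENT, and the file names used are assumed examples.

```java
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical example (not Lucene's GROUP_BY_SEGMENT): groups files by a "_xyz" prefix. */
final class GroupingExample {
  private GroupingExample() {}

  /**
   * Maps names such as "_3.cfs" or "_3_1.liv" to the group key "_3".
   * Names that do not start with '_' followed by letters or digits get no group.
   */
  static final Function<String, Optional<String>> BY_SEGMENT_PREFIX =
      fileName -> {
        if (!fileName.startsWith("_")) {
          return Optional.empty();
        }
        int end = 1;
        while (end < fileName.length() && Character.isLetterOrDigit(fileName.charAt(end))) {
          end++;
        }
        // A bare "_" with nothing after it is not treated as a segment prefix.
        return end > 1 ? Optional.of(fileName.substring(0, end)) : Optional.empty();
      };

  public static void main(String[] args) {
    System.out.println(BY_SEGMENT_PREFIX.apply("_3.cfs"));
    System.out.println(BY_SEGMENT_PREFIX.apply("segments_1"));
  }
}
```

With Lucene on the classpath, such a function could presumably be passed to setGroupingFunction(Function). Returning an empty Optional leaves a file outside any logical group.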
    • getMaxChunkSize

      public final long getMaxChunkSize()
      Returns the current mmap chunk size.
    • openInput

      public IndexInput openInput(String name, IOContext context) throws IOException
      Creates an IndexInput for the file with the given name.
      Specified by:
      openInput in class Directory
      Parameters:
      name - the name of an existing file.
      Throws:
      IOException - in case of I/O error
    • supportsMadvise

      public static boolean supportsMadvise()
      Returns true if MMapDirectory uses the platform's madvise() syscall to advise the OS kernel how to handle paging after opening a file.