Class LRUQueryCache

java.lang.Object
org.apache.lucene.search.LRUQueryCache
All Implemented Interfaces:
QueryCache, Accountable

public class LRUQueryCache extends Object implements QueryCache, Accountable
A QueryCache that evicts queries using an LRU (least-recently-used) eviction policy in order to remain under a given maximum size and number of bytes used.

This class is thread-safe.

Note that query eviction runs in linear time with the total number of segments that have cache entries, so this cache works best with caching policies that only cache on "large" segments, and it is advised not to share this cache across too many indices.

A default query cache and policy instance is used in IndexSearcher. If you want to replace those defaults, it is typically done like this:

   final int maxNumberOfCachedQueries = 256;
   final long maxRamBytesUsed = 50 * 1024L * 1024L; // 50MB
   // these cache and policy instances can be shared across several queries and readers
   // it is fine to eg. store them into static variables
   final QueryCache queryCache = new LRUQueryCache(maxNumberOfCachedQueries, maxRamBytesUsed);
   final QueryCachingPolicy defaultCachingPolicy = new UsageTrackingQueryCachingPolicy();
   indexSearcher.setQueryCache(queryCache);
   indexSearcher.setQueryCachingPolicy(defaultCachingPolicy);
 
This cache exposes some global statistics (hit count, miss count, number of cache entries, total number of DocIdSets that have ever been cached, number of evicted entries). If you would like more fine-grained statistics, such as per-index or per-query-class statistics, it is possible to override various callbacks: onHit(java.lang.Object, org.apache.lucene.search.Query), onMiss(java.lang.Object, org.apache.lucene.search.Query), onQueryCache(org.apache.lucene.search.Query, long), onQueryEviction(org.apache.lucene.search.Query, long), onDocIdSetCache(java.lang.Object, long), onDocIdSetEviction(java.lang.Object, int, long) and onClear(). It is better not to perform heavy computations in these methods, though, since they are called synchronously and under a lock.
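For example, hits and misses could be counted per concrete Query class by overriding onHit and onMiss in a subclass. A minimal sketch (the counter maps are illustrative additions, not part of the Lucene API; the work stays cheap because the callbacks run synchronously under the cache's lock):

   final Map<Class<?>, LongAdder> hitsPerClass = new ConcurrentHashMap<>();
   final Map<Class<?>, LongAdder> missesPerClass = new ConcurrentHashMap<>();
   final QueryCache cache = new LRUQueryCache(256, 50 * 1024L * 1024L) {
     @Override
     protected void onHit(Object readerCoreKey, Query query) {
       super.onHit(readerCoreKey, query); // keep the global counters in sync
       hitsPerClass.computeIfAbsent(query.getClass(), k -> new LongAdder()).increment();
     }

     @Override
     protected void onMiss(Object readerCoreKey, Query query) {
       super.onMiss(readerCoreKey, query);
       missesPerClass.computeIfAbsent(query.getClass(), k -> new LongAdder()).increment();
     }
   };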
WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    protected static class
    LRUQueryCache.CacheAndCount
    Cache of doc ids with a count.
  • Field Summary

    Fields inherited from interface org.apache.lucene.util.Accountable

    NULL_ACCOUNTABLE
  • Constructor Summary

    Constructors
    Constructor
    Description
    LRUQueryCache(int maxSize, long maxRamBytesUsed)
    Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory.
    LRUQueryCache(int maxSize, long maxRamBytesUsed, Predicate<LeafReaderContext> leavesToCache, float skipCacheFactor)
    Expert: Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory, only on leaves that satisfy leavesToCache.
  • Method Summary

    Modifier and Type
    Method
    Description
    protected LRUQueryCache.CacheAndCount
    cacheImpl(BulkScorer scorer, int maxDoc)
    Default cache implementation: uses RoaringDocIdSet for sets that have a density < 1% and a BitDocIdSet over a FixedBitSet otherwise.
    void
    clear()
    Clear the content of this cache.
    void
    clearCoreCacheKey(Object coreKey)
    Remove all cache entries for the given core cache key.
    void
    clearQuery(Query query)
    Remove all cache entries for the given query.
    Weight
    doCache(Weight weight, QueryCachingPolicy policy)
    Return a wrapper around the provided weight that will cache matching docs per-segment according to the given policy.
    final long
    getCacheCount()
    Return the total number of cache entries that have been generated and put in the cache.
    final long
    getCacheSize()
    Return the total number of DocIdSets which are currently stored in the cache.
    Collection<Accountable>
    getChildResources()
    Returns nested resources of this class.
    final long
    getEvictionCount()
    Return the number of cache entries that have been removed from the cache, either in order to stay under the maximum configured size/ram usage or because a segment has been closed.
    final long
    getHitCount()
    Over the total number of times that a query has been looked up, return how many times a cached DocIdSet has been found and returned.
    final long
    getMissCount()
    Over the total number of times that a query has been looked up, return how many times this query was not contained in the cache.
    final long
    getTotalCount()
    Return the total number of times that a Query has been looked up in this QueryCache.
    protected void
    onClear()
    Expert: callback when the cache is completely cleared.
    protected void
    onDocIdSetCache(Object readerCoreKey, long ramBytesUsed)
    Expert: callback when a DocIdSet is added to this cache.
    protected void
    onDocIdSetEviction(Object readerCoreKey, int numEntries, long sumRamBytesUsed)
    Expert: callback when one or more DocIdSets are removed from this cache.
    protected void
    onHit(Object readerCoreKey, Query query)
    Expert: callback when there is a cache hit on a given query.
    protected void
    onMiss(Object readerCoreKey, Query query)
    Expert: callback when there is a cache miss on a given query.
    protected void
    onQueryCache(Query query, long ramBytesUsed)
    Expert: callback when a query is added to this cache.
    protected void
    onQueryEviction(Query query, long ramBytesUsed)
    Expert: callback when a query is evicted from this cache.
    long
    ramBytesUsed()
    Return the memory usage of this object in bytes.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • LRUQueryCache

      public LRUQueryCache(int maxSize, long maxRamBytesUsed, Predicate<LeafReaderContext> leavesToCache, float skipCacheFactor)
      Expert: Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory, only on leaves that satisfy leavesToCache.

      Also, clauses whose cost is more than skipCacheFactor times the cost of the top-level query will not be cached, in order not to slow queries down too much.
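
      For example (a sketch; the thresholds are illustrative, not recommendations):

        // Cache at most 1000 queries using at most 100MB, only on leaves with
        // at least 10k documents, and skip caching clauses that cost more than
        // 250x their top-level query.
        final QueryCache cache = new LRUQueryCache(
            1000,
            100 * 1024L * 1024L,
            ctx -> ctx.reader().maxDoc() >= 10_000,
            250f);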

    • LRUQueryCache

      public LRUQueryCache(int maxSize, long maxRamBytesUsed)
      Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory. Queries will only be cached on leaves that have more than 10k documents and hold more than 3% of the total number of documents in the index. This should guarantee that all leaves from the upper tier will be cached while ensuring that at most 33 leaves can make it into the cache (very likely fewer than 10 in practice), which is useful for this implementation since some operations perform in linear time with the number of cached leaves. Only clauses whose cost is at most 100x the cost of the top-level query will be cached, in order not to hurt latency too much because of caching.
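
      For reference, a predicate matching the documented heuristic might look like this sketch (an approximation for illustration, not the actual Lucene implementation):

        // Cache only on leaves that have more than 10k documents and hold
        // more than 3% of the documents in the index.
        final Predicate<LeafReaderContext> leavesToCache = ctx -> {
          final int maxDoc = ctx.reader().maxDoc();
          if (maxDoc <= 10_000) {
            return false;
          }
          final int totalDocs = ReaderUtil.getTopLevelContext(ctx).reader().maxDoc();
          return (float) maxDoc / totalDocs > 0.03f;
        };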
  • Method Details

    • onHit

      protected void onHit(Object readerCoreKey, Query query)
      Expert: callback when there is a cache hit on a given query. Implementing this method is typically useful in order to compute more fine-grained statistics about the query cache.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • onMiss

      protected void onMiss(Object readerCoreKey, Query query)
      Expert: callback when there is a cache miss on a given query.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • onQueryCache

      protected void onQueryCache(Query query, long ramBytesUsed)
      Expert: callback when a query is added to this cache. Implementing this method is typically useful in order to compute more fine-grained statistics about the query cache.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • onQueryEviction

      protected void onQueryEviction(Query query, long ramBytesUsed)
      Expert: callback when a query is evicted from this cache.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • onDocIdSetCache

      protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed)
      Expert: callback when a DocIdSet is added to this cache. Implementing this method is typically useful in order to compute more fine-grained statistics about the query cache.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • onDocIdSetEviction

      protected void onDocIdSetEviction(Object readerCoreKey, int numEntries, long sumRamBytesUsed)
      Expert: callback when one or more DocIdSets are removed from this cache.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • onClear

      protected void onClear()
      Expert: callback when the cache is completely cleared.
      WARNING: This API is experimental and might change in incompatible ways in the next release.
    • clearCoreCacheKey

      public void clearCoreCacheKey(Object coreKey)
      Remove all cache entries for the given core cache key.
    • clearQuery

      public void clearQuery(Query query)
      Remove all cache entries for the given query.
    • clear

      public void clear()
      Clear the content of this cache.
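
      For example (a hedged sketch; cache is assumed to be the LRUQueryCache instance configured earlier, and the query and leafReader are hypothetical):

        cache.clearQuery(new TermQuery(new Term("color", "red")));          // one query
        cache.clearCoreCacheKey(leafReader.getCoreCacheHelper().getKey());  // one segment
        cache.clear();                                                      // everything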
    • doCache

      public Weight doCache(Weight weight, QueryCachingPolicy policy)
      Description copied from interface: QueryCache
      Return a wrapper around the provided weight that will cache matching docs per-segment according to the given policy. NOTE: The returned weight will only be equivalent if scores are not needed.
      Specified by:
      doCache in interface QueryCache
    • ramBytesUsed

      public long ramBytesUsed()
      Description copied from interface: Accountable
      Return the memory usage of this object in bytes. Negative values are illegal.
      Specified by:
      ramBytesUsed in interface Accountable
    • getChildResources

      public Collection<Accountable> getChildResources()
      Description copied from interface: Accountable
      Returns nested resources of this class. The result should be a point-in-time snapshot (to avoid race conditions).
      Specified by:
      getChildResources in interface Accountable
    • cacheImpl

      protected LRUQueryCache.CacheAndCount cacheImpl(BulkScorer scorer, int maxDoc) throws IOException
      Default cache implementation: uses RoaringDocIdSet for sets that have a density < 1% and a BitDocIdSet over a FixedBitSet otherwise.
      Throws:
      IOException
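
      The density rule above could be pictured with a sketch along these lines (illustrative only, not the actual implementation; it consumes a plain DocIdSetIterator rather than a BulkScorer):

        // Sparse matches (< 1% of maxDoc) go into a RoaringDocIdSet, dense
        // matches into a BitDocIdSet backed by a FixedBitSet.
        static DocIdSet toCachedSet(DocIdSetIterator it, int maxDoc) throws IOException {
          int[] docs = new int[64];
          int count = 0;
          for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            if (count == docs.length) {
              docs = ArrayUtil.grow(docs);
            }
            docs[count++] = doc;
          }
          if ((float) count / maxDoc < 0.01f) {
            RoaringDocIdSet.Builder builder = new RoaringDocIdSet.Builder(maxDoc);
            for (int i = 0; i < count; i++) {
              builder.add(docs[i]);
            }
            return builder.build();
          }
          FixedBitSet bits = new FixedBitSet(maxDoc);
          for (int i = 0; i < count; i++) {
            bits.set(docs[i]);
          }
          return new BitDocIdSet(bits, count);
        }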
    • getTotalCount

      public final long getTotalCount()
      Return the total number of times that a Query has been looked up in this QueryCache. Note that this number is incremented once per segment so running a cached query only once will increment this counter by the number of segments that are wrapped by the searcher. Note that by definition, getTotalCount() is the sum of getHitCount() and getMissCount().
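
      For example, a hit rate can be derived from these counters (a sketch; note the guard for the case where nothing has been looked up yet):

        final LRUQueryCache cache = new LRUQueryCache(256, 50 * 1024L * 1024L);
        // ... run some searches ...
        final long total = cache.getTotalCount(); // == getHitCount() + getMissCount()
        final double hitRate = total == 0 ? 0.0 : (double) cache.getHitCount() / total;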
    • getHitCount

      public final long getHitCount()
      Over the total number of times that a query has been looked up, return how many times a cached DocIdSet has been found and returned.
    • getMissCount

      public final long getMissCount()
      Over the total number of times that a query has been looked up, return how many times this query was not contained in the cache.
    • getCacheSize

      public final long getCacheSize()
      Return the total number of DocIdSets which are currently stored in the cache.
    • getCacheCount

      public final long getCacheCount()
      Return the total number of cache entries that have been generated and put in the cache. It is highly desirable to have a hit count that is much higher than the cache count, as the opposite would indicate that the query cache spends effort caching queries that then never get reused.
    • getEvictionCount

      public final long getEvictionCount()
      Return the number of cache entries that have been removed from the cache, either in order to stay under the maximum configured size/ram usage or because a segment has been closed. High numbers of evictions might mean that queries are not reused, or that the caching policy caches too aggressively on NRT segments which get merged early.