Class DirectoryTaxonomyReader
- java.lang.Object
-
- org.apache.lucene.facet.taxonomy.TaxonomyReader
-
- org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader
-
- All Implemented Interfaces:
Closeable, AutoCloseable, Accountable
public class DirectoryTaxonomyReader extends TaxonomyReader implements Accountable
A TaxonomyReader which retrieves stored taxonomy information from a Directory.
Reading from the on-disk index on every method call is too slow, so this implementation employs caching: some methods cache recent requests and their results, while other methods prefetch all the data into memory and then provide answers directly from in-memory tables. See the documentation of individual methods for comments on their performance.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.facet.taxonomy.TaxonomyReader
TaxonomyReader.ChildrenIterator
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.facet.taxonomy.TaxonomyReader
INVALID_ORDINAL, ROOT_ORDINAL
-
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
-
Constructor Summary
Constructors
DirectoryTaxonomyReader(DirectoryTaxonomyWriter taxoWriter)
Opens a DirectoryTaxonomyReader over the given DirectoryTaxonomyWriter (for NRT).
protected DirectoryTaxonomyReader(DirectoryReader indexReader, DirectoryTaxonomyWriter taxoWriter, LRUHashMap<FacetLabel,Integer> ordinalCache, LRUHashMap<Integer,FacetLabel> categoryCache, org.apache.lucene.facet.taxonomy.directory.TaxonomyIndexArrays taxoArrays)
Expert: Use this method to explicitly force the DirectoryTaxonomyReader to use specific parent/children arrays and caches.
DirectoryTaxonomyReader(Directory directory)
Open for reading a taxonomy stored in a given Directory.
-
Method Summary
Methods
protected void doClose()
Performs the actual task of closing the resources that are used by the taxonomy reader.
protected DirectoryTaxonomyReader doOpenIfChanged()
Implements the opening of a new DirectoryTaxonomyReader instance if the taxonomy has changed.
FacetLabel[] getBulkPath(int... ordinals)
Returns an array of FacetLabels for a given array of ordinals.
Collection<Accountable> getChildResources()
Map<String,String> getCommitUserData()
Retrieve user committed data.
protected DirectoryReader getInternalIndexReader()
Expert: returns the underlying DirectoryReader instance that is used by this TaxonomyReader.
int getOrdinal(FacetLabel cp)
Returns the ordinal of the category given as a path.
ParallelTaxonomyArrays getParallelTaxonomyArrays()
Returns a ParallelTaxonomyArrays object which can be used to efficiently traverse the taxonomy tree.
FacetLabel getPath(int ordinal)
Returns the path name of the category with the given ordinal.
int getSize()
Returns the number of categories in the taxonomy.
protected DirectoryReader openIndexReader(IndexWriter writer)
Open the DirectoryReader from this IndexWriter.
protected DirectoryReader openIndexReader(Directory directory)
Open the DirectoryReader from this Directory.
long ramBytesUsed()
void setCacheSize(int size)
setCacheSize controls the maximum allowed size of each of the caches used by getPath(int) and getOrdinal(FacetLabel).
String toString(int max)
Returns ordinal -> label mapping, up to the provided max ordinal or number of ordinals, whichever is smaller.
-
Methods inherited from class org.apache.lucene.facet.taxonomy.TaxonomyReader
close, decRef, ensureOpen, getChildren, getOrdinal, getRefCount, incRef, openIfChanged, tryIncRef
-
-
-
-
Constructor Detail
-
DirectoryTaxonomyReader
protected DirectoryTaxonomyReader(DirectoryReader indexReader, DirectoryTaxonomyWriter taxoWriter, LRUHashMap<FacetLabel,Integer> ordinalCache, LRUHashMap<Integer,FacetLabel> categoryCache, org.apache.lucene.facet.taxonomy.directory.TaxonomyIndexArrays taxoArrays) throws IOException
Expert: Use this method to explicitly force the DirectoryTaxonomyReader to use specific parent/children arrays and caches.
Called from doOpenIfChanged(). If the taxonomy has been recreated, you should pass null as the caches and parent/children arrays.
- Parameters:
indexReader - An indexReader that is opened in the desired Directory
taxoWriter - The DirectoryTaxonomyWriter from which to obtain newly added categories, in real-time.
ordinalCache - a FacetLabel to Integer ordinal mapping if it already exists
categoryCache - an ordinal to FacetLabel mapping if it already exists
taxoArrays - taxonomy arrays that store the parent, siblings, children information
- Throws:
IOException
-
DirectoryTaxonomyReader
public DirectoryTaxonomyReader(Directory directory) throws IOException
Open for reading a taxonomy stored in a given Directory.
- Parameters:
directory - The Directory in which the taxonomy resides.
- Throws:
CorruptIndexException - if the Taxonomy is corrupt.
IOException - if another error occurred.
-
DirectoryTaxonomyReader
public DirectoryTaxonomyReader(DirectoryTaxonomyWriter taxoWriter) throws IOException
Opens a DirectoryTaxonomyReader over the given DirectoryTaxonomyWriter (for NRT).
- Parameters:
taxoWriter - The DirectoryTaxonomyWriter from which to obtain newly added categories, in real-time.
- Throws:
IOException
-
-
Method Detail
-
doClose
protected void doClose() throws IOException
Description copied from class: TaxonomyReader
Performs the actual task of closing the resources that are used by the taxonomy reader.
- Specified by:
doClose in class TaxonomyReader
- Throws:
IOException
-
doOpenIfChanged
protected DirectoryTaxonomyReader doOpenIfChanged() throws IOException
Implements the opening of a new DirectoryTaxonomyReader instance if the taxonomy has changed.
NOTE: the returned DirectoryTaxonomyReader shares the ordinal and category caches with this reader. This is not expected to cause any issues, unless the two instances continue to live. The reader guarantees that the two instances cannot affect each other in terms of correctness of the caches; however, if the size of the cache is changed through setCacheSize(int), it will affect both reader instances.
- Specified by:
doOpenIfChanged in class TaxonomyReader
- Throws:
IOException
- See Also:
TaxonomyReader.openIfChanged(TaxonomyReader)
-
openIndexReader
protected DirectoryReader openIndexReader(Directory directory) throws IOException
Open the DirectoryReader from this Directory.
- Throws:
IOException
-
openIndexReader
protected DirectoryReader openIndexReader(IndexWriter writer) throws IOException
Open the DirectoryReader from this IndexWriter.
- Throws:
IOException
-
getInternalIndexReader
protected DirectoryReader getInternalIndexReader()
Expert: returns the underlying DirectoryReader instance that is used by this TaxonomyReader.
-
getParallelTaxonomyArrays
public ParallelTaxonomyArrays getParallelTaxonomyArrays() throws IOException
Description copied from class: TaxonomyReader
Returns a ParallelTaxonomyArrays object which can be used to efficiently traverse the taxonomy tree.
- Specified by:
getParallelTaxonomyArrays in class TaxonomyReader
- Throws:
IOException
-
getCommitUserData
public Map<String,String> getCommitUserData() throws IOException
Description copied from class: TaxonomyReader
Retrieve user committed data.
- Specified by:
getCommitUserData in class TaxonomyReader
- Throws:
IOException
- See Also:
TaxonomyWriter.setLiveCommitData(Iterable)
-
getOrdinal
public int getOrdinal(FacetLabel cp) throws IOException
Description copied from class: TaxonomyReader
Returns the ordinal of the category given as a path. The ordinal is the category's serial number, an integer which starts with 0 and grows as more categories are added (note that once a category is added, it can never be deleted).
- Specified by:
getOrdinal in class TaxonomyReader
- Returns:
the category's ordinal or TaxonomyReader.INVALID_ORDINAL if the category wasn't found.
- Throws:
IOException
-
getPath
public FacetLabel getPath(int ordinal) throws IOException
Description copied from class: TaxonomyReader
Returns the path name of the category with the given ordinal.
- Specified by:
getPath in class TaxonomyReader
- Throws:
IOException
-
getBulkPath
public FacetLabel[] getBulkPath(int... ordinals) throws IOException
Returns an array of FacetLabels for a given array of ordinals.
This API is generally faster than iteratively calling getPath(int) over an array of ordinals. It uses the getPath(int) method iteratively when it detects that the index was created using StoredFields (with no performance gains) and uses DocValues based iteration when the index is based on BinaryDocValues. Lucene switched to BinaryDocValues in version 9.0.
- Overrides:
getBulkPath in class TaxonomyReader
- Parameters:
ordinals - Array of ordinals that are assigned to categories inserted into the taxonomy index
- Throws:
IOException
-
getSize
public int getSize()
Description copied from class: TaxonomyReader
Returns the number of categories in the taxonomy. Note that the number of categories returned is often slightly higher than the number of categories inserted into the taxonomy; this is because when a category is added to the taxonomy, its ancestors are also added automatically (including the root, which always gets ordinal 0).
- Specified by:
getSize in class TaxonomyReader
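The ordinal rules described for getOrdinal, getPath and getSize can be illustrated with a small stand-alone model. The sketch below is not Lucene code: ToyTaxonomy, its methods, the '/' path separator and the value chosen for INVALID_ORDINAL are all hypothetical, and it uses only java.util collections. It assumes only what the descriptions above state: the root takes ordinal 0, ordinals grow in insertion order, and adding a category also adds any missing ancestors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the ordinal rules; not Lucene code. */
class ToyTaxonomy {
    /** Plays the role of TaxonomyReader.INVALID_ORDINAL; the value -1 is this sketch's choice. */
    static final int INVALID_ORDINAL = -1;

    private final Map<String, Integer> ordinalByPath = new HashMap<>();
    private final List<String> pathByOrdinal = new ArrayList<>();

    ToyTaxonomy() {
        register(""); // the root (empty path) always takes ordinal 0
    }

    private int register(String path) {
        int ordinal = pathByOrdinal.size(); // next serial number
        pathByOrdinal.add(path);
        ordinalByPath.put(path, ordinal);
        return ordinal;
    }

    /** Adds a category plus any missing ancestors; returns the category's ordinal. */
    int addCategory(String... components) {
        int ordinal = 0; // an empty path denotes the root
        StringBuilder path = new StringBuilder();
        for (String component : components) {
            if (path.length() > 0) {
                path.append('/');
            }
            path.append(component);
            Integer existing = ordinalByPath.get(path.toString());
            ordinal = (existing != null) ? existing : register(path.toString());
        }
        return ordinal;
    }

    /** Returns the ordinal for a path, or INVALID_ORDINAL if it was never added. */
    int getOrdinal(String... components) {
        return ordinalByPath.getOrDefault(String.join("/", components), INVALID_ORDINAL);
    }

    String getPath(int ordinal) {
        return pathByOrdinal.get(ordinal);
    }

    int getSize() {
        return pathByOrdinal.size();
    }

    public static void main(String[] args) {
        ToyTaxonomy taxo = new ToyTaxonomy();
        taxo.addCategory("Author", "Mark Twain");   // also adds "Author"
        taxo.addCategory("Date", "2010", "March");  // also adds "Date" and "Date/2010"
        System.out.println(taxo.getSize());                          // 6
        System.out.println(taxo.getOrdinal("Author", "Mark Twain")); // 2
        System.out.println(taxo.getPath(5));                         // Date/2010/March
        System.out.println(taxo.getOrdinal("Publisher"));            // -1
    }
}
```

Only two categories are inserted explicitly, yet the size is 6: the root, three ancestors and the two leaf categories each hold an ordinal.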
-
ramBytesUsed
public long ramBytesUsed()
- Specified by:
ramBytesUsed in interface Accountable
-
getChildResources
public Collection<Accountable> getChildResources()
- Specified by:
getChildResources in interface Accountable
-
setCacheSize
public void setCacheSize(int size)
setCacheSize controls the maximum allowed size of each of the caches used by getPath(int) and getOrdinal(FacetLabel).
Currently, if the given size is smaller than the current size of a cache, it will not shrink, and rather will be limited to its current size.
- Parameters:
size - the new maximum cache size, in number of entries.
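One way a cache can end up with this "does not shrink" behavior is sketched below. It assumes an LRU map built on java.util.LinkedHashMap whose size limit is checked only when an entry is inserted; ResizableLruCache and setMaxSize are hypothetical names, and whether the reader's own caches are implemented this way is an assumption, not something this page states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache with an adjustable limit; a sketch, not Lucene's LRUHashMap. */
class ResizableLruCache<K, V> extends LinkedHashMap<K, V> {
    private int maxSize;

    ResizableLruCache(int maxSize) {
        super(16, 0.75f, true); // access order: a get() marks the entry as recently used
        this.maxSize = maxSize;
    }

    /** Changes the limit without touching entries that are already cached. */
    void setMaxSize(int maxSize) {
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked once after each insertion, so at most one entry is evicted per put().
        return size() > maxSize;
    }

    public static void main(String[] args) {
        ResizableLruCache<Integer, String> cache = new ResizableLruCache<>(4);
        for (int i = 0; i < 4; i++) {
            cache.put(i, "category-" + i);
        }
        cache.setMaxSize(2);                      // lower the limit below the current size
        System.out.println(cache.size());         // 4: nothing is evicted by the resize
        cache.put(4, "category-4");               // one entry in, the eldest entry out
        System.out.println(cache.size());         // 4: pinned at the pre-resize size
        System.out.println(cache.containsKey(0)); // false: key 0 was least recently used
    }
}
```

Because each insertion evicts at most one entry, a cache that already holds more entries than the new limit stays at its current size instead of shrinking to the requested one.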
-
toString
public String toString(int max)
Returns ordinal -> label mapping, up to the provided max ordinal or number of ordinals, whichever is smaller.
-
-