Class TaxonomyReader

java.lang.Object
org.apache.lucene.facet.taxonomy.TaxonomyReader
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
DirectoryTaxonomyReader

public abstract class TaxonomyReader extends Object implements Closeable
TaxonomyReader is the read-only interface through which the faceted-search library accesses the taxonomy at search time.

A TaxonomyReader holds a list of categories. Each category has a serial number which we call an "ordinal", and a hierarchical "path" name:

  • The ordinal is an integer that starts at 0 for the first category (which is always the root category) and grows contiguously as more categories are added. Note that once a category is added, it can never be deleted.
  • The path is a FacetLabel object specifying the category's position in the hierarchy.
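The relationship between ordinals and paths can be illustrated with a small in-memory model. This is only a sketch and not the Lucene implementation: the class ToyTaxonomy, its add method, and the use of '/'-separated strings in place of path objects are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a taxonomy; NOT a Lucene class.
final class ToyTaxonomy {
    static final int ROOT_ORDINAL = 0;
    static final int INVALID_ORDINAL = -1;

    private final List<String> paths = new ArrayList<>();          // ordinal -> path
    private final Map<String, Integer> ordinals = new HashMap<>(); // path -> ordinal

    ToyTaxonomy() {
        add(""); // the root (empty path) is always the first category
    }

    // Adds a category and any missing ancestors; returns the category's ordinal.
    int add(String path) {
        Integer existing = ordinals.get(path);
        if (existing != null) {
            return existing;
        }
        if (!path.isEmpty()) {
            int slash = path.lastIndexOf('/');
            add(slash < 0 ? "" : path.substring(0, slash)); // parent gets its ordinal first
        }
        int ordinal = paths.size(); // ordinals grow contiguously
        paths.add(path);
        ordinals.put(path, ordinal);
        return ordinal;
    }

    int getOrdinal(String path) {
        return ordinals.getOrDefault(path, INVALID_ORDINAL);
    }

    String getPath(int ordinal) {
        return paths.get(ordinal);
    }

    int getSize() {
        return paths.size();
    }

    public static void main(String[] args) {
        ToyTaxonomy taxo = new ToyTaxonomy();
        int ordinal = taxo.add("Author/Mark Twain");
        System.out.println("ordinal=" + ordinal + " size=" + taxo.getSize());
    }
}
```

In this model, adding a single two-level category yields three entries (the root, the parent, and the category itself), which is why the size of a taxonomy can exceed the number of categories explicitly inserted.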
Notes about concurrent access to the taxonomy:

An implementation must allow multiple readers to be active concurrently with a single writer. Readers follow so-called "point in time" semantics, i.e., a TaxonomyReader object will only see taxonomy entries which were available at the time it was created. What the writer writes is only available to (new) readers after the writer's commit() is called.

In faceted search, two separate indices are used: the main Lucene index, and the taxonomy. Because the main index refers to the categories listed in the taxonomy, it is important to open the taxonomy *after* opening the main index, and likewise to refresh the taxonomy (see openIfChanged) after reopening the main index.

This order is important; otherwise, the main index could refer to a category that is not yet visible in the old snapshot of the taxonomy. Note that it is fine for the taxonomy to be opened after the main index, even a long time after. The reason is that once a category is added to the taxonomy, it can never be changed or deleted, so there is no danger of a "too new" taxonomy being inconsistent with an older index.

WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Field Details

    • ROOT_ORDINAL

      public static final int ROOT_ORDINAL
      The root category (the category with the empty path) always has the ordinal 0, which we name ROOT_ORDINAL. getOrdinal(FacetLabel) of an empty path will always return ROOT_ORDINAL, and getPath(int) with ROOT_ORDINAL will return the empty path.
    • INVALID_ORDINAL

      public static final int INVALID_ORDINAL
      Ordinals are always non-negative, so a negative ordinal can be used to signify an error. Methods here return INVALID_ORDINAL (-1) in this case.
  • Constructor Details

    • TaxonomyReader

      public TaxonomyReader()
      Sole constructor.
  • Method Details

    • openIfChanged

      public static <T extends TaxonomyReader> T openIfChanged(T oldTaxoReader) throws IOException
      If the taxonomy has changed since the provided reader was opened, open and return a new TaxonomyReader; else, return null. The new reader, if not null, will be the same type of reader as the one given to this method.

      This method is typically far less costly than opening a fully new TaxonomyReader as it shares resources with the provided TaxonomyReader, when possible.
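The refresh idiom and the "point in time" semantics described above can be sketched with a self-contained model. The classes ToyTaxoReader and ToyTaxoWriter below are hypothetical stand-ins written for this example and are not Lucene classes; only the shape of the static openIfChanged call (pass the old reader, get null when nothing changed) mirrors the signature documented here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical point-in-time reader; NOT a Lucene class.
final class ToyTaxoReader {
    private final ToyTaxoWriter source;
    private final List<String> snapshot; // frozen view taken when the reader was opened
    private final long generation;

    ToyTaxoReader(ToyTaxoWriter source) {
        this.source = source;
        this.snapshot = source.committed();
        this.generation = source.generation();
    }

    int getSize() {
        return snapshot.size();
    }

    // Returns a new reader if a commit happened since 'old' was opened, else null.
    static ToyTaxoReader openIfChanged(ToyTaxoReader old) {
        return old.source.generation() == old.generation ? null : new ToyTaxoReader(old.source);
    }

    public static void main(String[] args) {
        ToyTaxoWriter writer = new ToyTaxoWriter();
        ToyTaxoReader reader = new ToyTaxoReader(writer);
        writer.addCategory("Author");
        writer.commit();
        ToyTaxoReader newer = ToyTaxoReader.openIfChanged(reader);
        if (newer != null) {
            reader = newer; // with a real reader, the old instance would be released here
        }
        System.out.println("size=" + reader.getSize());
    }
}

// Hypothetical writer; categories become visible to new readers only after commit().
final class ToyTaxoWriter {
    private final List<String> pending = new ArrayList<>();
    private List<String> committed = List.of(""); // root category only
    private long generation = 0;

    void addCategory(String path) {
        pending.add(path);
    }

    void commit() {
        if (pending.isEmpty()) {
            return; // nothing new to publish
        }
        List<String> next = new ArrayList<>(committed);
        next.addAll(pending);
        pending.clear();
        committed = List.copyOf(next);
        generation++;
    }

    List<String> committed() {
        return committed;
    }

    long generation() {
        return generation;
    }
}
```

The null check matters: a caller that unconditionally replaced its reader with the return value would lose the reader whenever the taxonomy had not changed.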

      Throws:
      IOException
    • doClose

      protected abstract void doClose() throws IOException
      Performs the actual task of closing the resources that are used by the taxonomy reader.
      Throws:
      IOException
    • doOpenIfChanged

      protected abstract TaxonomyReader doOpenIfChanged() throws IOException
      Implements the actual opening of a new TaxonomyReader instance if the taxonomy has changed.
      Throws:
      IOException
    • ensureOpen

      protected final void ensureOpen() throws AlreadyClosedException
      Throws AlreadyClosedException if this TaxonomyReader is closed.
      Throws:
      AlreadyClosedException
    • close

      public final void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
    • decRef

      public final void decRef() throws IOException
      Expert: decreases the refCount of this TaxonomyReader instance. If the refCount drops to 0 this taxonomy reader is closed.
      Throws:
      IOException
    • getParallelTaxonomyArrays

      public abstract ParallelTaxonomyArrays getParallelTaxonomyArrays() throws IOException
      Returns a ParallelTaxonomyArrays object which can be used to efficiently traverse the taxonomy tree.
      Throws:
      IOException
    • getChildren

      public TaxonomyReader.ChildrenIterator getChildren(int ordinal) throws IOException
      Returns an iterator over the children of the given ordinal.
      Throws:
      IOException
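To illustrate how parallel arrays can support tree traversal and child iteration, here is a minimal sketch. It assumes a layout in which children[i] holds the most recently added child of ordinal i and siblings[i] holds the previously added child of the same parent, with -1 marking the end of the chain. The class ToyParallelArrays and its method childrenOf are hypothetical and written only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of parent/child/sibling arrays; NOT a Lucene class.
final class ToyParallelArrays {
    static final int INVALID_ORDINAL = -1;

    final int[] parents;  // parents[i]  = ordinal of i's parent (-1 for the root)
    final int[] children; // children[i] = most recently added child of i, or -1
    final int[] siblings; // siblings[i] = previously added child of i's parent, or -1

    ToyParallelArrays(int[] parents) {
        this.parents = parents.clone();
        this.children = new int[parents.length];
        this.siblings = new int[parents.length];
        Arrays.fill(children, INVALID_ORDINAL);
        Arrays.fill(siblings, INVALID_ORDINAL);
        // Ordinal 0 is the root and has no parent, so start at 1.
        for (int ord = 1; ord < parents.length; ord++) {
            int parent = parents[ord];
            siblings[ord] = children[parent]; // link to the previous youngest child
            children[parent] = ord;           // this ordinal is now the youngest child
        }
    }

    // Walks the child chain of the given ordinal, youngest child first.
    List<Integer> childrenOf(int ordinal) {
        List<Integer> result = new ArrayList<>();
        for (int c = children[ordinal]; c != INVALID_ORDINAL; c = siblings[c]) {
            result.add(c);
        }
        return result;
    }

    public static void main(String[] args) {
        // Root 0 has children 1 and 4; category 1 has children 2 and 3.
        ToyParallelArrays arrays = new ToyParallelArrays(new int[] {-1, 0, 1, 1, 0});
        System.out.println(arrays.childrenOf(0));
        System.out.println(arrays.childrenOf(1));
    }
}
```

Because each step is a single array lookup, enumerating the children of a category costs time proportional to the number of children rather than to the size of the taxonomy.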
    • getCommitUserData

      public abstract Map<String,String> getCommitUserData() throws IOException
      Retrieves the user data that was stored with the commit.
      Throws:
      IOException
    • getOrdinal

      public abstract int getOrdinal(FacetLabel categoryPath) throws IOException
      Returns the ordinal of the category given as a path. The ordinal is the category's serial number, an integer which starts with 0 and grows as more categories are added (note that once a category is added, it can never be deleted).
      Returns:
      the category's ordinal, or INVALID_ORDINAL if the category wasn't found.
      Throws:
      IOException
    • getOrdinal

      public int getOrdinal(String dim, String... path) throws IOException
      Returns the ordinal of the category identified by the given dimension and path components.
      Throws:
      IOException
    • getPath

      public abstract FacetLabel getPath(int ordinal) throws IOException
      Returns the path name of the category with the given ordinal.
      Throws:
      IOException
    • getBulkPath

      public FacetLabel[] getBulkPath(int... ordinals) throws IOException
      Returns the path names of the list of ordinals associated with different categories.

      The implementation in DirectoryTaxonomyReader is generally faster than the default implementation, which iteratively calls getPath(int).

      Throws:
      IOException
    • getRefCount

      public final int getRefCount()
      Returns the current refCount for this taxonomy reader.
    • getSize

      public abstract int getSize()
      Returns the number of categories in the taxonomy. Note that the number of categories returned is often slightly higher than the number of categories inserted into the taxonomy. This is because when a category is added to the taxonomy, its ancestors are also added automatically (including the root, which always gets ordinal 0).
    • incRef

      public final void incRef()
      Expert: increments the refCount of this TaxonomyReader instance. RefCounts can be used to determine when a taxonomy reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef(), in a finally clause; otherwise the reader may never be closed.
    • tryIncRef

      public final boolean tryIncRef()
      Expert: increments the refCount of this TaxonomyReader instance only if it has not been closed yet. Returns true on success.
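The reference-counting contract of incRef(), tryIncRef(), and decRef() can be sketched as follows. ToyRefCounted is a hypothetical class written for this example, not the Lucene implementation; it assumes the count starts at 1 for the creator and that resources are released when it reaches 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted resource; NOT a Lucene class.
final class ToyRefCounted {
    private final AtomicInteger refCount = new AtomicInteger(1); // creator holds one reference
    private volatile boolean closed = false;

    int getRefCount() {
        return refCount.get();
    }

    boolean isClosed() {
        return closed;
    }

    // Unconditional increment; the caller must already hold a valid reference.
    void incRef() {
        refCount.incrementAndGet();
    }

    // Increments only while the count is still positive, i.e. the resource is not closed.
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false;
    }

    // Releases one reference; the last release closes the resource.
    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // a real reader would free its resources here
        }
    }

    public static void main(String[] args) {
        ToyRefCounted resource = new ToyRefCounted();
        if (resource.tryIncRef()) {
            try {
                System.out.println("in use, refCount=" + resource.getRefCount());
            } finally {
                resource.decRef(); // always pair the increment with a release
            }
        }
        resource.decRef(); // the creator releases its own reference
        System.out.println("closed=" + resource.isClosed());
    }
}
```

The compare-and-set loop in tryIncRef is what makes it safe to call on a resource that another thread may be closing: once the count has reached 0, it never succeeds again.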