public class Cl2oTaxonomyWriterCache extends Object implements TaxonomyWriterCache
A TaxonomyWriterCache implemented using CompactLabelToOrdinal. Although it is called a cache, it keeps all the mappings from category to ordinal in memory, relying on CompactLabelToOrdinal being an efficient mapping for this purpose.

Constructor and Description |
---|
Cl2oTaxonomyWriterCache(int initialCapcity, float loadFactor, int numHashArrays) |

Modifier and Type | Method and Description |
---|---|
void | close(): Let go of whatever resources the cache is holding. |
int | get(CategoryPath categoryPath): Look up a category in the cache, returning its ordinal, or a negative number if the category is not in the cache. |
int | get(CategoryPath categoryPath, int length): Like TaxonomyWriterCache.get(CategoryPath), but for a given prefix of the category path. |
int | getMemoryUsage(): Returns the number of bytes in memory used by this object. |
boolean | hasRoom(int n): Sometimes the cache is either unlimited in size or limited by a very large size, and in that case, when adding a lot of categories, it might make sense to pre-load the cache with all the existing categories. |
boolean | put(CategoryPath categoryPath, int ordinal): Add a category to the cache, with the given ordinal as the value. |
boolean | put(CategoryPath categoryPath, int prefixLen, int ordinal): Like TaxonomyWriterCache.put(CategoryPath, int), but for a given prefix of the category path. |

public Cl2oTaxonomyWriterCache(int initialCapcity, float loadFactor, int numHashArrays)
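public void close()
Description copied from interface: TaxonomyWriterCache
Let go of whatever resources the cache is holding.
Specified by: close in interface TaxonomyWriterCache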
public boolean hasRoom(int n)
Description copied from interface: TaxonomyWriterCache
After hasRoom(n) has returned true, the following n calls to put() should return false (meaning that the cache was not cleared).
Specified by: hasRoom in interface TaxonomyWriterCache
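
The hasRoom contract can be illustrated with a small stand-alone sketch. It does not use the Lucene classes: `BoundedSketchCache` and `PreloadSketch` are hypothetical names, category paths are plain strings, and the "clear everything when full" policy is an assumption made only to keep the example short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bounded cache used only for illustration; NOT a Lucene class.
class BoundedSketchCache {
    private final int capacity;
    private final Map<String, Integer> map = new HashMap<>();

    BoundedSketchCache(int capacity) {
        this.capacity = capacity;
    }

    // True if n more entries fit without the cache having to clear anything.
    boolean hasRoom(int n) {
        return n <= capacity - map.size();
    }

    // Returns true if the cache had to be cleared to make room.
    boolean put(String path, int ordinal) {
        boolean cleared = false;
        if (!map.containsKey(path) && map.size() >= capacity) {
            map.clear(); // assumed policy: drop everything when full
            cleared = true;
        }
        map.put(path, ordinal);
        return cleared;
    }

    int size() {
        return map.size();
    }
}

class PreloadSketch {
    // Pre-loads all existing categories only when the cache reports that they fit.
    // Returns true if the pre-load happened.
    static boolean preloadIfRoom(BoundedSketchCache cache, List<String> existing) {
        if (!cache.hasRoom(existing.size())) {
            return false; // small cache: pre-loading would only cause clearing
        }
        for (int ordinal = 0; ordinal < existing.size(); ordinal++) {
            if (cache.put(existing.get(ordinal), ordinal)) {
                throw new IllegalStateException("hasRoom contract violated");
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> categories = Arrays.asList("a", "a/b", "c");
        System.out.println(preloadIfRoom(new BoundedSketchCache(10), categories)); // true
        System.out.println(preloadIfRoom(new BoundedSketchCache(2), categories));  // false
    }
}
```

With a capacity of 10 the three categories are pre-loaded and no put() reports a clear; with a capacity of 2 the pre-load is skipped entirely.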
public int get(CategoryPath categoryPath)
Description copied from interface: TaxonomyWriterCache
It is up to the caller to remember what a negative response means. If the caller knows the cache is complete (it was initially fed with all the categories, and put() has never returned true since then), a negative response means the category does not exist. Otherwise, the category might still exist and simply be missing from the cache.
Specified by: get in interface TaxonomyWriterCache
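
The two readings of a negative response can be written down as a tiny decision function. This is a sketch of the caller-side reasoning only; `NegativeLookupSketch`, the `Result` enum, and the `cacheComplete` flag are hypothetical names, and the caller is assumed to maintain that flag itself.

```java
// Hypothetical caller-side helper; NOT part of the Lucene API.
class NegativeLookupSketch {
    enum Result { FOUND, DOES_NOT_EXIST, MAY_EXIST_OUTSIDE_CACHE }

    // ordinal: the value returned by a cache lookup.
    // cacheComplete: true if the cache was fed every category and
    // no put() has returned true since then (tracked by the caller).
    static Result interpret(int ordinal, boolean cacheComplete) {
        if (ordinal >= 0) {
            return Result.FOUND;
        }
        return cacheComplete ? Result.DOES_NOT_EXIST : Result.MAY_EXIST_OUTSIDE_CACHE;
    }

    public static void main(String[] args) {
        System.out.println(interpret(7, false));  // FOUND
        System.out.println(interpret(-1, true));  // DOES_NOT_EXIST
        System.out.println(interpret(-1, false)); // MAY_EXIST_OUTSIDE_CACHE
    }
}
```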
public int get(CategoryPath categoryPath, int length)
Description copied from interface: TaxonomyWriterCache
Like TaxonomyWriterCache.get(CategoryPath), but for a given prefix of the category path.
If the given length is negative or bigger than the path's actual length, the full path is taken.
Specified by: get in interface TaxonomyWriterCache
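
The length rule for prefixes can be sketched without the Lucene types. In the example below a category path is assumed to be a plain `String[]` of components, and `PrefixLookupSketch` with its "/"-joined map key is a hypothetical stand-in, not how CompactLabelToOrdinal stores labels (components containing "/" would collide in this toy version).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of prefix handling; NOT the Lucene implementation.
class PrefixLookupSketch {
    private final Map<String, Integer> ordinals = new HashMap<>();

    // A negative or too-large length means "use the full path".
    static int effectiveLength(String[] components, int length) {
        return (length < 0 || length > components.length) ? components.length : length;
    }

    // Builds the map key for the first `length` components (assumed "/" delimiter).
    static String key(String[] components, int length) {
        int n = effectiveLength(components, length);
        return String.join("/", Arrays.copyOfRange(components, 0, n));
    }

    void put(String[] components, int prefixLen, int ordinal) {
        ordinals.put(key(components, prefixLen), ordinal);
    }

    // Returns the ordinal, or a negative number if the prefix is not cached.
    int get(String[] components, int length) {
        Integer ordinal = ordinals.get(key(components, length));
        return ordinal == null ? -1 : ordinal;
    }

    public static void main(String[] args) {
        PrefixLookupSketch cache = new PrefixLookupSketch();
        String[] path = {"Author", "Mark Twain", "Novels"};
        cache.put(path, 1, 1);   // caches "Author"
        cache.put(path, -1, 3);  // negative length: caches the full path
        System.out.println(cache.get(path, 1));  // 1
        System.out.println(cache.get(path, 99)); // 3 (length too large: full path)
        System.out.println(cache.get(path, 2));  // -1 ("Author/Mark Twain" not cached)
    }
}
```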
public boolean put(CategoryPath categoryPath, int ordinal)
Description copied from interface: TaxonomyWriterCache
If the implementation keeps only a partial cache (e.g., an LRU cache) and finds that its cache is full, it should clear up part of the cache and return true. Otherwise, it should return false.
The caller needs to know whether part of the cache was cleared because, in that case, it will have to commit its on-disk index (so that all the latest category additions can be searched on disk, if we can't rely on the cache to contain them).
Ordinals should be non-negative. Currently there is no defined way to specify that a cache should remember that a category does NOT exist. This doesn't really matter, because normally the next thing we do after finding that a category does not exist is to add it.
Specified by: put in interface TaxonomyWriterCache
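
The return-value contract of put() can be seen in a toy partial cache. The sketch below is not a Lucene class: `LruSketchCache` and `PutContractSketch` are hypothetical names, paths are plain strings, and the choice to evict the least recently used half of the entries is an assumption. The caller counts how often it would have to commit its on-disk index.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical partial (LRU) cache for illustration; NOT part of Lucene.
class LruSketchCache {
    private final int maxSize;
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);

    LruSketchCache(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns a negative number when the path is not in the cache.
    int get(String path) {
        Integer ordinal = map.get(path);
        return ordinal == null ? -1 : ordinal;
    }

    // Returns true if part of the cache was cleared to make room.
    boolean put(String path, int ordinal) {
        boolean cleared = false;
        if (!map.containsKey(path) && map.size() >= maxSize) {
            int toEvict = Math.max(1, maxSize / 2); // assumed eviction policy
            Iterator<String> it = map.keySet().iterator();
            while (toEvict-- > 0 && it.hasNext()) {
                it.next();
                it.remove();
            }
            cleared = true;
        }
        map.put(path, ordinal);
        return cleared;
    }
}

class PutContractSketch {
    // Adds each path with its index as ordinal and counts how many times
    // the caller would need to commit because the cache was cleared.
    static int addAll(LruSketchCache cache, String[] paths) {
        int commits = 0;
        for (int ordinal = 0; ordinal < paths.length; ordinal++) {
            if (cache.put(paths[ordinal], ordinal)) {
                commits++; // a real caller would commit its on-disk index here
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        LruSketchCache cache = new LruSketchCache(2);
        int commits = addAll(cache, new String[] {"a", "b", "c", "d"});
        System.out.println(commits);        // 2
        System.out.println(cache.get("a")); // -1 (evicted)
        System.out.println(cache.get("d")); // 3
    }
}
```

An implementation that keeps every mapping in memory, as this class is described as doing, would never need the eviction branch in this sketch.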
public boolean put(CategoryPath categoryPath, int prefixLen, int ordinal)
Description copied from interface: TaxonomyWriterCache
Like TaxonomyWriterCache.put(CategoryPath, int), but for a given prefix of the category path.
If the given length is negative or bigger than the path's actual length, the full path is taken.
Specified by: put in interface TaxonomyWriterCache
public int getMemoryUsage()