@Deprecated public class JaspellTernarySearchTrie extends Object implements Accountable
String
objects that combines the compact size of a binary search
tree with the speed of a digital search trie, and is therefore ideal for
practical use in sorting and searching data.
This data structure is faster than hashing for many typical search problems, and supports a broader range of useful problems and operations. Ternary searches are faster than hashing and more powerful, too.
The theory of ternary search trees was described at a symposium in 1997 (see "Fast Algorithms for Sorting and Searching Strings," by J.L. Bentley and R. Sedgewick, Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, January 1997). Algorithms in C, Third Edition, by Robert Sedgewick (Addison-Wesley, 1998) provides yet another view of ternary search trees.
Modifier and Type | Class and Description |
---|---|
protected class |
JaspellTernarySearchTrie.TSTNode
Deprecated.
An inner class of Ternary Search Trie that represents a node in the trie.
|
Constructor and Description |
---|
JaspellTernarySearchTrie()
Deprecated.
Constructs an empty Ternary Search Trie.
|
JaspellTernarySearchTrie(Locale locale)
Deprecated.
Constructs an empty Ternary Search Trie,
specifying the Locale used for lowercasing.
|
JaspellTernarySearchTrie(Path file)
Deprecated.
Constructs a Ternary Search Trie and loads data from a
Path
into the Trie. |
JaspellTernarySearchTrie(Path file,
boolean compression)
Deprecated.
Constructs a Ternary Search Trie and loads data from a
File
into the Trie. |
Modifier and Type | Method and Description |
---|---|
Object |
get(CharSequence key)
Deprecated.
Retrieve the object indexed by a key.
|
Float |
getAndIncrement(String key)
Deprecated.
Retrieve the
Float indexed by key, increment it by one unit
and store the new Float . |
Collection<Accountable> |
getChildResources()
Deprecated.
|
protected String |
getKey(JaspellTernarySearchTrie.TSTNode node)
Deprecated.
Returns the key that indexes the node argument.
|
JaspellTernarySearchTrie.TSTNode |
getNode(CharSequence key)
Deprecated.
Returns the node indexed by key, or
null if that node doesn't
exist. |
protected JaspellTernarySearchTrie.TSTNode |
getNode(CharSequence key,
JaspellTernarySearchTrie.TSTNode startNode)
Deprecated.
Returns the node indexed by key, or
null if that node doesn't
exist. |
protected JaspellTernarySearchTrie.TSTNode |
getOrCreateNode(CharSequence key)
Deprecated.
Returns the node indexed by key, creating that node if it doesn't exist,
and creating any required intermediate nodes if they don't exist.
|
List<String> |
matchAlmost(CharSequence key,
int numReturnValues)
Deprecated.
Returns a
List of keys that almost match the argument key. |
List<String> |
matchAlmost(String key)
Deprecated.
Returns a
List of keys that almost match the argument key. |
List<String> |
matchPrefix(CharSequence prefix,
int numReturnValues)
Deprecated.
Returns an alphabetical
List of all keys in the trie that
begin with a given prefix. |
List<String> |
matchPrefix(String prefix)
Deprecated.
Returns an alphabetical
List of all keys in the trie that
begin with a given prefix. |
int |
numDataNodes()
Deprecated.
Returns the number of nodes in the trie that have non-null data.
|
protected int |
numDataNodes(JaspellTernarySearchTrie.TSTNode startingNode)
Deprecated.
Returns the number of nodes in the subtrie below and including the starting
node.
|
int |
numNodes()
Deprecated.
Returns the total number of nodes in the trie.
|
protected int |
numNodes(JaspellTernarySearchTrie.TSTNode startingNode)
Deprecated.
Returns the total number of nodes in the subtrie below and including the
starting Node.
|
void |
put(CharSequence key,
Object value)
Deprecated.
Stores a value in the trie.
|
long |
ramBytesUsed()
Deprecated.
|
void |
remove(String key)
Deprecated.
Removes the value indexed by key.
|
void |
setMatchAlmostDiff(int diff)
Deprecated.
Sets the number of characters by which words can differ from target word
when calling the
matchAlmost method. |
void |
setNumReturnValues(int num)
Deprecated.
Sets the default maximum number of values returned from the
matchPrefix and matchAlmost methods. |
protected List<String> |
sortKeys(JaspellTernarySearchTrie.TSTNode startNode,
int numReturnValues)
Deprecated.
Returns keys sorted in alphabetical order.
|
public JaspellTernarySearchTrie()
public JaspellTernarySearchTrie(Locale locale)
public JaspellTernarySearchTrie(Path file) throws IOException
Path
into the Trie. The file is a normal text document, where each line is of
the form word TAB float.file
- The Path
with the data to load into the Trie.IOException
- A problem occured while reading the data.public JaspellTernarySearchTrie(Path file, boolean compression) throws IOException
File
into the Trie. The file is a normal text document, where each line is of
the form "word TAB float".file
- The File
with the data to load into the Trie.compression
- If true, the file is compressed with the GZIP algorithm, and if
false, the file is a normal text document.IOException
- A problem occured while reading the data.public Object get(CharSequence key)
key
- A String
index.public Float getAndIncrement(String key)
Float
indexed by key, increment it by one unit
and store the new Float
.key
- A String
index.Float
retrieved from the Ternary Search Trie.protected String getKey(JaspellTernarySearchTrie.TSTNode node)
node
- The node whose index is to be calculated.String
that indexes the node argument.public JaspellTernarySearchTrie.TSTNode getNode(CharSequence key)
null
if that node doesn't
exist. Search begins at root node.key
- A String
that indexes the node that is returned.TernarySearchTrie.TSTNode
.protected JaspellTernarySearchTrie.TSTNode getNode(CharSequence key, JaspellTernarySearchTrie.TSTNode startNode)
null
if that node doesn't
exist. The search begins at root node.key
- A String
that indexes the node that is returned.startNode
- The top node defining the subtrie to be searched.TernarySearchTrie.TSTNode
.protected JaspellTernarySearchTrie.TSTNode getOrCreateNode(CharSequence key) throws NullPointerException, IllegalArgumentException
key
- A String
that indexes the node that is returned.TernarySearchTrie.TSTNode
.NullPointerException
- If the key is null
.IllegalArgumentException
- If the key is an empty String
.public List<String> matchAlmost(String key)
List
of keys that almost match the argument key.
Keys returned will have exactly diff characters that do not match the
target key, where diff is equal to the last value passed in as an argument
to the setMatchAlmostDiff
method.
If the matchAlmost
method is called before the
setMatchAlmostDiff
method has been called for the first time,
then diff = 0.
key
- The target key.List
with the results.public List<String> matchAlmost(CharSequence key, int numReturnValues)
List
of keys that almost match the argument key.
Keys returned will have exactly diff characters that do not match the
target key, where diff is equal to the last value passed in as an argument
to the setMatchAlmostDiff
method.
If the matchAlmost
method is called before the
setMatchAlmostDiff
method has been called for the first time,
then diff = 0.
key
- The target key.numReturnValues
- The maximum number of values returned by this method.List
with the resultspublic List<String> matchPrefix(String prefix)
List
of all keys in the trie that
begin with a given prefix. Only keys for nodes having non-null data are
included in the List
.prefix
- Each key returned from this method will begin with the characters
in prefix.List
with the results.public List<String> matchPrefix(CharSequence prefix, int numReturnValues)
List
of all keys in the trie that
begin with a given prefix. Only keys for nodes having non-null data are
included in the List
.prefix
- Each key returned from this method will begin with the characters
in prefix.numReturnValues
- The maximum number of values returned from this method.List
with the resultspublic int numDataNodes()
protected int numDataNodes(JaspellTernarySearchTrie.TSTNode startingNode)
startingNode
- The top node of the subtrie. the node that defines the subtrie.public int numNodes()
protected int numNodes(JaspellTernarySearchTrie.TSTNode startingNode)
startingNode
- The top node of the subtrie. The node that defines the subtrie.public void put(CharSequence key, Object value)
key
- A String
that indexes the object to be stored.value
- The object to be stored in the Trie.public void remove(String key)
key
- A string
that indexes the object to be removed from
the Trie.public void setMatchAlmostDiff(int diff)
matchAlmost
method.
Arguments less than 0 will set the char difference to 0, and arguments greater than 3 will set the char difference to 3.
diff
- The number of characters by which words can differ from target
word.public void setNumReturnValues(int num)
matchPrefix
and matchAlmost
methods.
The value should be set this to -1 to get an unlimited number of return values. note that the methods mentioned above provide overloaded versions that allow you to specify the maximum number of return values, in which case this value is temporarily overridden.
num
- The number of values that will be returned when calling the
methods above.protected List<String> sortKeys(JaspellTernarySearchTrie.TSTNode startNode, int numReturnValues)
The number of keys returned is limited to numReturnValues. To get a list that isn't limited in size, set numReturnValues to -1.
startNode
- The top node defining the subtrie to be searched.numReturnValues
- The maximum number of values returned from this method.List
with the results.public long ramBytesUsed()
ramBytesUsed
in interface Accountable
public Collection<Accountable> getChildResources()
getChildResources
in interface Accountable
Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.