TutHashCreate create a hash table
T_HASH_TABLE TutHashCreate(init_size
,hash_func
,test_func
) T_INT4init_size
; T_HASH_CALC_FUNChash_func
; T_HASH_TEST_FUNCtest_func
;
init_size
initial size of the hash table
hash_func
function to calculate the hash code for a key
test_func
function to compare two keys
Returns a new hash table.
None
TutHashCreate creates a new hash table. A hash table consists of keys and values, where each key can be used to look up the corresponding value (hash tables have also been called symbol tables and dictionaries). Values can be inserted, deleted, and looked up very quickly. Hash tables provide a good balance between time used and storage used.
A key can be any pointer-sized (usually 4 bytes) data type (such as a character string, an integer, or any generic pointer). Each hash table normally uses the same data type for all of its keys; for example, all keys may be strings or integers, but not both. SmartSockets provides convenience functions that create hash tables with these types of keys:
SmartSockets hash tables consist of a hash function hash_func
, a test function test_func
, and an array of buckets. When a value is looked up from its key, the key is hashed using hash_func
to provide an index into the bucket array. Each bucket is a linked list of (key, value) pairs. SmartSockets hash tables use open hashing. In open hashing, different keys can hash to the same bucket. To find the proper (key, value) pair from the bucket, the bucket is then searched linearly using test_func
to find the proper key.
One problem with hash tables is knowing how big to make the bucket array. If the array of buckets is large enough and the hash function provides a fairly random spread of bucket indices, then on average each bucket only has one (key, value) pair stored in it. This provides for constant time for each look up. If the bucket array is too large, then storage is wasted. If the bucket array is too small, then each bucket has more than one (key, value) pair, and the time for each look up increases. (Each bucket is searched using a linear search.)
To avoid wasting both storage and time, SmartSockets hash tables dynamically resize themselves if they get too full. Each time a new value is inserted into a hash table, the hash table checks how many entries it has and how large its bucket array is. If the number of entries is approaching the number of buckets, then the size of the bucket array is doubled, and the existing entries are redistributed throughout the new bucket array. Resizing never takes place on a look up, only on an insert.
The initial size of the hash table bucket array is specified by init_size
. The initial size should be large enough to prevent numerous resizes as entries are inserted, yet small enough to prevent wasted storage.
Write the hash function hash_func
to take one argument: the key it is hashing. hash_func
should hash the key to a hash code and return that hash code. The array index in the bucket array is then calculated by (hash code) modulo (bucket array size). Hash functions should try to provide a wide range of hash codes.
Write the test function test_func
to take two arguments: the two keys that are being compared. test_func
should return TRUE
if the two keys are the same, and FALSE
if the two keys are different.
It is normally not necessary to write hash functions and test functions. The convenience functions TutHashCreateInt, TutHashCreatePtr, TutHashCreateStr, and TutHashCreateStri all create hash tables with predefined hash and test functions.
Hash tables do not manage storage for their keys and values. It is up to the caller to manage the memory allocation of the data structures for the keys and values.
If init_size
is too small, then inserting new entries into the hash table causes the hash table to be resized often.
TutHashCreateInt, TutHashCreatePtr, TutHashCreateStr, TutHashCreateStri, TutHashDestroy
This example creates a simple hash function for character strings:
T_INT4 hash_string(key) T_PTR key; { T_INT4 hash_code = 0; T_STR str_key = (T_STR)key; while (*str_key != ’\0’) { hash_code += *str_key; str_key++;/* move on to next character */
}return hash_code; }
/* hash_string */
This example is a simple test function for character strings.
T_BOOL test_string(key1, key2) T_PTR key1; T_PTR key2; {/* return TRUE if the keys are identical *
/ return (strcmp((T_STR)key1, (T_STR)key2) == 0); }/* test_string */
This example creates a hash table using those functions with enough room initially for 4 entries.
Once the hash table is created, you can store values using TutHashInsert and look them up with TutHashLookup.
/* use the above hash table to map the directions to integers */
TutHashInsert(hash_table, "NORTH", (T_PTR)1); TutHashInsert(hash_table, "EAST", (T_PTR)2); TutHashInsert(hash_table, "SOUTH", (T_PTR)3); TutHashInsert(hash_table, "WEST", (T_PTR)4);/* look up "EAST" */
TutOut("the value for EAST is %d\n", (T_INT4)TutHashLookup(hash_table, "EAST"));
TIBCO SmartSockets™ Utilities Software Release 6.8, July 2006 Copyright © TIBCO Software Inc. All rights reserved www.tibco.com |