home | map contact
Code

Projects

htable | oldrunner

HTable

This is a lightweight implementation of hash tables in C, greatly inspired by the implementations of spray and red-black trees found in *BSD kernels. To use it, you only need to copy the header file "htable.h" into your project.

Source

You can browse the sources online using CVSweb.

The latest package is available here: htable-1.2.tar.gz

MD5 (htable-1.2.tar.gz) = c1ac77f9822bc4040bccbc91096c51d5
RMD160 (htable-1.2.tar.gz) = 95a3ac55b11187e68853690cf2f85105687e9f79
SHA1 (htable-1.2.tar.gz) = 028437fd55cc660d9e4f8a46ade97fc2f7bfd0f3
SIZE (htable-1.2.tar.gz) = 11134

Manpage

HTABLE(3)			 htable manual			     HTABLE(3)



NAME
       HTABLE_HEAD,  HTABLE_ENTRY,  HTABLE_SIZE,  HTABLE_COUNT,  HTABLE_EMPTY,
       HTABLE_COLLS, HTABLE_LOAD, HTABLE_INITIALIZER, HTABLE_INIT, HTABLE_PRO-
       TOTYPE,	HTABLE_GENERATE,  HTABLE_INSERT, HTABLE_REMOVE, HTABLE_LOOKUP,
       HTABLE_FIRST,  HTABLE_NEXT,  HTABLE_FOREACH,  implementation  of   hash
       tables.


SYNOPSIS
       #include "htable.h"

       HTABLE_HEAD(NAME, SIZE, TYPE);

       HTABLE_ENTRY(TYPE);

       size_t
       HTABLE_SIZE(HTABLE_HEAD *head);

       uint32_t
       HTABLE_COUNT(HTABLE_HEAD *head);

       int
       HTABLE_EMPTY(HTABLE_HEAD *head);

       float
       HTABLE_COLLS(HTABLE_HEAD *head);

       float
       HTABLE_LOAD(HTABLE_HEAD *head);

       HTABLE_INITIALIZER(HTABLE_HEAD *head);

       HTABLE_INIT(HTABLE_HEAD *head);

       HTABLE_PROTOTYPE(NAME, TYPE);

       HTABLE_GENERATE(NAME, TYPE, KEY, CMP);

       HTABLE_ENTRY *
       HTABLE_INSERT(NAME, HTABLE_HEAD *head, HTABLE_ENTRY *elm);

       HTABLE_ENTRY *
       HTABLE_REMOVE(NAME, HTABLE_HEAD *head, HTABLE_ENTRY *elm);

       HTABLE_ENTRY *
       HTABLE_LOOKUP(NAME, HTABLE_HEAD *head, HTABLE_ENTRY *elm);

       HTABLE_ENTRY *
       HTABLE_FIRST(NAME, HTABLE_HEAD *head);

       HTABLE_ENTRY *
       HTABLE_NEXT(NAME, HTABLE_HEAD *head, HTABLE_ENTRY *elm);

       HTABLE_FOREACH(HTABLE_ENTRY *elm, NAME, HTABLE_HEAD *head);


DESCRIPTION
       These  macros  define  and operate on a hash table data structure.  The
       following functionalities are supported:

	      1   Insertion of a new entry in the hash table.

	      2   Retrieval of an entry in the hash table.

	      3   Removal of an entry from the hash table.

	      4   Iterating over all entries found in the hash table.

	      5   Computing the number of entries found in the hash table.

	      6   Computing the collision percentage for the hash table.

	      7   Computing the load percentage for the hash table.

       Hash tables are ideal for applications with datasets needing a  lot  of
       adding, searching or removal, as those are normally constant-time oper-
       ations.	The primary operation it supports  efficiently	is  a  lookup:
       given  a key (e.g. a person's name), find the corresponding value (e.g.
       that person's telephone number). It works by transforming the key using
       a  hash	function  into a hash, a number that is used as an index in an
       array to locate the desired location ("bucket") where the values should
       be.

       Hash tables support the efficient insertion of new entries, in expected
       O(1) time. The time spent in searching depends on the hash function and
       the  load  of  the  hash table; both insertion and search approach O(1)
       time with well chosen values and hashes.


HASH TABLES
       In the macro definitions, TYPE is the name tag of a user defined struc-
       ture  that  must contain a field of type HTABLE_ENTRY.  For example, to
       define a data structure looking like a phone book that will  be	stored
       in a hash table, one could write something like:

	     struct phonebook_s {
	       char *name, *phone;
	       HTABLE_ENTRY (phonebook_s);
	     };

       The  argument  NAME  in the macro definitions is the name tag of a user
       defined structure that must be declared using the  macro  HTABLE_HEAD()
       as follows:

	     HTABLE_HEAD (NAME, SIZE, TYPE) hash_table;

       The  argument  NAME has to be a unique name prefix for every hash table
       that is defined. SIZE is the number of  buckets	the  hash  table  will
       hold.   A  pointer  to  such a hash table structure could then later be
       defined as:

	     struct NAME *tableptr;

       Once a hash table  was  defined,  it  must  be  initialized  using  the
       HTABLE_INIT()  macro, head being a reference to this hash table.  It is
       also possible to initialize it statically by using the  HTABLE_INITIAL-
       IZER() macro like this:

	     HTABLE_HEAD (NAME, SIZE, TYPE) htable =
	       HTABLE_INITIALIZER (&htable);

       In order to use the functions that manipulate the hash table structure,
       their prototypes need to be declared with the HTABLE_PROTOTYPE() macro,
       where  NAME  is a unique identifier for this particular hash_table. The
       TYPE argument is the type of the structure that is being managed by the
       hash table.

       The  function  bodies  are  generated with the HTABLE_GENERATE() macro,
       which must be used only once.  It takes the same two first arguments as
       the  HTABLE_PROTOTYPE() macro, and the two last arguments are the names
       of user-defined functions used to extract key information from  a  hash
       table entry and to compare two entries.

       The  function  used  to retrieve information related to the key given a
       hash table entry must have the following prototype:

	void (*key) (HTABLE_ENTRY *elm, char **key, int *len);

       where elm is the given pointer to the hash table  entry,  key  and  len
       must  be  filled  in with respectively the pointer to the corresponding
       key and with this key's length.

       The function used to compare two hash tables entries must  follow  this
       prototype:

	int (*cmp) (HTABLE_ENTRY *elm1, HTABLE_ENTRY *elm2);

       where  elm1  and  elm2  are the entries to compare.  This function must
       return an integer value, being 0 in case the  keys  are	equal,	and  a
       value different from 0 otherwise.

       See section EXAMPLE for possible implementations of such functions.

       The  HTABLE_INSERT()  macro  inserts  the element elm in the hash table
       pointed at by head. A pointer to the element is returned in case it was
       successfully  inserted. Otherwise, NULL is returned, meaning the inser-
       tion did not occur (e.g. the element was already stored in the hash ta-
       ble).

       The  HTABLE_REMOVE()  macro removes the element elm from the hash table
       pointed at by head.  The removed element is returned to the user so  it
       can  be	freed  if  necessary.  If  the	element was not found, NULL is
       returned.

       The HTABLE_LOOKUP() macro finds the  element  elm  in  the  hash  table
       pointed	at  by	head. The data corresponding to the removed element is
       returned to the user (NULL is returned in  case	the  element  was  not
       found).

       The HTABLE_FIRST() and HTABLE_NEXT() macros can be used to traverse the
       hash table:

	     for (elm = HTABLE_FIRST (NAME, &head);
		  elm != NULL;
		  elm = HTABLE_NEXT (NAME, &head, elm))

       Or, for simplicity, one can use the HTABLE_FOREACH() macro:


	     HTABLE_FOREACH (elm, NAME, &head)

       There are also some macros useful to get information about a given hash
       table:

       The  HTABLE_SIZE()  macro returns the total number of buckets contained
       in the hash table pointed at by head.

       The HTABLE_COUNT() returns the number of items contained  in  the  hash
       table pointed at by head.

       The HTABLE_COLLS() returns a percentage indicating the collisions (e.g.
       when two keys hash to the same bucket) there  are  in  the  hash  table
       pointed at by head.

       The HTABLE_LOAD() macro returns a percentage indicating the load factor
       (e.g. the number of filled buckets over the total number of buckets) of
       the hash table pointed at by head.

       The HTABLE_EMPTY() macro should be used to check wether a hash table is
       empty.


EXAMPLES
       The following example demonstrates how to declare a hash table.	Values
       are  inserted  into it, and one of them is then retrieved from the hash
       table.  Next, the contents of the hash table are printed, and one  ele-
       ment  is  finally removed. Last, the total number of items contained in
       the hash table is displayed.

	  #include <stdlib.h>
	  #include <stdio.h>
	  #include <string.h>

	  #include "htable.h"

	  #define HSIZE  100

	  struct book_s {
	    char *name, *phone;
	    HTABLE_ENTRY (book_s);
	  };

	  void
	  extract_key (struct book_s *data, char **key, int *len)
	  {
	    *key = data->name;
	    *len = strlen (data->name);
	  }

	  int
	  compare (struct book_s *data1, struct book_s *data2)
	  {
	    const int KEYLEN = strlen (data1->name);

	    if (strlen (data2->name) == KEYLEN
		&& !memcmp (data1->name, data2->name, KEYLEN))
	      return 0;
	    else
	      return 1;
	  }

	  int
	  main ()
	  {
	    int i;
	    struct book_s *elm;
	    struct book_s entries[] = {
	      {"friend1", "555-1111"},
	      {"friend2", "555-2222"},
	      {"person3", "555-3333"},
	      {"person4", "555-4444"}
	    };
	    const int NOENTRIES = sizeof (entries) / sizeof (struct book_s);

	    HTABLE_HEAD (pbook, HSIZE, book_s) htable =
	      HTABLE_INITIALIZER (&htable);

	    HTABLE_GENERATE (pbook, book_s, extract_key, compare);

	    for (i = 0; i < NOENTRIES; i++)
	      HTABLE_INSERT (pbook, &htable, &entries[i]);

	    elm = HTABLE_LOOKUP (pbook, &htable, &entries[1]);
	    printf ("friend2's Phone number is: %s\n", elm->phone);

	    HTABLE_FOREACH (elm, pbook, &htable)
	      {
		printf ("Entry:\n");
		printf (" name: %s\n", elm->name);
		printf ("phone: %s\n", elm->phone);
	      }

	    elm = HTABLE_REMOVE (pbook, &htable, &entries[2]);

	    printf ("Number of items in hash table: %u\n",
		    HTABLE_COUNT (&htable));

	    return EXIT_SUCCESS;
	  }


NOTES
       If the hash table macros need to be used several times, it  is  advised
       to  build wrappers around them, as code is inlined and executable could
       have its size grow needlessly. For example, to remove elements  from  a
       hash  table  and  free the corresponding data structure associated with
       it, one could write the following function:

	  /*
	   * Wrapper around the HTABLE_REMOVE macro.
	   *
	   * A hash table was previously defined using:
	   *	HTABLE_HEAD (my_hash, HSIZE, my_entry) htable =
	   *	  HTABLE_INITIALIZER (&htable);
	   */
	  void
	  htable_free (struct my_hash *ht, struct my_entry *elm)
	  {
	    struct my_entry *removed;

	    removed = HTABLE_REMOVE (my_hash, ht, elm);
	    if (removed != NULL)
	      free (removed);
	  }


HASH FUNCTIONS
       By default, Jenkin's hash function "LOOKUP" is used to transform a  key
       into  a bucket number (reference can be found in the SEE ALSO section).
       However, other hash functions are available and can be chosen  at  com-
       pile time by defining the HASH_FUNCTION macro.

       The following functions are available:

       HASH_JEN
	      The default hash function, Jenkins' Lookup hash.

       HASH_OAT
	      Jenkins' "One at a time" hash function.

       For  example,  to  specify  that Jenkins' "One at a time" hash function
       must be used for the "test" program, one must compile it using  a  com-
       mand such as:

	  cc -I/path/to/htable.h -DHASH_FUNCTION=HASH_OAT -o test test.c

       To  determine  the  best hash function for your key domain, you can use
       the HTABLE_COLLS and HTABLE_LOAD macros to compare the  collisions  and
       load factors obtained with the different hash functions.


SEE ALSO
       Bob  Jenkins' work on hash functions can be found at: http://burtlebur-
       tle.net/bob/hash/

       Those macros were greatly inspired by the implementations of spray  and
       red-black    trees    found    in    the   *BSD	 kernels   (see   file
       /usr/src/sys/sys/tree.h).


AUTHORS
       Frederic Culot <frederic at culot.org>.



Version 1.2			August 19, 2010 		     HTABLE(3)