Optimal XOR hashing for non-uniformly distributed address lookup in computer networks

  • Authors:
  • Christopher J. Martinez;Wei-Ming Lin;Parimal Patel

  • Affiliations:
  • Department of Electrical and Computer Engineering, The University of Texas, San Antonio, TX 78249-0669, USA;Department of Electrical and Computer Engineering, The University of Texas, San Antonio, TX 78249-0669, USA;Department of Electrical and Computer Engineering, The University of Texas, San Antonio, TX 78249-0669, USA

  • Venue:
  • Journal of Network and Computer Applications
  • Year:
  • 2007

Abstract

Hashing algorithms are widely used to speed up address lookup, which requires searching a large database for the record associated with a given key. Modern examples include address lookup in network routers to determine the outgoing link for forwarding, and rule matching in intrusion detection systems, where incoming packets are compared against a large rule database. A hashing algorithm transforms the key of each record into a hash value, in the hope that the records will be uniformly distributed with respect to this new hash value. When the records in the database are already uniformly distributed by key, any regular hashing algorithm easily yields a perfectly uniform distribution after hashing. If the records are not uniformly distributed, however, different hashing functions lead to different performance. This paper addresses the cases in which the distribution follows a natural negative linear distribution, a partial negative linear distribution, or an exponential distribution, all of which are found to closely approximate many real-life database distributions. For each of these distributions, we derive a general formula for calculating the distribution variance produced by any given non-overlapping bit-grouping XOR hashing function. This variance translates directly into variations in search performance. Using these formulas, the best XOR hashing function can be easily determined for any given key size and any given hashing target size.
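To make the idea of a non-overlapping bit-grouping XOR hash concrete, the following is a minimal sketch rather than the paper's own formulation. It assumes that each hash bit is the XOR of a disjoint group of key bits, and that a "negative linear" distribution can be modelled by giving key k a weight that decreases linearly with k. The names `xor_group_hash`, `bucket_variance`, and the 4-bit key size are illustrative choices, not taken from the paper, and the variance is computed by brute-force enumeration instead of the closed-form formula the paper derives.

```python
def xor_group_hash(key, groups):
    """Return a hash whose bit i is the XOR of the key bits listed in groups[i].

    `groups` is a list of disjoint lists of key-bit positions (bit 0 = LSB).
    """
    h = 0
    for i, group in enumerate(groups):
        bit = 0
        for pos in group:
            bit ^= (key >> pos) & 1
        h |= bit << i
    return h


def bucket_variance(weights, groups):
    """Variance of the total weight per bucket when weights[k] is the load of key k."""
    n_buckets = 1 << len(groups)
    load = [0.0] * n_buckets
    for key, w in enumerate(weights):
        load[xor_group_hash(key, groups)] += w
    mean = sum(load) / n_buckets
    return sum((x - mean) ** 2 for x in load) / n_buckets


# Assumption: 4-bit keys hashed to 2 bits, with key k carrying weight 16 - k
# (a simple stand-in for a negative linear distribution).
KEY_BITS = 4
N_KEYS = 1 << KEY_BITS
weights = [N_KEYS - k for k in range(N_KEYS)]

candidates = {
    "fold (b0^b2, b1^b3)": [[0, 2], [1, 3]],
    "uneven (b0, b1^b2^b3)": [[0], [1, 2, 3]],
}

for name, groups in candidates.items():
    print(name, bucket_variance(weights, groups))
```

Under these assumed weights, the folding grouping spreads the load evenly over the four buckets (variance 0), while the uneven grouping leaves a variance of 4, which illustrates why the choice of bit grouping matters when the keys are not uniformly distributed.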