A comparative study of cuckoo search and bat algorithm for Bloom filter optimisation in spam filtering

  • Authors:
  • Arulanand Natarajan;S. Subramanian;K. Premalatha

  • Affiliations:
  • Anna University of Technology Coimbatore, Academic Campus, Jothipuram Post, Veerapandi Pirivu, Coimbatore –/ 641 047, Tamil Nadu, India.;Sri Krishna College of Engineering and Technology, Kuniamuthur, Coimbatore –/ 641 008, Tamil Nadu, India.;Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Alathukombai Post, Sathyamangalam, Erode Dt #150/ 638 401, Tamil Nadu, India

  • Venue:
  • International Journal of Bio-Inspired Computation
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Bloom filter (BF) is a simple but powerful data structure that can check membership to a static set. The trade-off to use Bloom filter is a certain configurable risk of false positives. The odds of a false positive can be made very low if the hash bitmap is sufficiently large. Spam is an irrelevant or inappropriate message sent on the internet to a large number of newsgroups or users. A spam word is a list of well-known words that often appear in spam mails. The proposed system of bin Bloom filter (BBF) groups the words into number of bins with different false positive rates based on the weights of the spam words. Cuckoo search (CS) and bat algorithm are bio-inspired algorithms that imitate the way cuckoo breeding and microbat foraging behaviours respectively. This paper demonstrates the CS and bat algorithm for minimising the total membership invalidation cost of the BBFs by finding the optimal false positive rates and number of elements stored in every bin. The experimental results demonstrate the application of CS and bat algorithm for various numbers of bins and strings.