An ontology enhanced parallel SVM for scalable spam filter training

  • Authors:
  • Godwin Caruana; Maozhen Li; Yang Liu

  • Affiliations:
  • School of Engineering and Design, Brunel University, Uxbridge, Middlesex, UB8 3PH, UK; School of Engineering and Design, Brunel University, Uxbridge, Middlesex, UB8 3PH, UK and The Key Laboratory of Embedded Systems and Service Computing, Ministry of Education, Tongji University, Ch ...; School of Engineering and Design, Brunel University, Uxbridge, Middlesex, UB8 3PH, UK

  • Venue:
  • Neurocomputing
  • Year:
  • 2013

Abstract

Spam, in a variety of shapes and forms, continues to inflict increasing damage. Various approaches, including Support Vector Machine (SVM) techniques, have been proposed for spam filter training and classification. However, SVM training is a computationally intensive process. This paper presents a MapReduce-based parallel SVM algorithm for scalable spam filter training. By distributing, processing and optimizing subsets of the training data across multiple participating computer nodes, the parallel SVM reduces the training time significantly. Ontology semantics are employed to minimize the accuracy degradation that arises when the training data is distributed among a number of SVM classifiers. Experimental results show that ontology-based augmentation improves the accuracy of the parallel SVM beyond that of its sequential counterpart.
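
The map/reduce split described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the toy data, the function names (`make_toy_data`, `train_linear_svm`, `map_phase`, `reduce_phase`), the bias-free linear SVM trained by sub-gradient descent on the hinge loss, and the choice to combine the per-partition models by averaging their weight vectors are all assumptions made for illustration. The abstract does not say how the sub-models are combined, and the ontology-based augmentation is omitted entirely.

```python
import random
from functools import reduce


def make_toy_data(n=200, seed=0):
    """Hypothetical stand-in for vectorised e-mails: +1 = spam, -1 = ham."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        # Spam lies in [1, 3]^2 and ham in [-3, -1]^2, so the classes are separable.
        x = [y * rng.uniform(1.0, 3.0), y * rng.uniform(1.0, 3.0)]
        data.append((x, y))
    return data


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(samples, epochs=10, lr=0.1, lam=0.01):
    """Sub-gradient descent on lam/2 * ||w||^2 + hinge loss (no bias term)."""
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, y in samples:
            violated = y * dot(w, x) < 1.0               # margin check before the update
            w = [(1.0 - lr * lam) * wi for wi in w]      # regularisation shrink
            if violated:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w


def partition(data, n_parts):
    """Split the training set into contiguous chunks, one per (simulated) node."""
    size = (len(data) + n_parts - 1) // n_parts
    return [data[i:i + size] for i in range(0, len(data), size)]


def map_phase(chunks):
    """Map: each chunk is trained independently (sequentially here, for simplicity)."""
    return [train_linear_svm(chunk) for chunk in chunks]


def reduce_phase(weight_vectors):
    """Reduce: average the per-chunk weight vectors (an assumed combination rule)."""
    total = reduce(lambda acc, w: [a + b for a, b in zip(acc, w)], weight_vectors)
    return [t / len(weight_vectors) for t in total]


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


def accuracy(w, data):
    return sum(predict(w, x) == y for x, y in data) / len(data)


data = make_toy_data()
chunks = partition(data, n_parts=4)
w_parallel = reduce_phase(map_phase(chunks))
w_sequential = train_linear_svm(data)
print(accuracy(w_parallel, data), accuracy(w_sequential, data))
```

Because the toy classes are separable by construction, this sketch shows only the data flow, not the accuracy degradation that the abstract attributes to partitioning real spam corpora. In an actual MapReduce deployment the calls inside `map_phase` would run on separate nodes, which is where the training-time reduction would come from.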