A comparative study for content-based dynamic spam classification using four machine learning algorithms

Authors:
Bo Yu;Zong-ben Xu
Affiliations:
School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China;Institute for Information and System Science, School of Science, Xi'an Jiaotong University, Xi'an 710049, China
Venue:
Knowledge-Based Systems
Year:
2008

Abstract

The growing number of email users has led to a dramatic increase in spam email over the past few years. In this paper, four machine learning algorithms, Naive Bayesian (NB), neural network (NN), support vector machine (SVM) and relevance vector machine (RVM), are evaluated for spam classification. An empirical evaluation of these algorithms on benchmark spam filtering corpora is presented. The experiments vary both the training set size and the number of extracted features. The results show that the NN classifier is unsuitable for use alone as a spam rejection tool. In general, the SVM and RVM classifiers clearly outperform the NB classifier. Compared with SVM, RVM provides similar classification results with fewer relevance vectors and a much faster testing time. Despite its slower learning procedure, RVM is more suitable than SVM for spam classification in applications that require low complexity.
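
As a rough illustration of the kind of experiment the abstract describes, the sketch below trains a content-based Naive Bayes classifier and sweeps the number of extracted features. It is not the authors' implementation: the toy messages, the whitespace tokenizer, the choice of a multinomial model with Laplace smoothing, frequency-based feature selection, and all function names (`train_nb`, `predict_nb`, `accuracy`) are assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the paper uses benchmark spam corpora instead.
TRAIN = [
    ("win cash prize now", "spam"),
    ("cheap pills win money", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
    ("lunch on monday with the team", "ham"),
]
TEST = [
    ("free cash prize", "spam"),
    ("monday project meeting", "ham"),
]


def tokenize(text):
    """Split a message into lowercase whitespace-separated tokens."""
    return text.lower().split()


def train_nb(docs, labels, vocab_size=None):
    """Fit a multinomial Naive Bayes model with Laplace smoothing.

    vocab_size keeps only the most frequent tokens, standing in for the
    "extracted feature size" (assumed selection rule, not the paper's).
    """
    totals = Counter(tok for doc in docs for tok in tokenize(doc))
    vocab = [word for word, _ in totals.most_common(vocab_size)]
    vocab_set = set(vocab)
    classes = sorted(set(labels))
    doc_counts = Counter(labels)
    word_counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        word_counts[label].update(t for t in tokenize(doc) if t in vocab_set)

    model = {"vocab": vocab_set, "log_prior": {}, "log_lik": {}}
    for c in classes:
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        denom = sum(word_counts[c].values()) + len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return model


def predict_nb(model, text):
    """Return the class with the highest log posterior for one message."""
    scores = {}
    for c, prior in model["log_prior"].items():
        known = (t for t in tokenize(text) if t in model["vocab"])
        scores[c] = prior + sum(model["log_lik"][c][t] for t in known)
    return max(scores, key=scores.get)


def accuracy(model, pairs):
    """Fraction of (text, label) pairs classified correctly."""
    hits = sum(predict_nb(model, text) == label for text, label in pairs)
    return hits / len(pairs)


docs = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]

# Sweep the feature size, as the experiments in the abstract do.
for size in (3, 10, None):
    model = train_nb(docs, labels, vocab_size=size)
    print(f"features={size}: accuracy={accuracy(model, TEST):.2f}")
```

The same loop could be extended to vary the training set size by slicing `TRAIN`, and to compare other classifiers by swapping `train_nb` and `predict_nb` for another model's training and prediction functions.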