An empirical study of three machine learning methods for spam filtering

Authors:
Chih-Chin Lai
Affiliations:
Department of Computer Science and Information Engineering, National University of Tainan, Taiwan 700, Taiwan
Venue:
Knowledge-Based Systems
Year:
2007

Abstract

The increasing volume of unsolicited bulk e-mail (also known as spam) is a growing annoyance for most Internet users. Using a classifier based on a specific machine-learning technique to automatically filter out spam e-mail has drawn many researchers' attention. This paper is a comparative study of the performance of three commonly used machine learning methods in spam filtering. In addition, we try to integrate two spam filtering methods to obtain better performance. A set of systematic experiments has been conducted with these methods, which are applied to different parts of an e-mail. Experiments show that using the header only can achieve satisfactory performance, and that integrating disparate methods is a promising way to fight spam.