Recently, many researchers have fused multiple filters to improve the performance of spam filtering, and considerable effort has gone into different ensemble methods over the past several years. How to select an appropriate ensemble method for spam filtering, however, remains an unsolved problem. In this paper, we investigate this problem by designing a framework that compares the performance of various ensemble methods, which should help researchers fight spam email more effectively in applied systems. The experimental results indicate that online methods achieve good accuracy, while off-line batch methods are clearly affected by the size of the data set. On a large data set, off-line batch methods do not match online methods, and within the online framework, a parallel ensemble performs better when it uses only complex filters.
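
To make the idea of a parallel ensemble of online filters concrete, the following is a minimal sketch, not the framework evaluated in the paper. It assumes each base filter is a small logistic-regression model over word features that is updated one message at a time, and that the ensemble simply averages the base scores. The class names `OnlineLogisticFilter` and `ParallelEnsemble`, the learning rates, and the toy messages are all hypothetical and chosen only for illustration.

```python
import math


class OnlineLogisticFilter:
    """Hypothetical online base filter: logistic regression over word features."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.weights = {}  # word -> weight, grown as new words appear

    def _features(self, text):
        # Binary bag-of-words features (an assumption made for this sketch).
        return set(text.lower().split())

    def score(self, text):
        """Return the estimated probability that the message is spam."""
        z = sum(self.weights.get(w, 0.0) for w in self._features(text))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, text, is_spam):
        """One online gradient step on a single labelled message."""
        error = (1.0 if is_spam else 0.0) - self.score(text)
        for w in self._features(text):
            self.weights[w] = self.weights.get(w, 0.0) + self.learning_rate * error


class ParallelEnsemble:
    """Hypothetical parallel ensemble: every filter sees every message,
    and the ensemble score is the plain average of the base scores."""

    def __init__(self, filters):
        self.filters = filters

    def score(self, text):
        return sum(f.score(text) for f in self.filters) / len(self.filters)

    def update(self, text, is_spam):
        for f in self.filters:
            f.update(text, is_spam)


# Toy message stream, invented for illustration only.
stream = [
    ("win cheap pills now", True),
    ("meeting agenda for monday", False),
    ("cheap pills win prize", True),
    ("lunch on monday with team", False),
] * 20

ensemble = ParallelEnsemble([OnlineLogisticFilter(0.1), OnlineLogisticFilter(0.5)])
for text, is_spam in stream:
    ensemble.score(text)            # classify first, as an online filter would
    ensemble.update(text, is_spam)  # then learn from the true label
```

In this sketch the filters are combined in parallel, so each message is scored by every base filter before any of them is updated. A batch method would instead retrain on the whole collected data set, which is where data set size becomes a factor.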