Unsolicited commercial e-mail (UCE), more commonly known as spam, is a growing problem on the Internet. Every day, people receive large volumes of unwanted advertising e-mail that floods their mailboxes. Fortunately, several spam-filtering approaches can detect and automatically delete such messages. However, spammers have adopted techniques that reduce the effectiveness of these filters by introducing noise into their messages. This work presents a new pre-processing technique for noise identification and reduction and reports preliminary results obtained when it is applied with a Flexible Bayes classifier. The experimental analysis confirms the advantages of the proposed technique for improving the accuracy of spam filters.