Email spam filtering is typically treated as a binary classification problem that can be solved by machine learning algorithms. We argue that a three-way decision approach gives users a more meaningful way to handle their incoming emails with caution. A three-way spam filtering system produces three email folders instead of two: a suspected folder is added so that users can examine suspicious emails further, thereby reducing the chance of misclassification. Unlike existing ternary email spam filtering systems, we focus on two issues that have received less attention: the computation of the thresholds required to define the three email categories, and the interpretation of the cost-sensitive characteristics of spam filtering. Instead of supplying the thresholds based on an intuitive understanding of the levels of tolerance for errors, we calculate them systematically from the decision-theoretic rough set model. A loss function is interpreted as the cost of making each classification decision, and the decision with the minimum overall cost is chosen. Experimental results show that the new approach reduces the error rate of misclassifying a legitimate email as spam and performs better with respect to cost sensitivity.
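As an illustration of how thresholds can be derived from a loss function rather than chosen by intuition, the following is a minimal sketch of the standard decision-theoretic rough set formulas. It is not the authors' implementation. The names Loss, thresholds, classify and min_cost_action are hypothetical, the loss values are made up for the example, and the probability that an email is spam is assumed to come from some separate classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loss:
    """Cost of each action (spam / suspected / legitimate folder) per true class.

    Illustrative container only; the field names are assumptions.
    """
    pp: float  # filed as spam, email is spam
    bp: float  # filed as suspected, email is spam
    np_: float  # filed as legitimate, email is spam
    pn: float  # filed as spam, email is legitimate
    bn: float  # filed as suspected, email is legitimate
    nn: float  # filed as legitimate, email is legitimate


def thresholds(loss: Loss) -> tuple[float, float]:
    """Return (alpha, beta) from the standard DTRS threshold formulas."""
    # Usual ordering assumption: a correct decision costs no more than
    # deferring, and deferring costs less than a wrong decision.
    if not (loss.pp <= loss.bp < loss.np_ and loss.nn <= loss.bn < loss.pn):
        raise ValueError("loss values violate the assumed ordering")
    a = loss.pn - loss.bn   # saving from deferring instead of wrongly rejecting
    b = loss.bp - loss.pp   # extra cost of deferring a true spam
    c = loss.bn - loss.nn   # extra cost of deferring a legitimate email
    d = loss.np_ - loss.bp  # saving from deferring instead of wrongly accepting
    alpha = a / (a + b)
    beta = c / (c + d)
    if not beta < alpha:
        raise ValueError("losses do not yield beta < alpha; no suspected region")
    return alpha, beta


def classify(p_spam: float, alpha: float, beta: float) -> str:
    """Three-way decision on the estimated probability that an email is spam."""
    if p_spam >= alpha:
        return "spam"
    if p_spam <= beta:
        return "legitimate"
    return "suspected"


def min_cost_action(p_spam: float, loss: Loss) -> str:
    """Pick the folder with the smallest expected cost (ties are not handled)."""
    costs = {
        "spam": loss.pp * p_spam + loss.pn * (1 - p_spam),
        "suspected": loss.bp * p_spam + loss.bn * (1 - p_spam),
        "legitimate": loss.np_ * p_spam + loss.nn * (1 - p_spam),
    }
    return min(costs, key=costs.get)


# Made-up costs: discarding a legitimate email (pn) is the most expensive error.
example_loss = Loss(pp=0.0, bp=1.0, np_=3.0, pn=10.0, bn=1.0, nn=0.0)
alpha, beta = thresholds(example_loss)
for p in (0.95, 0.5, 0.1):
    print(p, classify(p, alpha, beta), min_cost_action(p, example_loss))
```

Under these assumed costs, raising pn pushes alpha toward 1, so fewer legitimate emails are filed directly as spam and more land in the suspected folder. Away from the threshold values themselves, the threshold rule and the minimum-expected-cost rule select the same folder.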