The proliferation of unsolicited emails, also known as spam, poses a significant burden to email users worldwide. Recent research on spam filtering has shown that high accuracy can be obtained if labeled email examples are available from the particular user of the spam filter. However, providing personalized labeled training examples is time-consuming and is often inconvenient, or impossible because of privacy issues. In this paper, a semi-supervised personalized spam filter based on a classifier ensemble is proposed that classifies a user's emails accurately by learning from both generic labeled emails and personalized unlabeled emails. The proposed multi-stage classification process begins by learning an SVM model from labeled generic data. The user's unlabeled emails are then fed to this SVM to generate personalized labeled data for constructing personalized naive Bayes classifiers. Furthermore, some personalized labeled examples are generated by exploiting rare word distributions and then fed into a semi-supervised classifier. The multi-stage results are integrated with SVMs learned from generic labeled emails to produce the final classification results. Experimental results show that the proposed approaches can significantly increase classification accuracy in spam filtering.
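The first two stages described above (a generic SVM that pseudo-labels the user's unlabeled emails, followed by a personalized naive Bayes classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. The following are all assumptions made for illustration: the toy emails, the function names, the Pegasos-style linear SVM trainer, the `min_margin` confidence cut-off, the Laplace-smoothed multinomial naive Bayes, and the additive rule that combines the two scores. The rare-word stage and the semi-supervised classifier are omitted.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_linear_svm(docs, labels, epochs=50, lam=0.01):
    """Pegasos-style subgradient descent on the hinge loss (no bias term).
    Labels are +1 (spam) or -1 (ham). Assumed trainer, not from the paper."""
    w, t = Counter(), 0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            t += 1
            eta = 1.0 / (lam * t)
            x = Counter(tokenize(doc))
            margin = y * sum(w[tok] * c for tok, c in x.items())
            for tok in list(w):              # L2 shrinkage step
                w[tok] *= 1 - eta * lam
            if margin < 1:                   # hinge loss is active
                for tok, c in x.items():
                    w[tok] += eta * y * c
    return w

def svm_score(w, doc):
    return sum(w[tok] * c for tok, c in Counter(tokenize(doc)).items())

def pseudo_label(w, unlabeled, min_margin=0.1):
    """Keep only emails the generic SVM scores confidently (assumed cut-off)."""
    docs, labels = [], []
    for doc in unlabeled:
        s = svm_score(w, doc)
        if abs(s) >= min_margin:
            docs.append(doc)
            labels.append(1 if s > 0 else -1)
    return docs, labels

def train_naive_bayes(docs, labels):
    counts = {1: Counter(), -1: Counter()}
    priors = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(tokenize(doc))
    vocab = set(counts[1]) | set(counts[-1])
    return counts, priors, vocab

def nb_log_odds(model, doc):
    """Log-odds of spam vs. ham with Laplace smoothing; unseen tokens are skipped."""
    counts, priors, vocab = model
    n_spam, n_ham = sum(counts[1].values()), sum(counts[-1].values())
    score = math.log((priors[1] + 1) / (priors[-1] + 1))
    for tok in tokenize(doc):
        if tok in vocab:
            p_spam = (counts[1][tok] + 1) / (n_spam + len(vocab))
            p_ham = (counts[-1][tok] + 1) / (n_ham + len(vocab))
            score += math.log(p_spam / p_ham)
    return score

def classify(w, nb, doc):
    # Assumed integration rule: add the generic SVM score to the personalized log-odds.
    return "spam" if svm_score(w, doc) + nb_log_odds(nb, doc) > 0 else "ham"

# Hypothetical toy data: generic labeled emails and one user's unlabeled inbox.
generic_docs = ["win free money now", "cheap pills free offer", "free prize claim now",
                "meeting schedule for monday", "project report attached", "lunch with the team"]
generic_labels = [1, 1, 1, -1, -1, -1]
user_unlabeled = ["free casino money now", "win casino prize offer",
                  "project thesis report attached", "thesis meeting monday",
                  "casino bonus", "thesis draft"]

svm = train_linear_svm(generic_docs, generic_labels)
nb = train_naive_bayes(*pseudo_label(svm, user_unlabeled))

for email in ["casino bonus", "thesis draft"]:
    print(email, "| generic SVM score:", svm_score(svm, email),
          "| combined:", classify(svm, nb, email))
```

In this toy setup the generic SVM has never seen the user-specific words "casino" and "thesis", so it scores the last two emails as zero. The naive Bayes model learns those words from the confidently pseudo-labeled emails and tips the combined decision, which is the kind of personalization without user-provided labels that the abstract describes.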