This paper presents a comparative study of feature selection methods for spam-mail filtering. In our experiments, the fuzzy inference method reduced the average error rate by about 6% and 10% relative to information gain and the χ²-test, respectively. For this task we consider error rate more important than typical information retrieval measures. Because error rate is difficult to reduce, this result should be meaningful for email users who are flooded with unsolicited email.
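For readers unfamiliar with the two baseline criteria named above, the following is a minimal sketch of how information gain and the χ² statistic are commonly computed for a single binary term feature in a two-class (spam/ham) setting. It is not taken from the paper: the function names, the 2x2 count layout `(a, b, c, d)`, and the toy counts in the usage note are assumptions made for illustration. The fuzzy inference method is not sketched because the abstract does not describe it.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(a, b, c, d):
    """Information gain of a binary term with respect to the class label.

    Assumed (hypothetical) count layout:
      a = spam messages containing the term
      b = ham messages containing the term
      c = spam messages without the term
      d = ham messages without the term
    """
    n = a + b + c + d
    h_class = entropy([(a + c) / n, (b + d) / n])
    with_term, without_term = a + b, c + d
    h_cond = 0.0
    if with_term:
        h_cond += with_term / n * entropy([a / with_term, b / with_term])
    if without_term:
        h_cond += without_term / n * entropy([c / without_term, d / without_term])
    return h_class - h_cond


def chi_square(a, b, c, d):
    """Chi-square statistic for the same 2x2 term/class contingency table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: term or class never varies
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(counts, score):
    """Order terms from most to least discriminative under the given score."""
    return sorted(counts, key=lambda t: score(*counts[t]), reverse=True)
```

As a usage illustration with made-up counts, `rank_terms({"free": (40, 5, 10, 45), "the": (48, 47, 2, 3)}, chi_square)` places `"free"` ahead of `"the"`, because `"free"` is distributed very unevenly across the two classes while `"the"` appears in almost every message of both. A filter would then keep only the top-ranked terms as features.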