ICC'09 Proceedings of the 2009 IEEE international conference on Communications
Mining data with random forests: A survey and results of new tests
Pattern Recognition
Spam is considered an invasion of privacy. Its changing structure and variability create a need for new spam classification techniques. The present study proposes using Bayesian Additive Regression Trees (BART) for spam classification and evaluates its performance against other classification methods, including Logistic Regression, Support Vector Machines, Classification and Regression Trees, Neural Networks, Random Forests, and Naive Bayes. BART in its original form is not designed for classification problems, so we modify it to make it applicable to them. We evaluate the classifiers on three spam datasets (Ling-Spam, PU1, and Spambase) to determine their predictive accuracy and false positive rate.
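The two evaluation measures named above can be illustrated with a minimal sketch. It assumes the usual convention that a false positive is a legitimate message flagged as spam, and that labels are encoded as 1 for spam and 0 for legitimate mail. The function name and the toy labels are hypothetical and are not taken from the study.

```python
def accuracy_and_fpr(y_true, y_pred, spam_label=1):
    """Return (accuracy, false positive rate) for binary spam labels.

    Assumption: a false positive is a legitimate message predicted as spam.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled message is required")

    # Fraction of messages whose predicted label matches the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)

    # Legitimate messages wrongly flagged as spam, over all legitimate messages.
    legitimate = sum(t != spam_label for t in y_true)
    false_positives = sum(
        t != spam_label and p == spam_label for t, p in zip(y_true, y_pred)
    )
    fpr = false_positives / legitimate if legitimate else 0.0
    return accuracy, fpr


# Toy labels for illustration only (1 = spam, 0 = legitimate).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.2f}, false positive rate={fpr:.2f}")
```

Reporting the false positive rate alongside accuracy matters for spam filtering because losing a legitimate message is typically costlier than letting a spam message through.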