We present an approach to email filtering based on the suffix tree data structure. We develop a method for scoring emails using the suffix tree and test a number of scoring and score normalisation functions. Our results show that the character-level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy compared with currently popular methods such as naive Bayes. We believe the method can be extended to the classification of documents in other domains.
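The general idea of scoring an email against a character-level class model can be sketched as follows. This is only an illustration under stated assumptions, not the scoring or normalisation functions evaluated in the paper: it uses a depth-limited suffix trie rather than a compact suffix tree, and the names `SuffixTrie`, `match_length`, `score`, and `classify`, the `max_depth` parameter, and the "mean match length" score are all hypothetical choices made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int = 0                      # how many inserted suffixes pass through this node
    children: dict = field(default_factory=dict)


class SuffixTrie:
    """Depth-limited character-level suffix trie for one class (hypothetical helper)."""

    def __init__(self, max_depth=8):
        self.max_depth = max_depth      # assumed cap on substring length
        self.root = Node()

    def add(self, text):
        # Insert every suffix of `text`, truncated to max_depth characters.
        for start in range(len(text)):
            node = self.root
            for ch in text[start:start + self.max_depth]:
                node = node.children.setdefault(ch, Node())
                node.count += 1

    def match_length(self, text, start):
        # Length of the longest prefix of text[start:] that is present in the trie.
        node = self.root
        length = 0
        for ch in text[start:start + self.max_depth]:
            node = node.children.get(ch)
            if node is None:
                break
            length += 1
        return length

    def score(self, text):
        # Assumed score: mean match length over all start positions.
        # Dividing by the length is a simple normalisation so that
        # long and short emails are comparable.
        if not text:
            return 0.0
        total = sum(self.match_length(text, i) for i in range(len(text)))
        return total / len(text)


def classify(text, spam_trie, ham_trie):
    # Assign the class whose trie matches the text more closely.
    return "spam" if spam_trie.score(text) > ham_trie.score(text) else "ham"


# Toy training data, invented for illustration only.
spam = SuffixTrie()
ham = SuffixTrie()
for msg in ["buy cheap pills now", "cheap pills online"]:
    spam.add(msg)
for msg in ["meeting moved to monday", "see you at the meeting"]:
    ham.add(msg)

label_a = classify("cheap pills", spam, ham)
label_b = classify("moved to monday", spam, ham)
```

Because matching is done on characters rather than tokens, no tokenisation or stemming step is needed, and partial overlaps such as shared word fragments still contribute to the score. The per-node `count` is stored but unused here; a frequency-weighted score would be one natural place to use it.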