The contribution of stylistic information to content-based mobile spam filtering
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
The brevity of mobile phone messages makes it difficult to find lexical patterns that identify spam. This paper proposes a novel approach to spam classification of extremely short messages that uses not only lexical features, which reflect the content of a message, but also new stylistic features, which indicate the manner in which the message is written. Experiments on two mobile phone message collections in two different languages show that the approach significantly outperforms previous content-based approaches, regardless of language.
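The idea of pairing lexical features with stylistic ones can be sketched in a few lines of Python. This is a minimal illustration only: the abstract does not list the stylistic features actually used, so the ones below (message length and the ratios of uppercase, digit, and punctuation characters) are assumptions chosen for the example, and the function names and feature keys are hypothetical.

```python
import re
import string
from collections import Counter


def lexical_features(message):
    """Bag-of-words counts over lowercased tokens (content of the message)."""
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    return Counter("word=" + token for token in tokens)


def stylistic_features(message):
    """Illustrative style cues (manner of writing).

    Assumption: these four cues are stand-ins picked for the example,
    not the feature set used in the paper.
    """
    length = len(message)
    if length == 0:
        return {
            "style:length": 0,
            "style:upper_ratio": 0.0,
            "style:digit_ratio": 0.0,
            "style:punct_ratio": 0.0,
        }
    return {
        "style:length": length,
        "style:upper_ratio": sum(c.isupper() for c in message) / length,
        "style:digit_ratio": sum(c.isdigit() for c in message) / length,
        "style:punct_ratio": sum(c in string.punctuation for c in message) / length,
    }


def extract_features(message):
    """Merge both feature families into one sparse feature dictionary."""
    features = dict(lexical_features(message))
    features.update(stylistic_features(message))
    return features


if __name__ == "__main__":
    for name, value in sorted(extract_features("WIN 100 now!").items()):
        print(name, value)
```

A dictionary of this shape can be passed to any classifier that accepts sparse named features. Keeping the two families under separate key prefixes makes it easy to train with and without the stylistic features and compare the results.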