Spam has become a major issue in computer security because it is a channel for threats such as computer viruses, worms, and phishing. More than 86% of received e-mails are spam. Historical approaches to combating these messages, including simple techniques such as sender blacklisting or the use of e-mail signatures, are no longer completely reliable. Many current solutions feature machine-learning algorithms trained using statistical representations of the terms that most commonly appear in such e-mails. However, these methods are merely syntactic and are unable to account for the underlying semantics of terms within messages. In this paper, we explore the use of semantics in spam filtering by introducing a pre-processing step of Word Sense Disambiguation (WSD). Based upon this disambiguated representation, we apply several well-known machine-learning models and show that the proposed method can detect the internal semantics of spam messages.
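The two-stage idea described above (disambiguate word senses first, then train a classifier on the sense-tagged tokens) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the WSD algorithm or the classifiers, so this example assumes a simplified Lesk-style gloss overlap and a multinomial Naive Bayes model. The sense inventory `SENSES`, the sense labels such as `bank#finance`, and the four training messages are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy sense inventory (assumption): word -> {sense label: gloss words}.
SENSES = {
    "bank": {
        "bank#finance": {"money", "account", "loan", "transfer", "deposit"},
        "bank#river": {"river", "water", "shore", "fishing"},
    },
}


def disambiguate(tokens):
    """Simplified Lesk-style WSD: replace each ambiguous token with the sense
    whose gloss shares the most words with the rest of the message."""
    result = []
    for i, token in enumerate(tokens):
        senses = SENSES.get(token)
        if senses is None:
            result.append(token)  # token has no entry in the toy inventory
            continue
        context = set(tokens[:i] + tokens[i + 1:])
        # sorted() makes tie-breaking deterministic (first label alphabetically).
        best = max(sorted(senses), key=lambda s: len(senses[s] & context))
        result.append(best)
    return result


def train_nb(docs, labels):
    """Collect per-class token counts for a multinomial Naive Bayes model."""
    counts, priors, vocab = {}, Counter(labels), set()
    for tokens, label in zip(docs, labels):
        counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return counts, priors, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    counts, priors, vocab = model
    total_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label in sorted(counts):
        total_tokens = sum(counts[label].values())
        score = math.log(priors[label] / total_docs)
        for token in tokens:
            score += math.log((counts[label][token] + 1) / (total_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages (assumption), disambiguated before training.
TRAIN = [
    ("transfer money to your bank account now", "spam"),
    ("urgent loan offer from our bank", "spam"),
    ("we went fishing by the river bank", "ham"),
    ("the water near the bank was cold", "ham"),
]
docs = [disambiguate(text.split()) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
model = train_nb(docs, labels)

if __name__ == "__main__":
    message = disambiguate("deposit money in the bank".split())
    print(message, predict_nb(model, message))
```

In this sketch the surface form "bank" becomes two distinct features, so the classifier can learn that the financial sense is associated with spam while the river sense is not. A purely term-based representation would merge both uses into one feature.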