Spam has become a major issue in computer security because it is a channel for threats such as computer viruses, worms and phishing. More than 85% of received e-mails are spam. Historical approaches to combating these messages, including simple techniques such as sender blacklisting or the use of e-mail signatures, are no longer completely reliable. Currently, many solutions feature machine-learning algorithms trained on statistical representations of the terms that usually appear in e-mails. Still, these methods are merely syntactic and cannot account for the underlying semantics of the terms within the messages. In this paper, we explore the use of semantics in spam filtering by representing e-mails with a recently introduced Information Retrieval model: the enhanced Topic-based Vector Space Model (eTVSM). This model is capable of representing linguistic phenomena using a semantic ontology. Based upon this representation, we apply several well-known machine-learning models and show that the proposed method can detect the internal semantics of spam messages.