Past research has explored the effectiveness of a Naïve Bayesian classifier for filtering unsolicited bulk email (spam). Results have shown that the precision of this approach is generally higher than its recall. This study evaluates the effectiveness of a classifier incorporating Latent Semantic Indexing (LSI) for filtering spam email on a corpus used in previous studies. Results show that email classifiers using LSI achieve very high recall and precision, regardless of whether the corpus is processed with a stop list or a lemmatizer. While LSI yields precision roughly equal to that of a Naïve Bayesian approach, it has substantially higher recall and is more effective under certain conditions. These results indicate that incorporating LSI into an anti-spam filter is viable, particularly in implementations where misclassified legitimate messages are not arbitrarily deleted. Further inferences are drawn about the applicability of this method to other text mining tasks.
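Since the comparison above rests on the difference between precision and recall, a minimal sketch of how the two measures are computed for a spam filter may help. The function name `precision_recall` and the eight labels are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def precision_recall(actual, predicted, positive="spam"):
    """Return (precision, recall) for the `positive` class."""
    pairs = list(zip(actual, predicted))
    # True positives: spam correctly flagged as spam.
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # False positives: legitimate mail wrongly flagged as spam.
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    # False negatives: spam that slipped through as legitimate.
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for eight messages (illustrative only, not from the study).
actual = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]
predicted = ["spam", "spam", "ham", "ham", "ham", "ham", "ham", "spam"]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the filter catches two of the four spam messages and wrongly flags one legitimate message. A filter with high precision but low recall rarely flags legitimate mail yet lets much spam through, which is the pattern reported above for the Naïve Bayesian approach.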