On-line learning and stochastic approximations
On-line learning in neural networks
Cost-Sensitive Learning by Cost-Proportionate Example Weighting
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Fast webpage classification using URL features
Proceedings of the 14th ACM international conference on Information and knowledge management
Cantina: a content-based approach to detecting phishing web sites
Proceedings of the 16th international conference on World Wide Web
Learning to detect phishing emails
Proceedings of the 16th international conference on World Wide Web
Online Passive-Aggressive Algorithms
The Journal of Machine Learning Research
A comparison of machine learning techniques for phishing detection
Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit
A framework for detection and measurement of phishing attacks
Proceedings of the 2007 ACM workshop on Recurring malcode
The Forgetron: A Kernel-Based Perceptron on a Budget
SIAM Journal on Computing
SpyProxy: execution-based detection of malicious web content
SS'07 Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium
Behind phishing: an examination of phisher modi operandi
LEET'08 Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats
Confidence-weighted linear classification
Proceedings of the 25th international conference on Machine learning
The projectron: a bounded kernel-based Perceptron
Proceedings of the 25th international conference on Machine learning
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
SS'08 Proceedings of the 17th conference on Security symposium
Identifying suspicious URLs: an application of large-scale online learning
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Beyond blacklists: learning to detect malicious web sites from suspicious URLs
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Cost-sensitive online active learning with application to malicious URL detection
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Anatomy of drive-by download attack
AISC '13 Proceedings of the Eleventh Australasian Information Security Conference - Volume 138
Hi-index | 0.00 |
Malicious Web sites are a cornerstone of Internet criminal activities. The dangers of these sites have created a demand for safeguards that protect end-users from visiting them. This article explores how to detect malicious Web sites from the lexical and host-based features of their URLs. We show that this problem lends itself naturally to modern algorithms for online learning. Online algorithms not only process large numbers of URLs more efficiently than batch algorithms, they also adapt more quickly to new features in the continuously evolving distribution of malicious URLs. We develop a real-time system for gathering URL features and pair it with a real-time feed of labeled URLs from a large Web mail provider. From these features and labels, we are able to train an online classifier that detects malicious Web sites with 99% accuracy over a balanced dataset.