Malicious URLs are widely used to mount cyber attacks, including spamming, phishing, and malware. Detecting malicious URLs and identifying their threat types are critical to thwarting these attacks. Knowing the type of a threat makes it possible to estimate the severity of the attack and to adopt an effective countermeasure. Existing methods typically detect malicious URLs of a single attack type. In this paper, we propose a machine learning method that detects malicious URLs of all the popular attack types and identifies the nature of the attack each malicious URL attempts to launch. Our method uses a variety of discriminative features, including textual properties, link structures, webpage contents, DNS information, and network traffic. Many of these features are novel and highly effective. Our experimental studies with 40,000 benign URLs and 32,000 malicious URLs obtained from real-life Internet sources show that our method delivers superior performance: accuracy was over 98% in detecting malicious URLs and over 93% in identifying attack types. We also report our studies on the effectiveness of each group of discriminative features and discuss their evadability.
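To make the notion of "textual properties" concrete, the following is a minimal sketch of how a few lexical features might be computed from a URL string. The feature names and the function `lexical_features` are illustrative assumptions, not the feature set used in the paper; the other feature groups (link structure, page content, DNS, network traffic) would require external lookups and are not shown.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative textual features of a URL.

    The feature set here is hypothetical and chosen only for illustration;
    it is not the set of features described in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),                    # total characters in the URL
        "host_length": len(host),                  # characters in the host name
        "num_dots_in_host": host.count("."),       # rough proxy for subdomain depth
        "num_digits": sum(ch.isdigit() for ch in url),
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "has_at_symbol": "@" in url,               # "@" can obscure the real host
    }


features = lexical_features("http://example.com/a/b/login.php?id=123")
```

A feature dictionary of this kind could be converted into a numeric vector and passed to any standard classifier; identifying the attack type, where one URL may belong to several categories, would additionally call for a multi-label learner.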