Detecting malicious web links and identifying their attack types

Authors:
Hyunsang Choi;Bin B. Zhu;Heejo Lee
Affiliations:
Korea University, Seoul, Korea;Microsoft Research Asia, Beijing, China;Korea University, Seoul, Korea
Venue:
WebApps'11 Proceedings of the 2nd USENIX conference on Web application development
Year:
2011

Abstract

Malicious URLs are widely used to mount cyber attacks such as spamming, phishing, and malware. Detecting malicious URLs and identifying their threat types are critical to thwarting these attacks: knowing the type of a threat makes it possible to estimate the severity of the attack and to adopt an effective countermeasure. Existing methods typically detect malicious URLs of a single attack type. In this paper, we propose a machine learning method that detects malicious URLs of all the popular attack types and identifies the nature of the attack a malicious URL attempts to launch. Our method uses a variety of discriminative features, including textual properties, link structures, webpage contents, DNS information, and network traffic. Many of these features are novel and highly effective. Our experimental studies with 40,000 benign URLs and 32,000 malicious URLs obtained from real-life Internet sources show that our method delivers superior performance: the accuracy was over 98% in detecting malicious URLs and over 93% in identifying attack types. We also report our studies on the effectiveness of each group of discriminative features and discuss their evadability.
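
The abstract names textual properties of URLs as one of the feature groups. As a rough illustration only, the Python sketch below computes a few simple lexical features from a URL string. The function name `lexical_features`, the specific features, and the sample URL are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' feature set.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative textual features of a URL.

    Hypothetical feature set for illustration; not the paper's features.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Overall length of the URL string
        "url_length": len(url),
        # Length of the host name part
        "host_length": len(host),
        # Number of dots in the host (a crude proxy for subdomain depth)
        "num_dots_in_host": host.count("."),
        # Count of digit characters anywhere in the URL
        "num_digits": sum(ch.isdigit() for ch in url),
        # Number of non-empty path segments
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        # Whether the URL contains an "@" character
        "has_at_symbol": "@" in url,
    }


# Example usage on a made-up URL
features = lexical_features("http://example.com/a/b/login.php?id=42")
print(features)
```

A feature dictionary like this could be turned into a numeric vector and passed to a classifier, alongside features from the other groups the abstract lists (link structure, page content, DNS, network traffic), which this sketch does not attempt to cover.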