Phishing detection with popular search engines: simple and effective

Authors:
Jun Ho Huh;Hyoungshick Kim
Affiliations:
Information Trust Institute, University of Illinois at Urbana-Champaign;Computer Laboratory, University of Cambridge, UK
Venue:
FPS'11 Proceedings of the 4th Canada-France MITACS conference on Foundations and Practice of Security
Year:
2011

Abstract

We propose a new phishing detection heuristic based on the search results returned by popular web search engines such as Google, Bing and Yahoo. The full URL of the website a user intends to access is used as the search string, and the number of results returned and the ranking of the website are used for classification. Most of the time, legitimate websites return a large number of results and are ranked first, whereas phishing websites return no results and/or are not ranked at all. To demonstrate the effectiveness of our approach, we experimented with four well-known classification algorithms (Linear Discriminant Analysis, Naïve Bayesian, K-Nearest Neighbour, and Support Vector Machine) and observed their performance. The K-Nearest Neighbour algorithm performed best, achieving a true positive rate of 98% and false positive and false negative rates of 2%. We used new legitimate websites and phishing websites as our dataset to show that our approach works well even on newly launched websites and webpages, which are often misclassified by existing blacklisting and whitelisting approaches.
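
The classification step described above can be illustrated with a minimal sketch of a K-Nearest Neighbour classifier over the two signals the abstract names: the number of search results and the ranking of the queried site. The abstract does not specify the feature encoding, the value of K, or the distance metric, so the log scaling, the reciprocal-rank encoding, K = 3, and the Euclidean distance are all assumptions made for illustration. The training values are synthetic toy data, not measurements from the paper, and the search-engine query itself is not shown.

```python
import math
from collections import Counter


def features(num_results, rank):
    """Map raw search output to a 2-D feature vector (hypothetical encoding).

    num_results: number of results returned for the full-URL query.
    rank: 1-based position of the site in the results, or None if unranked.
    """
    log_count = math.log10(num_results + 1)   # compress the huge range of counts
    rank_score = 1.0 / rank if rank else 0.0  # 1.0 = ranked first, 0.0 = unranked
    return (log_count, rank_score)


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic toy training set (illustrative values only, not from the paper).
TRAIN = [
    (features(120000, 1), "legitimate"),
    (features(8500, 1), "legitimate"),
    (features(430, 2), "legitimate"),
    (features(0, None), "phishing"),
    (features(0, None), "phishing"),
    (features(3, None), "phishing"),
]

if __name__ == "__main__":
    # A URL with many results and a first-place ranking.
    print(knn_predict(TRAIN, features(56000, 1)))
    # A URL with no results and no ranking.
    print(knn_predict(TRAIN, features(0, None)))
```

With this toy data, the first query falls among the legitimate examples and the second among the phishing examples. A real evaluation would need labelled URLs and measured search results, as in the experiments the abstract reports.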