Lexical URL analysis for discriminating phishing and legitimate websites

Authors:
Mahmoud Khonji;Youssef Iraqi;Andrew Jones
Affiliations:
Khalifa University, Sharjah, UAE;Khalifa University, Sharjah, UAE;Khalifa University, Sharjah, UAE and Edith Cowan University
Venue:
Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference
Year:
2011

Abstract

This study evaluates the practical effectiveness of classifying websites by lexically analyzing URL tokens, and introduces a novel tokenization mechanism intended to increase prediction accuracy. It analyzes over 70,000 legitimate and phishing URLs collected over a 6-month period from PhishTank, Khalifa University HTTP logs, and volunteers using an experimental HTTP proxy server. A statistical classification model is then constructed and evaluated to measure the practical effectiveness of the lexical URL analysis presented in this paper.
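
To make the idea of lexical URL tokens concrete, the following is a minimal sketch in Python of a generic delimiter-based tokenizer. It is not the novel tokenization mechanism of the paper, which the abstract does not describe. The function name `tokenize_url`, the delimiter rule (split on any non-alphanumeric character), and the example URL are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Assumed delimiter rule: any run of non-alphanumeric characters separates tokens.
# The paper's own tokenization mechanism may differ.
_DELIMS = re.compile(r"[^A-Za-z0-9]+")


def tokenize_url(url):
    """Split a URL into lowercase lexical tokens, each tagged with its URL part."""
    parts = urlsplit(url)
    fields = {
        "host": parts.hostname or "",
        "path": parts.path,
        "query": parts.query,
    }
    tokens = []
    for name, text in fields.items():
        for tok in _DELIMS.split(text):
            if tok:  # skip empty strings produced by leading or trailing delimiters
                tokens.append((name, tok.lower()))
    return tokens


if __name__ == "__main__":
    # Hypothetical URL on a reserved example domain.
    for part, token in tokenize_url("http://paypal.example.com/login/update.php?id=42"):
        print(part, token)
```

Tokens tagged in this way could then serve as features for a statistical classifier. Tagging each token with the URL part it came from keeps a brand name in the path distinguishable from the same name in the host, which is one plausible design choice rather than a requirement.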