Lexical URL analysis for discriminating phishing and legitimate websites

  • Authors:
  • Mahmoud Khonji; Youssef Iraqi; Andrew Jones

  • Affiliations:
  • Khalifa University, Sharjah, UAE; Khalifa University, Sharjah, UAE; Khalifa University, Sharjah, UAE and Edith Cowan University

  • Venue:
  • Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference
  • Year:
  • 2011

Abstract

This study evaluates the practical effectiveness of classifying websites by lexically analyzing URL tokens, and introduces a novel tokenization mechanism intended to increase prediction accuracy. The study analyzes over 70,000 legitimate and phishing URLs collected over a six-month period from PhishTank, Khalifa University HTTP logs, and volunteers using an experimental HTTP proxy server. A statistical classification model is then constructed and evaluated to measure the practical effectiveness of the lexical URL analysis presented in the paper.
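
To make the idea of lexical URL tokens concrete, here is a minimal sketch of a generic delimiter-based tokenizer. It is an illustration only and is not the novel tokenization mechanism proposed in the paper, which the abstract does not describe. The function name `tokenize_url`, the choice of delimiters, and the example URL are all assumptions made for this sketch.

```python
import re
from urllib.parse import urlsplit


def tokenize_url(url):
    """Split a URL into host tokens and path/query tokens (hypothetical helper)."""
    parts = urlsplit(url)
    # Host tokens: the dot-separated labels of the hostname.
    host_tokens = [t for t in (parts.hostname or "").split(".") if t]
    # Path and query tokens: split on any run of non-alphanumeric characters.
    # This delimiter choice is an assumption, not taken from the paper.
    rest = parts.path + ("?" + parts.query if parts.query else "")
    path_tokens = [t for t in re.split(r"[^A-Za-z0-9]+", rest) if t]
    return {"host": host_tokens, "path": path_tokens}


tokens = tokenize_url("http://login.example.com/secure/update.php?id=42")
print(tokens)
```

Tokens of this kind could then serve as features for a statistical classifier; which features and which model the authors actually use is not stated in the abstract.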