Intelligent phishing detection and protection scheme for online transactions

  • Authors:
  • P. A. Barraclough;M. A. Hossain;M. A. Tahir;G. Sexton;N. Aslam

  • Affiliations:
  • Computational Intelligence Group, University of Northumbria at Newcastle, Newcastle Upon Tyne NE1, United Kingdom;Computational Intelligence Group, University of Northumbria at Newcastle, Newcastle Upon Tyne NE1, United Kingdom;College of Computing and Information Sciences, Al-Imam Mohammad Ibn Saud Islamic University, Riyadh, 11432, Saudi Arabia;Computational Intelligence Group, University of Northumbria at Newcastle, Newcastle Upon Tyne NE1, United Kingdom;Computational Intelligence Group, University of Northumbria at Newcastle, Newcastle Upon Tyne NE1, United Kingdom

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2013

Abstract

Phishing is a social engineering technique used to deceive users into giving up their sensitive information through an illegitimate website that looks and feels exactly like the target organization's website. Most phishing detection approaches utilize Uniform Resource Locator (URL) blacklists or phishing website features combined with machine learning techniques to combat phishing. Existing approaches that utilize URL blacklists cannot generalize well to new phishing attacks due to human weakness in verifying blacklists, while existing feature-based methods suffer from high false positive rates and insufficient phishing features. As a result, online transactions remain inadequately protected. To solve this problem robustly, the proposed study introduces new inputs (Legitimate site rules, User-behavior profile, PhishTank, User-specific sites, Pop-Ups from emails) which were not previously considered together in a single protection platform. The idea is to utilize a Neuro-Fuzzy Scheme with these 5 inputs to detect phishing sites with high accuracy in real time. In this study, 2-fold cross-validation is applied for training and testing the proposed model. A total of 288 features across the 5 inputs were used, and the model has so far achieved the best performance compared with all previously reported results in the field.
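
The abstract gives no implementation details, so the following is only a minimal sketch of the 2-fold cross-validation step in general, not the authors' procedure. The function names, the `train_fn` and `predict_fn` hooks, and the seed parameter are all hypothetical; the Neuro-Fuzzy model itself is not reproduced here and would be supplied through those hooks.

```python
import random


def two_fold_splits(n_samples, seed=0):
    """Return the two (train_indices, test_indices) pairs of 2-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    half = n_samples // 2
    fold_a, fold_b = indices[:half], indices[half:]
    # Each fold serves once as the test set while the other is used for training.
    return [(fold_b, fold_a), (fold_a, fold_b)]


def cross_validate(features, labels, train_fn, predict_fn, seed=0):
    """Average accuracy over both folds.

    train_fn(X, y) -> model and predict_fn(model, x) -> label are
    hypothetical hooks standing in for whatever classifier is evaluated.
    """
    accuracies = []
    for train_idx, test_idx in two_fold_splits(len(features), seed):
        model = train_fn(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(predict_fn(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

In this sketch each sample is tested exactly once and trained on exactly once, which is the defining property of 2-fold cross-validation; the reported score is the mean of the two fold accuracies.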