A comparison of fraud cues and classification methods for fake escrow website detection

Authors:
Ahmed Abbasi;Hsinchun Chen
Affiliations:
Sheldon B. Lubar School of Business, University of Wisconsin-Milwaukee, Milwaukee, USA 53201;Artificial Intelligence Lab, Department of Management Information Systems, Eller College of Management, University of Arizona, Tucson, USA 85721
Venue:
Information Technology and Management
Year:
2009

Abstract

Automatically detecting fraudulent escrow websites is important for alleviating online auction fraud. Despite research on related topics, such as web spam and spoof site detection, fake escrow website categorization has received little attention. The authentic appearance of fake escrow websites makes it difficult for Internet users to differentiate legitimate sites from phonies, which makes systems for detecting such websites an important endeavor. In this study, we evaluated the effectiveness of various features and techniques for detecting fake escrow websites. Our analysis included a rich set of fraud cues extracted from web page text, image, and link information. We also compared several machine learning algorithms, including support vector machines, neural networks, decision trees, naïve Bayes, and principal component analysis. Experiments were conducted to assess the proposed fraud cues and techniques on a test bed encompassing nearly 90,000 web pages derived from 410 legitimate and fake escrow websites. The combination of an extended feature set and a support vector machines ensemble classifier enabled accuracies of over 90% and 96% for page-level and site-level classification, respectively, when differentiating fake pages from real ones. Deeper analysis revealed that an extended set of fraud cues is necessary due to the broad spectrum of tactics employed by fraudsters. The study confirms the feasibility of using automated methods for detecting fake escrow websites. The results may also be useful for informing existing online escrow fraud resources and communities of practice about the plethora of fraud cues pervasive in fake websites.
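
The abstract reports accuracy at both the page level and the site level but does not say how page-level decisions are combined into a site-level decision. The following is a minimal sketch of one plausible scheme, a majority vote over a site's pages, and is not necessarily the method used in the study. The function names, the tie-breaking rule, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def site_level_labels(page_predictions):
    """Aggregate page-level labels into one label per site by majority vote.

    page_predictions: iterable of (site_id, predicted_label) pairs, where
    predicted_label is "fake" or "real". Ties resolve to "fake"; this is an
    assumption of the sketch, not a rule taken from the study.
    """
    votes = defaultdict(Counter)
    for site_id, label in page_predictions:
        votes[site_id][label] += 1
    return {
        site_id: "fake" if counts["fake"] >= counts["real"] else "real"
        for site_id, counts in votes.items()
    }


def accuracy(predicted, actual):
    """Fraction of keys in `actual` whose predicted label matches."""
    correct = sum(1 for key, label in actual.items() if predicted.get(key) == label)
    return correct / len(actual)


# Toy data, made up for illustration only (not from the study's test bed).
page_predictions = [
    ("site_a", "fake"), ("site_a", "fake"), ("site_a", "real"),
    ("site_b", "real"), ("site_b", "real"),
    ("site_c", "fake"), ("site_c", "real"),
]
site_truth = {"site_a": "fake", "site_b": "real", "site_c": "fake"}

site_predictions = site_level_labels(page_predictions)
print(site_predictions)
print(accuracy(site_predictions, site_truth))
```

Under a scheme like this, a site can be labeled correctly even when some of its pages are misclassified, which is one way site-level accuracy could exceed page-level accuracy.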