Can we identify manipulative behavior and the corresponding suspects on review websites using supervised learning?

Authors:
Huiying Duan; Cäcilia Zirn
Affiliations:
Heidelberg Institute for Theoretical Studies gGmbH, Heidelberg, Germany; KR & KM Research Group, University of Mannheim, Mannheim, Germany
Venue:
NordSec'12 Proceedings of the 17th Nordic conference on Secure IT Systems
Year:
2012

Abstract

Identifying manipulative behavior and the corresponding suspects is essential for maintaining the robustness of reputation systems integrated into review websites. This task, however, is highly challenging. In this paper, we present an approach based on supervised learning to automatically detect suspicious behavior on travel websites. We distinguish between two types of manipulation and treat them as separate tasks: promoting manipulation, which aims to boost the reputation of a hotel, and demoting manipulation, which aims to damage the reputation of competitors. Both tasks consist of three separate levels: detecting suspicious reviews (review level), suspicious reviewers (reviewer level), and suspicious objects of the reviews, i.e., hotels (object level). A separate classifier is trained for each level on various sets of textual and non-textual features. We apply state-of-the-art machine learning algorithms such as Support Vector Machines. We evaluate the approach on a new dataset that we built from reviews on the platform TripAdvisor and that was carefully annotated by human judges. The results show that manipulating reviewers and objects of manipulation can be identified with over 90% accuracy. Identifying suspicious reviews, however, appears to be much harder: our classifier achieves an accuracy of 68% for promoting manipulation and 84% for demoting manipulation. We argue that more effective features are needed for classification at the review level. Finally, we analyze and discuss statistical characteristics of manipulative behavior based on the predictions of the reviewer-level and object-level classifiers.
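
The setup described in the abstract (two tasks, three levels, one classifier per level, textual plus non-textual features) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the vocabulary, the toy reviews, the scaled star rating used as the non-textual feature, and all hyperparameters are invented for the example, and a simple subgradient-descent trainer on the hinge loss stands in for whatever SVM solver the paper actually used.

```python
from collections import Counter
from itertools import product

TASKS = ("promoting", "demoting")
LEVELS = ("review", "reviewer", "object")
# One classifier slot per (task, level) pair: 2 tasks x 3 levels = 6.
classifier_slots = {pair: None for pair in product(TASKS, LEVELS)}

VOCAB = ["best", "amazing", "clean", "noisy"]  # hypothetical vocabulary


def featurize(text, rating):
    """Bag-of-words counts (textual) plus a scaled star rating (non-textual)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB] + [rating / 5.0]


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM trained by subgradient descent on the L2-regularised
    hinge loss. Labels are +1 (suspicious) or -1 (not suspicious)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(len(w)):
                # Regulariser always contributes; hinge term only if margin < 1.
                grad = lam * w[j] - (yi * xi[j] if margin < 1 else 0.0)
                w[j] -= lr * grad
            if margin < 1:
                b += lr * yi
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(model, X, y):
    return sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Invented toy data for the review-level "promoting" task: (text, stars, label).
toy_reviews = [
    ("best hotel ever amazing amazing", 5, 1),
    ("amazing best best stay", 5, 1),
    ("clean room but noisy street", 3, -1),
    ("clean and quiet", 4, -1),
]
X = [featurize(text, stars) for text, stars, _ in toy_reviews]
y = [label for _, _, label in toy_reviews]

model = train_linear_svm(X, y)
classifier_slots[("promoting", "review")] = model
train_accuracy = accuracy(model, X, y)
print(f"training accuracy on toy data: {train_accuracy:.2f}")
```

The training accuracy printed here is measured on four made-up examples and says nothing about the figures reported in the abstract; the sketch only shows how a feature vector mixing both feature types feeds one of the six classifiers.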