Finding deceptive opinion spam by any stretch of the imagination

Authors:
Myle Ott; Yejin Choi; Claire Cardie; Jeffrey T. Hancock
Affiliations:
Cornell University, Ithaca, NY; Cornell University, Ithaca, NY; Cornell University, Ithaca, NY; Cornell University, Ithaca, NY
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Year:
2011

Abstract

Consumers increasingly rate, review and research products online (Jansen, 2010; Litvin et al., 2008). Consequently, websites containing consumer reviews are becoming targets of opinion spam. While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we study deceptive opinion spam: fictitious opinions that have been deliberately written to sound authentic. Integrating work from psychology and computational linguistics, we develop and compare three approaches to detecting deceptive opinion spam, and ultimately develop a classifier that is nearly 90% accurate on our gold-standard opinion spam dataset. Based on feature analysis of our learned models, we additionally make several theoretical contributions, including revealing a relationship between deceptive opinions and imaginative writing.