Spotting opinion spammers using behavioral footprints

Authors:
Arjun Mukherjee;Abhinav Kumar;Bing Liu;Junhui Wang;Meichun Hsu;Malu Castellanos;Riddhiman Ghosh
Affiliations:
University of Illinois at Chicago, Chicago, IL, USA;University of Illinois at Chicago, Chicago, IL, USA;University of Illinois at Chicago, Chicago, IL, USA;University of Illinois at Chicago, Chicago, IL, USA;HP Labs, Palo Alto, CA, USA;HP Labs, Palo Alto, CA, USA;HP Labs, Palo Alto, CA, USA
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013

Abstract

Opinionated social media such as product reviews are now widely used by individuals and organizations for decision making. However, motivated by profit or fame, some people try to game the system through opinion spamming (e.g., writing fake reviews) to promote or demote target products. In recent years, fake review detection has attracted significant attention from both the business and research communities. However, because the human labeling needed for supervised learning and evaluation is difficult, the problem remains highly challenging. This work proposes a novel angle on the problem by modeling spamicity as latent. An unsupervised model, called the Author Spamicity Model (ASM), is proposed. It works in the Bayesian setting, which facilitates modeling the spamicity of authors as latent and allows us to exploit various observed behavioral footprints of reviewers. The intuition is that opinion spammers have different behavioral distributions than non-spammers. This creates a distributional divergence between the latent population distributions of two clusters: spammers and non-spammers. Model inference results in learning the population distributions of the two clusters. Several extensions of ASM that leverage different priors are also considered. Experiments on a real-life Amazon review dataset demonstrate the effectiveness of the proposed models, which significantly outperform state-of-the-art competitors.
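
The two-cluster intuition in the abstract can be illustrated with a much simpler stand-in than ASM itself. The sketch below is not the paper's model: it assumes each reviewer is summarized by a few hypothetical binary behavioral flags, and it fits a two-component Bernoulli mixture with plain EM rather than Bayesian inference. The function name, the flags, the synthetic data, and the 0.9/0.1 flag rates are all assumptions made for illustration.

```python
import math
import random


def fit_two_cluster_mixture(X, n_iter=50):
    """EM for a two-component Bernoulli mixture over binary behavioral flags.

    X holds one row per reviewer; each row is a list of 0/1 flags.
    Returns (posterior, theta, pi), where posterior[i] is the estimated
    probability that reviewer i belongs to cluster 1.
    """
    n, d = len(X), len(X[0])
    pi = 0.5  # mixing weight of cluster 1 (treated here as the "spammer" cluster)
    # Asymmetric start so that cluster 1 becomes the cluster with higher flag rates.
    theta = [[0.4] * d, [0.6] * d]
    posterior = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior cluster membership for each reviewer, in log space.
        for i, row in enumerate(X):
            log_p = [math.log(1.0 - pi), math.log(pi)]
            for k in (0, 1):
                for j, x in enumerate(row):
                    p = theta[k][j]
                    log_p[k] += math.log(p if x else 1.0 - p)
            m = max(log_p)
            w0 = math.exp(log_p[0] - m)
            w1 = math.exp(log_p[1] - m)
            posterior[i] = w1 / (w0 + w1)
        # M-step: re-estimate parameters, with add-one smoothing to keep
        # every probability strictly inside (0, 1).
        n1 = sum(posterior)
        n0 = n - n1
        pi = (n1 + 1.0) / (n + 2.0)
        for j in range(d):
            s1 = sum(r * row[j] for r, row in zip(posterior, X))
            s0 = sum((1.0 - r) * row[j] for r, row in zip(posterior, X))
            theta[1][j] = (s1 + 1.0) / (n1 + 2.0)
            theta[0][j] = (s0 + 1.0) / (n0 + 2.0)
    return posterior, theta, pi


# Synthetic data (an assumption, not the Amazon dataset): 40 reviewers whose
# flags fire often, followed by 160 reviewers whose flags fire rarely.
rng = random.Random(0)


def sample_flags(rate, d=8):
    return [1 if rng.random() < rate else 0 for _ in range(d)]


X = [sample_flags(0.9) for _ in range(40)] + [sample_flags(0.1) for _ in range(160)]
posterior, theta, pi = fit_two_cluster_mixture(X)

# Rank reviewers by estimated cluster-1 membership, highest first.
ranking = sorted(range(len(X)), key=lambda i: posterior[i], reverse=True)
```

No labels are used at any point; the separation comes only from the difference between the two clusters' flag distributions, which is the property the abstract attributes to spammers and non-spammers. The asymmetric initialization is a convenience that fixes which cluster is called cluster 1 and would need more care on real data.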