Opinion spam and analysis

  • Authors:
  • Nitin Jindal;Bing Liu

  • Affiliations:
  • University of Illinois at Chicago, Chicago, IL;University of Illinois at Chicago, Chicago, IL

  • Venue:
  • WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Visualization

Abstract

Evaluative texts on the Web have become a valuable source of opinions on products, services, events, individuals, etc. Recently, many researchers have studied such opinion sources as product reviews, forum posts, and blogs. However, existing research has been focused on classification and summarization of opinions using natural language processing and data mining techniques. An important issue that has been neglected so far is opinion spam or trustworthiness of online opinions. In this paper, we study this issue in the context of product reviews, which are opinion rich and are widely used by consumers and product manufacturers. In the past two years, several startup companies also appeared which aggregate opinions from product reviews. It is thus high time to study spam in reviews. To the best of our knowledge, there is still no published study on this topic, although Web spam and email spam have been investigated extensively. We will see that opinion spam is quite different from Web spam and email spam, and thus requires different detection techniques. Based on the analysis of 5.8 million reviews and 2.14 million reviewers from amazon.com, we show that opinion spam in reviews is widespread. This paper analyzes such spam activities and presents some novel techniques to detect them