Overcoming the J-shaped distribution of product reviews

  • Authors: Nan Hu; Jie Zhang; Paul A. Pavlou
  • Affiliations: Singapore Management University, Singapore; University of Texas at Arlington, Arlington, TX; Fox School of Business, Temple University, PA
  • Venue: Communications of the ACM - A View of Parallel Computing
  • Year: 2009

Abstract

Introduction

While product review systems that collect and disseminate opinions about products from recent buyers (Table 1) are valuable forms of word-of-mouth communication, evidence suggests that the reviews they collect are overwhelmingly positive. Kadet notes that most products receive almost five stars. Chevalier and Mayzlin also show that book reviews on Amazon and Barnes & Noble are overwhelmingly positive. Is this because all products are simply outstanding?

A graphical representation of product reviews suggests otherwise: it reveals a J-shaped distribution (Figure 1) with mostly 5-star ratings, some 1-star ratings, and hardly any ratings in between. What explains this J-shaped distribution? If products are indeed outstanding, why do we also see many 1-star ratings? Why are there hardly any ratings in between? Is it because there are no "average" products? Or is it because there are biases in product review systems? If so, how can we overcome them?

The J-shaped distribution also creates some fundamental statistical problems. Conventional wisdom assumes that the average of the product ratings is a sufficient proxy of product quality and product sales. Many studies used the average of product ratings to predict sales. However, these studies showed inconsistent results: some found product reviews to influence product sales, while others did not. The average is statistically meaningful only when it is based on a unimodal distribution or on a symmetric bimodal distribution. Since product review systems have an asymmetric bimodal (J-shaped) distribution, the average is a poor proxy of product quality.

This report aims first to demonstrate the existence of a J-shaped distribution, second to identify the sources of bias that cause it, third to propose ways to overcome these biases, and finally to show that overcoming these biases helps product review systems better predict future product sales.
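
To make the statistical point concrete, here is a minimal sketch in Python. The rating counts are hypothetical and chosen only for illustration; they are not data from the study. The helper names `average_rating` and `share_at` are likewise made up for this example. Two products share the same average of 4.0 stars, yet in the J-shaped case no reviewer actually gave 4 stars, so the average describes an opinion nobody holds.

```python
# Hypothetical rating counts per star level (1..5); illustrative only,
# not taken from the Amazon data discussed in the article.
j_shaped = {1: 25, 2: 0, 3: 0, 4: 0, 5: 75}   # asymmetric bimodal (J-shaped)
unimodal = {1: 0, 2: 0, 3: 10, 4: 80, 5: 10}  # single peak at 4 stars


def average_rating(counts):
    """Return the mean star rating for a {star: count} histogram."""
    total = sum(counts.values())
    return sum(star * n for star, n in counts.items()) / total


def share_at(counts, star):
    """Return the fraction of all ratings that equal `star`."""
    return counts[star] / sum(counts.values())


for name, counts in (("J-shaped", j_shaped), ("unimodal", unimodal)):
    print(name, average_rating(counts), share_at(counts, 4))
```

Both histograms average 4.0 stars, but the share of 4-star ratings is 0% in the first and 80% in the second, which is why the average alone cannot separate the two cases.
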
We tested the distribution of product ratings for three product categories (books, DVDs, videos) with data collected from Amazon between February and July 2005: 78%, 73%, and 72% of the product ratings for books, DVDs, and videos, respectively, are greater than or equal to four stars (Figure 1), confirming our proposition that product reviews are overwhelmingly positive. Figure 1 (left graph) shows a J-shaped distribution across all products. This contradicts the law of "large numbers" that would imply a normal distribution. Figure 1 (middle graph) shows the distribution of three randomly selected products in each category with over 2,000 reviews. These reviews still have a J-shaped distribution, implying that the J-shaped distribution is not due to a "small number" problem. Figure 1 (right graph) shows that even products with a middling average rating (around 3 stars) follow the same pattern.
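
The two checks described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' procedure: the counts are hypothetical, and the rule used in `looks_j_shaped` (5 stars is the most common rating and 1-star ratings outnumber 2-star ratings) is only one simple way to operationalize "J-shaped".

```python
# Hypothetical {star: count} histograms; not the study's data.
product_a = {1: 120, 2: 50, 3: 80, 4: 200, 5: 550}
product_b = {1: 20, 2: 130, 3: 400, 4: 300, 5: 150}


def share_at_least(counts, threshold):
    """Return the fraction of ratings at or above `threshold` stars."""
    total = sum(counts.values())
    return sum(n for star, n in counts.items() if star >= threshold) / total


def looks_j_shaped(counts):
    """Assumed rule of thumb: the mode is 5 stars and there is a
    secondary bump at 1 star (more 1-star than 2-star ratings)."""
    mode = max(counts, key=counts.get)
    return mode == 5 and counts[1] > counts[2]


for name, counts in (("A", product_a), ("B", product_b)):
    print(name, share_at_least(counts, 4), looks_j_shaped(counts))
```

With these made-up counts, product A has 75% of its ratings at four stars or above and passes the J-shape check, while product B, whose ratings peak at 3 stars, does not.
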