Learning Attitudes and Attributes from Multi-aspect Reviews

Authors:
Julian McAuley;Jure Leskovec;Dan Jurafsky
Venue:
ICDM '12 Proceedings of the 2012 IEEE 12th International Conference on Data Mining
Year:
2012

Citing 0
Cited 4

No country for old members: user lifecycle and linguistic change in online communities
Proceedings of the 22nd international conference on World Wide Web

From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews
Proceedings of the 22nd international conference on World Wide Web

Hidden factors and hidden topics: understanding rating dimensions with review text
Proceedings of the 7th ACM conference on Recommender systems

CoBaFi: collaborative bayesian filtering
Proceedings of the 23rd international conference on World wide web

Abstract

Most online reviews consist of plain-text feedback together with a single numeric score. However, understanding the multiple 'aspects' that contribute to users' ratings may help us to better understand their individual preferences. For example, a user's impression of an audio book presumably depends on aspects such as the story and the narrator, and knowing their opinions on these aspects may help us to recommend better products. In this paper, we build models for rating systems in which such dimensions are explicit, in the sense that users leave separate ratings for each aspect of a product. By introducing new corpora consisting of five million reviews, rated with between three and six aspects, we evaluate our models on three prediction tasks: First, we uncover which parts of a review discuss which of the rated aspects. Second, we summarize reviews by finding the sentences that best explain a user's rating. Finally, since aspect ratings are optional in many of the datasets we consider, we recover ratings that are missing from a user's evaluation. Our model matches state-of-the-art approaches on existing small-scale datasets, while scaling to the real-world datasets we introduce. Moreover, our model is able to 'disentangle' content and sentiment words: we automatically learn content words that are indicative of a particular aspect as well as the aspect-specific sentiment words that are indicative of a particular rating.
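
The first task, deciding which sentence discusses which rated aspect, can be illustrated with a toy sketch. The abstract does not give the model's form, so this is not the authors' implementation: the word weights, the two-aspect audio book vocabulary, and the names `CONTENT`, `SENTIMENT`, `score`, and `assign_aspects` are all hypothetical and hand-picked. The sketch only shows the idea of combining content words tied to an aspect with sentiment words tied to an aspect and its rating.

```python
# Hypothetical content-word weights: how strongly a word signals an aspect.
CONTENT = {
    "story": {"plot": 2.0, "story": 2.0, "characters": 1.5},
    "narrator": {"narrator": 2.0, "voice": 1.5, "reading": 1.0},
}

# Hypothetical sentiment-word weights, specific to an (aspect, rating) pair.
SENTIMENT = {
    ("story", 5): {"gripping": 1.0},
    ("story", 1): {"dull": 1.0},
    ("narrator", 5): {"soothing": 1.0},
    ("narrator", 1): {"monotone": 1.0},
}


def score(sentence, aspect, rating):
    """Sum content and sentiment weights of the words in one sentence."""
    content = CONTENT[aspect]
    sentiment = SENTIMENT.get((aspect, rating), {})
    words = sentence.lower().split()
    return sum(content.get(w, 0.0) + sentiment.get(w, 0.0) for w in words)


def assign_aspects(sentences, ratings):
    """Label each sentence with the aspect that scores highest for it.

    `ratings` maps each aspect name to the rating the user gave it.
    """
    return [
        max(ratings, key=lambda aspect: score(s, aspect, ratings[aspect]))
        for s in sentences
    ]


review = ["The plot was gripping", "The narrator had a monotone voice"]
labels = assign_aspects(review, {"story": 5, "narrator": 1})
```

Here `labels` pairs the first sentence with the story aspect and the second with the narrator aspect. In the setting the abstract describes, such weights would be learned from the rated reviews rather than written by hand.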