The bag-of-opinions method for review rating prediction from sparse text patterns

Authors:
Lizhen Qu;Georgiana Ifrim;Gerhard Weikum
Affiliations:
Max-Planck Institute for Informatics;Bioinformatics Research Centre;Max-Planck Institute for Informatics
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Year:
2010

Abstract

This paper addresses the problem of predicting the numeric rating a user gives in a product review from the text of that review. Unigram and n-gram representations of text are common choices in opinion mining. However, unigrams cannot capture important expressions like "could have been better", which are essential for rating prediction models. Word n-grams, on the other hand, capture such phrases but typically occur too sparsely in the training set and thus fail to yield robust predictors. This paper overcomes the limitations of both models by introducing a novel bag-of-opinions representation, in which an opinion within a review consists of three components: a root word, a set of modifier words from the same sentence, and one or more negation words. Each opinion is assigned a numeric score, which is learned by ridge regression from a large, domain-independent corpus of reviews. At test time, the rating of a domain-dependent review is predicted by aggregating the scores of all opinions in the review and combining the result with a domain-dependent unigram model. The paper presents a constrained ridge regression algorithm for learning the opinion scores. Experiments show that the bag-of-opinions method outperforms prior state-of-the-art techniques for review rating prediction.
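To make the representation concrete, here is a minimal Python sketch of an opinion as a (root, modifiers, negations) triple, with a toy prediction step that aggregates opinion scores and blends them with a unigram-model score. It is an illustration under stated assumptions, not the paper's algorithm. The abstract does not say how component weights combine into an opinion score, how opinion scores are aggregated, or how the unigram model is mixed in. The multiplicative modifiers, the sign flip for negation, the averaging, the `mix` parameter, and all names below are therefore hypothetical. The weights are set by hand here, whereas the paper learns opinion scores with constrained ridge regression.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Opinion:
    """One opinion: a root word, modifier words, and negation words."""
    root: str
    modifiers: FrozenSet[str] = frozenset()
    negations: FrozenSet[str] = frozenset()


def opinion_score(op: Opinion,
                  root_w: Dict[str, float],
                  modifier_w: Dict[str, float]) -> float:
    """Toy scoring rule (assumption, not from the paper):
    the root weight is scaled by each modifier weight, and an odd
    number of negation words flips the sign."""
    score = root_w.get(op.root, 0.0)          # unknown roots contribute 0
    for m in op.modifiers:
        score *= modifier_w.get(m, 1.0)       # unknown modifiers are neutral
    if len(op.negations) % 2 == 1:
        score = -score
    return score


def predict_rating(opinions: List[Opinion],
                   unigram_score: float,
                   root_w: Dict[str, float],
                   modifier_w: Dict[str, float],
                   mix: float = 0.5) -> float:
    """Average the opinion scores, then blend linearly with a score from
    a separate unigram model (both choices are assumptions)."""
    if opinions:
        total = sum(opinion_score(o, root_w, modifier_w) for o in opinions)
        bag = total / len(opinions)
    else:
        bag = 0.0
    return mix * bag + (1.0 - mix) * unigram_score


# Hand-set placeholder weights; in the paper these would be learned.
root_w = {"good": 1.0, "better": 0.5}
modifier_w = {"very": 2.0}

review = [
    Opinion("good", frozenset({"very"})),              # "very good"
    Opinion("good", negations=frozenset({"not"})),     # "not good"
]
rating = predict_rating(review, unigram_score=3.5,
                        root_w=root_w, modifier_w=modifier_w)
print(rating)
```

Grouping a root with its modifiers and negations gives "not good" and "very good" different scores while sharing the single weight for "good". This is one way to read the abstract's claim that opinions capture phrase-level meaning without the sparsity of full n-grams.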