Learning from positive and unlabeled Amazon reviews: towards identifying trustworthy reviewers

Authors:
Marios Kokkodis
Affiliations:
New York University, New York, NY, USA
Venue:
Proceedings of the 21st international conference companion on World Wide Web
Year:
2012

Abstract

On-line marketplaces have grown in importance over the last few years. In such environments, reviews constitute the main reputation mechanism for the available products, so presenting high-quality reviews is crucial to achieving a high level of customer satisfaction. In this work, we introduce a new dimension of review quality: the reviewer's "trustfulness". We assume that the information Amazon reviewers voluntarily provide about whether they actually bought the product signals the reliability of a review. Based on this information, we characterize a reviewer as trustworthy (positive instance) or of unknown "trustfulness" (unlabeled instance). We then build models that exploit reviewers' profile information and on-line behavior to rank them according to their probability of being trustworthy. Our results are promising: they provide evidence that our predictive models separate positive from unlabeled instances with very high accuracy.
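
To make the positive/unlabeled setup concrete, the following is a minimal sketch of ranking reviewers by a score learned only from "labeled positive" versus "unlabeled" flags. It is not the paper's model: the two numeric features, the synthetic data generator, the labeling rate, and the plain logistic regression are all assumptions chosen for illustration. The hidden label y is used only to check the ranking afterwards, never for training.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, s, lr=0.1, epochs=200):
    """Fit g(x) ~ P(s = 1 | x) with batch gradient descent (illustrative only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, s):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def score(model, x):
    # Estimated probability that a reviewer carries the positive flag.
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def make_toy_reviewers(n=400, label_rate=0.5, seed=0):
    """Synthetic reviewers with two hypothetical profile/behavior features.

    y is the hidden truth (trustworthy or not); s is what we observe:
    only some trustworthy reviewers are flagged, the rest are unlabeled.
    """
    rng = random.Random(seed)
    X, y, s = [], [], []
    for _ in range(n):
        trustworthy = rng.random() < 0.5
        centre = 1.0 if trustworthy else -1.0
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(int(trustworthy))
        s.append(int(trustworthy and rng.random() < label_rate))
    return X, y, s


X, y, s = make_toy_reviewers()
model = train_logistic(X, s)  # trained on observed flags s only

# Rank reviewers from most to least likely to be trustworthy.
ranking = sorted(range(len(X)), key=lambda i: score(model, X[i]), reverse=True)
top = ranking[:50]
precision_at_50 = sum(y[i] for i in top) / len(top)
print(f"share of truly trustworthy reviewers in the top 50: {precision_at_50:.2f}")
```

Because every flagged reviewer is a true positive while the unlabeled pool mixes both classes, a classifier trained on the flag tends to order reviewers similarly to one trained on the hidden label, which is why the ranking can still be informative here. Real reviewer features and a proper held-out evaluation would be needed before drawing any conclusion from such a sketch.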