Recommender systems from "words of few mouths"
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
This paper identifies a widespread phenomenon in social media content, which we call the "words of few mouths" phenomenon. It challenges the development of recommender systems based on users' online opinions by introducing additional sources of uncertainty. In the context of predicting the "helpfulness" of a review document from users' online votes on other reviews (where a user's vote on a review is either HELPFUL or UNHELPFUL), the phenomenon corresponds to the case where a large fraction of the reviews each receive votes from only a few users. Focusing on the review helpfulness prediction problem, we illustrate the challenges that "words of few mouths" poses for training a review helpfulness predictor. We advocate probabilistic approaches to recommender system development in the presence of "words of few mouths". More concretely, we propose a probabilistic metric as the training target for conventional machine-learning-based predictors. Our empirical study using Support Vector Regression (SVR) augmented with the proposed probabilistic metric demonstrates the advantages of incorporating probabilistic methods in predictor training. In addition to this "partially probabilistic" approach, we also develop a logistic-regression-based probabilistic model and a corresponding learning algorithm for review helpfulness prediction. We demonstrate experimentally that the logistic regression method outperforms SVR, the prior art in review helpfulness prediction.
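To make the uncertainty behind "words of few mouths" concrete, the following minimal sketch contrasts the raw fraction of HELPFUL votes with a smoothed estimate. The abstract does not define the proposed probabilistic metric, so this is not the paper's method: the Beta-prior posterior mean used here, the function names, and the prior parameters `alpha` and `beta` are all assumptions chosen for illustration only.

```python
def raw_helpfulness(helpful_votes: int, total_votes: int) -> float:
    """Fraction of HELPFUL votes; unreliable when total_votes is small."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return helpful_votes / total_votes


def smoothed_helpfulness(helpful_votes: int, total_votes: int,
                         alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean of the helpfulness probability under a Beta(alpha, beta)
    prior. This is an illustrative stand-in (an assumption), not the metric
    proposed in the paper."""
    if total_votes < 0 or not 0 <= helpful_votes <= total_votes:
        raise ValueError("need 0 <= helpful_votes <= total_votes")
    return (helpful_votes + alpha) / (total_votes + alpha + beta)


# A review with a single HELPFUL vote versus one with 90 of 100 HELPFUL votes.
few_raw = raw_helpfulness(1, 1)
few_smoothed = smoothed_helpfulness(1, 1)
many_raw = raw_helpfulness(90, 100)
many_smoothed = smoothed_helpfulness(90, 100)
```

With a single vote, the raw ratio ranks the first review above the second, whereas the smoothed estimate pulls it toward the prior mean and so reflects how little evidence one vote provides. A target of this kind could then be fed to a regressor such as SVR in place of the raw ratio.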