We propose a domain-dependent/independent topic switching model, based on Bayesian probabilistic modeling, for online product reviews accompanied by numerical ratings provided by users. In this model, each word is assigned to either a domain-dependent topic or a domain-independent topic, and the topic distribution of a review is connected to its observed numerical rating via a linear regression model. Domain-dependent topics exploit the domain information observed in the corpus, while domain-independent topics use the framework of Bayesian nonparametrics, which estimates the number of topics through the posterior distribution. The posterior distribution is estimated via collapsed Gibbs sampling. On real data, the proposed model achieved smaller mean square error and smaller average mean error with a smaller model size, and converged in fewer iterations, on a regression task over online review ratings, outperforming a baseline model that does not consider domains. Moreover, the proposed model indicates whether words are positive or negative in the form of continuous values, which allows us to extract both domain-dependent and domain-independent sentiment words.
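The core mechanism described above — a per-word switch between a domain-dependent and a domain-independent topic block, with review ratings modeled as a linear function of the resulting topic proportions — can be sketched as follows. This is a hypothetical toy simulation, not the authors' implementation: the topic counts, the switch probability, and the use of ordinary least squares in place of the full Bayesian regression are all assumptions made for illustration.

```python
# Toy sketch of the switching + rating-regression idea (assumed setup, not
# the paper's model): each word picks a domain-dependent or domain-
# independent topic via a Bernoulli switch, and review ratings are a noisy
# linear function of the review's empirical topic proportions.
import numpy as np

rng = np.random.default_rng(0)

n_docs, n_words = 200, 50
n_dep, n_indep = 3, 2            # topic counts are illustrative assumptions
n_topics = n_dep + n_indep
eta = rng.normal(size=n_topics)  # per-topic regression weights (ground truth)

# Generate topic assignments: a Bernoulli(0.6) switch selects the
# domain-dependent block, otherwise the domain-independent block.
props = np.zeros((n_docs, n_topics))
for d in range(n_docs):
    for _ in range(n_words):
        if rng.random() < 0.6:
            z = rng.integers(n_dep)             # domain-dependent topic
        else:
            z = n_dep + rng.integers(n_indep)   # domain-independent topic
        props[d, z] += 1
props /= n_words                                # empirical topic proportions

# Ratings are a noisy linear function of the topic proportions.
ratings = props @ eta + rng.normal(scale=0.05, size=n_docs)

# Recover the regression weights by least squares, standing in for the
# regression step that connects topics to observed ratings.
eta_hat, *_ = np.linalg.lstsq(props, ratings, rcond=None)
mse = np.mean((props @ eta_hat - ratings) ** 2)
```

The sign and magnitude of each recovered weight in `eta_hat` play the role of the continuous sentiment values mentioned in the abstract: topics (and hence their top words) with large positive weights pull ratings up, and those with negative weights pull them down.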