Constrained LDA for grouping product features in opinion mining

  • Authors:
  • Zhongwu Zhai; Bing Liu; Hua Xu; Peifa Jia

  • Affiliations:
  • State Key Lab of Intelligent Tech. & Sys., Dept. of Comp. Sci. & Tech., Tsinghua Univ.; Dept. of Comp. Sci., University of Illinois at Chicago; State Key Lab of Intelligent Tech. & Sys., Dept. of Comp. Sci. & Tech., Tsinghua Univ.; State Key Lab of Intelligent Tech. & Sys., Dept. of Comp. Sci. & Tech., Tsinghua Univ.

  • Venue:
  • PAKDD'11: Proceedings of the 15th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, Volume Part I
  • Year:
  • 2011

Abstract

In opinion mining of product reviews, one often wants to summarize opinions by product feature. However, people frequently describe the same feature with different words and phrases. To produce an effective summary, these words and phrases, which are domain synonyms, need to be grouped under the same feature. Topic modeling is a suitable method for this task. However, instead of simply letting topic modeling find groupings freely, we believe it can do better when given pre-existing knowledge in the form of automatically extracted constraints. In this paper, we first extend a popular topic modeling method, Latent Dirichlet Allocation (LDA), with the ability to process large-scale constraints. We then propose two novel methods that automatically extract two types of constraints. Finally, the resulting constrained-LDA and the extracted constraints are applied to group product features. Experiments show that constrained-LDA outperforms the original LDA and the latest mLSA by a large margin.
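
The idea of steering LDA with constraints can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the paper's method: the abstract does not name the two constraint types, so the sketch assumes pairwise must-link and cannot-link constraints (as in constrained clustering), and the function `constrained_lda`, its `constraint_factor` reweighting, the hyperparameter values, and the toy review sentences are all hypothetical. The sampler is an ordinary collapsed Gibbs sampler for LDA whose per-topic sampling weight is multiplied by a constraint-dependent factor.

```python
import random


def constrained_lda(docs, n_topics, must_links=(), cannot_links=(),
                    alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA with a hypothetical constraint factor.

    docs: list of token lists (here, feature expressions per sentence).
    Returns a dict mapping each expression to its most frequent topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)

    # Partner lookups built from the (assumed) pairwise constraints.
    must, cannot = {}, {}
    for table, pairs in ((must, must_links), (cannot, cannot_links)):
        for a, b in pairs:
            table.setdefault(a, set()).add(b)
            table.setdefault(b, set()).add(a)

    doc_topic = [[0] * n_topics for _ in docs]    # n(d, k)
    topic_word = [{} for _ in range(n_topics)]    # n(k, w)
    topic_total = [0] * n_topics                  # n(k)
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] = topic_word[k].get(w, 0) + delta
        topic_total[k] += delta

    for d, doc in enumerate(docs):
        for w, k in zip(doc, assign[d]):
            update(d, w, k, +1)

    def constraint_factor(w, k):
        # Assumption: favor topics already holding must-link partners of w
        # and penalize topics holding cannot-link partners of w.
        pull = sum(topic_word[k].get(u, 0) for u in must.get(w, ()))
        push = sum(topic_word[k].get(u, 0) for u in cannot.get(w, ()))
        return (1.0 + pull) / (1.0 + push)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)  # remove current assignment
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + n_vocab * beta)
                    * constraint_factor(w, k)
                    for k in range(n_topics)
                ]
                k_new = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k_new
                update(d, w, k_new, +1)

    return {w: max(range(n_topics), key=lambda k: topic_word[k].get(w, 0))
            for w in vocab}


# Toy data, invented for illustration: each "document" is one review
# sentence reduced to the feature expressions it mentions.
docs = [
    ["battery life", "picture"],
    ["battery power", "photo"],
    ["battery life", "photo"],
    ["battery power", "picture"],
]
groups = constrained_lda(
    docs, n_topics=2,
    must_links=[("battery life", "battery power"), ("picture", "photo")],
    cannot_links=[("battery life", "picture"), ("battery power", "photo")],
)
print(groups)
```

With the constraint lists left empty the factor is always 1 and the sampler reduces to plain LDA, which makes it easy to compare groupings with and without the extra knowledge. Because sampling is stochastic, the resulting grouping depends on the seed and the number of iterations.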