Entity-centric topic-oriented opinion summarization in twitter

  • Authors:
  • Xinfan Meng;Furu Wei;Xiaohua Liu;Ming Zhou;Sujian Li;Houfeng Wang

  • Affiliations:
  • Peking University, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;Peking University, Beijing, China;Peking University, Beijing, China

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Abstract

Microblogging services such as Twitter have become popular channels for people to express their opinions on a broad range of topics. Every minute, Twitter generates a huge volume of instant messages (i.e., tweets) carrying users' sentiments and attitudes, which both necessitates automatic opinion summarization and poses great challenges to the summarization system. In this paper, we study the problem of opinion summarization for entities, such as celebrities and brands, in Twitter. We propose an entity-centric, topic-based opinion summarization framework that aims to produce opinion summaries organized by topic and that emphasizes the insight behind the opinions. To this end, we first mine topics from #hashtags, the human-annotated semantic tags in tweets. We integrate the #hashtags as weakly supervised information into topic modeling algorithms to obtain a better interpretation and representation for calculating the similarity among them, and adopt the Affinity Propagation algorithm to group #hashtags into coherent topics. Subsequently, we use templates generalized from paraphrasing to identify tweets with deep insights, that is, tweets that reveal reasons, express demands, or reflect viewpoints. Afterwards, we develop a target-dependent (i.e., entity-dependent) sentiment classification approach to identify the opinion that a tweet expresses towards a given target. Finally, the opinion summary is generated by integrating information from the dimensions of topic, opinion, and insight, as well as other factors (e.g., topic relevancy, redundancy, and language style), in a unified optimization framework. We conduct extensive experiments on a real-life data set to evaluate the performance of the individual opinion summarization modules as well as the quality of the produced summary. The promising experimental results show the effectiveness of the proposed framework and algorithms.
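The final step, which trades topic relevance and insight against redundancy, can be illustrated with a small sketch. The abstract does not specify the optimization framework, so the code below swaps in a simple greedy selection (in the spirit of maximal marginal relevance) purely for illustration; it is not the authors' method. The `Tweet` fields, the `select_summary` function, the Jaccard redundancy measure, the weights, and the example tweets and scores are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    relevance: float  # hypothetical topic-relevance score in [0, 1]
    insight: float    # hypothetical insight score in [0, 1]


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, used here as a crude redundancy measure."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def select_summary(tweets, k=2, redundancy_weight=1.0):
    """Greedily pick up to k tweets, penalizing overlap with earlier picks."""
    selected = []
    candidates = list(tweets)
    while candidates and len(selected) < k:
        def gain(t):
            # Redundancy is the highest similarity to any tweet already chosen.
            penalty = max((jaccard(t.text, s.text) for s in selected), default=0.0)
            return t.relevance + t.insight - redundancy_weight * penalty

        best = max(candidates, key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected


# Made-up example data: the first two tweets are near-duplicates.
tweets = [
    Tweet("the new phone battery dies too fast", 0.9, 0.8),
    Tweet("new phone battery dies too fast again", 0.9, 0.7),
    Tweet("i want a cheaper phone plan", 0.6, 0.7),
]
summary = select_summary(tweets, k=2)
for t in summary:
    print(t.text)
```

With these made-up scores, the near-duplicate second tweet is skipped in favor of the third one, even though its relevance and insight scores are higher. A greedy pass like this gives no optimality guarantee; it only shows how the relevance, insight, and redundancy factors named in the abstract might interact in a single objective.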