Extracting common emotions from blogs based on fine-grained sentiment clustering

  • Authors:
  • Shi Feng; Daling Wang; Ge Yu; Wei Gao; Kam-Fai Wong

  • Affiliations:
  • Northeastern University, Institute of Computer Software and Theory, No.3-11 Wenhua Road, Heping District, Shenyang, China; Northeastern University, Institute of Computer Software and Theory, No.3-11 Wenhua Road, Heping District, Shenyang, China; Northeastern University, Institute of Computer Software and Theory, No.3-11 Wenhua Road, Heping District, Shenyang, China; The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Shatin, NT, Hong Kong; The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Shatin, NT, Hong Kong

  • Venue:
  • Knowledge and Information Systems - Special Issue: Best Papers of the Fifth International Conference on Advanced Data Mining and Applications (ADMA 2009)
  • Year:
  • 2011

Abstract

In the age of Web 2.0, blogs have emerged as a major platform for people to express their feelings and sentiments. Common emotions, which reflect people’s collective and overall sentiments, are becoming a major concern for governments, businesses and individual users. Unlike previous literature on sentiment classification and summarization, the central problem in common emotion extraction is to identify people’s collective sentiments and their corresponding distributions on the Web. Most existing blog clustering methods take into account keywords, stories or timelines but neglect the embedded sentiments, which are considered very important features of blogs. In this paper, a novel method based on Probabilistic Latent Semantic Analysis (PLSA) is presented to model the hidden sentiment factors, and an emotion-oriented clustering approach is proposed to find common emotions according to the fine-grained sentiment similarity between blogs. Extensive experiments are conducted on real-world datasets covering different topics. The results show that the approach can partition blogs into sentiment-coherent clusters and that the extracted common emotion words provide good navigation guidelines for the embedded sentiments in each cluster.
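
The abstract does not give the model's details, so the following is only a minimal sketch of the general idea, not the authors' method: standard PLSA fitted by EM on a blog-by-sentiment-word count matrix, followed by a deliberately simple grouping step that assigns each blog to its most probable hidden factor. The toy counts, the function name `plsa`, the number of factors, and the argmax grouping are all assumptions made for illustration.

```python
import numpy as np

def plsa(counts, n_factors, n_iter=50, seed=0):
    """Fit standard PLSA by EM on a (documents x words) count matrix.

    Returns P(z|d) with shape (docs, factors) and P(w|z) with shape
    (factors, words). Hypothetical helper, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation, normalised so each row is a distribution.
    p_z_d = rng.random((n_docs, n_factors))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_factors, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]      # (docs, factors, words)
        post /= post.sum(axis=1, keepdims=True) + eps

        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post              # (docs, factors, words)
        p_w_z = expected.sum(axis=0)                      # (factors, words)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)                      # (docs, factors)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z

# Toy data (assumed): 4 blogs, 4 sentiment words.
# Blogs 0-1 use only words 0-1; blogs 2-3 use only words 2-3.
counts = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 0.0, 4.0, 5.0],
])

p_z_d, p_w_z = plsa(counts, n_factors=2)

# Simplified grouping (assumption): each blog goes to its dominant factor.
labels = p_z_d.argmax(axis=1)

# The highest-probability words of each factor serve as candidate emotion words.
top_words = np.argsort(-p_w_z, axis=1)[:, :2]
```

In this sketch, each row of `p_z_d` acts as a low-dimensional sentiment profile of a blog. A fuller pipeline could cluster those rows with any similarity-based algorithm instead of taking the argmax.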