Chinese Blog Clustering by Hidden Sentiment Factors

  • Authors:
  • Shi Feng;Daling Wang;Ge Yu;Chao Yang;Nan Yang

  • Affiliations:
  • College of Information Science and Engineering, Northeastern University, Shenyang 110004;Key Laboratory of Medical Image Computing, Northeastern University, Ministry of Education, and College of Information Science and Engineering, Northeastern University, Shenyang 110004;Key Laboratory of Medical Image Computing, Northeastern University, Ministry of Education, and College of Information Science and Engineering, Northeastern University, Shenyang 110004;College of Information Science and Engineering, Northeastern University, Shenyang 110004;College of Information Science and Engineering, Northeastern University, Shenyang 110004

  • Venue:
  • ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
  • Year:
  • 2009

Abstract

In the Web age, blogs have become a major platform for people to express their opinions and sentiments. Traditional blog clustering methods usually group blogs by keywords, stories, or timelines, and do not consider the opinions and emotions expressed in the articles. This paper presents a novel method based on Probabilistic Latent Semantic Analysis (PLSA) to model hidden emotion factors, and proposes an emotion-oriented clustering approach based on the sentiment similarities between Chinese blogs. Extensive experiments on real-world blog datasets covering different topics show that the approach clusters Chinese blogs into sentiment-coherent groups, allowing better organization and easier navigation.
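
The abstract does not give the model details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It fits a standard PLSA model with EM on a blog-by-sentiment-word count matrix and then groups blogs by their dominant latent factor. The function names (`plsa`, `dominant_factor`), the toy counts, and the choice to cluster by the largest P(z|d) are all assumptions made for illustration; the paper's own clustering step may differ.

```python
import math
import random


def normalize(row):
    """Scale a non-negative list so it sums to 1 (uniform if it sums to 0)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [v / total for v in row]


def log_likelihood(counts, p_z_d, p_w_z):
    """Sum of n(d, w) * log P(w|d), where P(w|d) = sum_z P(z|d) * P(w|z)."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p = sum(p_z_d[d][z] * p_w_z[z][w] for z in range(len(p_w_z)))
                total += n * math.log(p)
    return total


def plsa(counts, n_factors, n_iter=50, seed=0):
    """Fit PLSA by EM; returns P(z|d), P(w|z), and the log-likelihood trace."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random strictly positive initialisation of both distributions.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_factors)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_factors)]
    history = [log_likelihood(counts, p_z_d, p_w_z)]
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_factors for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_factors)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d, w) is proportional to P(z|d) * P(w|z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_factors)])
                # Accumulate expected counts for the M-step.
                for z in range(n_factors):
                    new_z_d[d][z] += n * post[z]
                    new_w_z[z][w] += n * post[z]
        # M-step: renormalise the expected counts into distributions.
        p_z_d = [normalize(r) for r in new_z_d]
        p_w_z = [normalize(r) for r in new_w_z]
        history.append(log_likelihood(counts, p_z_d, p_w_z))
    return p_z_d, p_w_z, history


def dominant_factor(p_z_d):
    """Assumed clustering rule: label each blog by its most probable factor."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_z_d]


# Hypothetical toy data: rows are blogs, columns are counts of four
# sentiment words (the first two and last two never co-occur).
toy_counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 4]]
p_z_d, p_w_z, history = plsa(toy_counts, n_factors=2)
print(dominant_factor(p_z_d))
```

In this sketch each row of P(z|d) can be read as a blog's mixture over hidden sentiment factors, so any similarity measure over those rows could replace the simple dominant-factor rule. EM only finds a local optimum, so results depend on the seed.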