Probabilistic Latent Semantic Analysis for Search and Mining of Corporate Blogs

  • Authors: Flora S. Tsai; Yun Chen; Kap Luk Chan

  • Affiliation (all authors): School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore, 639798

  • Venue: Proceedings of the 2008 conference on Applications of Data Mining in E-Business and Finance
  • Year: 2008

Abstract

Blogs, or weblogs, have rapidly gained popularity over the past decade. Because of the huge volume of existing blog posts, information in the blogosphere is difficult to access and retrieve. Existing studies have focused on analyzing personal blogs, but few have looked at corporate blogs, whose numbers are rising dramatically. In this paper, we use probabilistic latent semantic analysis to detect keywords in corporate blogs with respect to certain topics. We then demonstrate how this method can represent the blogosphere in terms of topics with measurable keywords, thereby tracking popular conversations and topics in the blogosphere. By applying a probabilistic approach, we can improve information retrieval in blog search and keyword detection, and provide an analytical foundation for the future of corporate blog search and mining.
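
As a rough illustration of the technique named in the abstract, the following is a minimal sketch of probabilistic latent semantic analysis fitted by expectation-maximization, with the highest-probability words of each topic read off as keywords. It is not the authors' implementation: the asymmetric formulation P(w|d) = sum over z of P(w|z) P(z|d), the function names `plsa` and `top_keywords`, the toy vocabulary, the count matrix, and the choice of two topics are all assumptions made for the example.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix (docs x words).

    Returns P(z|d) with shape (docs, topics) and P(w|z) with shape
    (topics, words). Hypothetical helper written for this sketch.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation, normalised into valid distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # docs x topics x words
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

def top_keywords(p_w_z, vocab, k=3):
    """Return the k most probable words for each topic."""
    return [[vocab[i] for i in np.argsort(row)[::-1][:k]] for row in p_w_z]

# Toy data (invented for illustration): 4 short "posts" over 6 words.
vocab = ["product", "launch", "price", "hiring", "team", "culture"]
counts = np.array([
    [3, 2, 2, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 1, 2, 3, 1],
], dtype=float)

p_z_d, p_w_z = plsa(counts, n_topics=2)
keywords = top_keywords(p_w_z, vocab, k=3)
print(keywords)
```

In this sketch, each row of `p_w_z` is a distribution over the vocabulary, so the keyword weights within a topic are directly comparable, and each row of `p_z_d` indicates how strongly a post is associated with each topic. EM converges only to a local optimum, so the topics recovered can vary with the random seed.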