Blog search and mining in the business domain

  • Authors: Yun Chen; Flora S. Tsai; Kap Luk Chan

  • Affiliations: Nanyang Technological University, Singapore (all authors)

  • Venue: Proceedings of the 2007 international workshop on Domain driven data mining
  • Year: 2007

Abstract

Weblogs, or blogs, have rapidly gained popularity over the past few years. In particular, the growth of business blogs, which are written by businesses or provide commentary on them, opens up new opportunities for developing blog-specific search and mining techniques. In this paper, we propose probabilistic models for blog search and mining using two machine learning techniques, Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). We implement the models on our database of business blogs, with the aim of achieving higher precision and recall. The probabilistic model segments the business blogs into separate topic areas, which is useful for keyword detection in the blogosphere. We also study various term-weighting schemes and factor values in detail, revealing interesting patterns in our database of business blogs. Our study points to domain-driven data mining techniques that can strengthen business intelligence in complex enterprise applications.