Mining blog stories using community-based and temporal clustering

  • Authors: Arun Qamra; Belle Tseng; Edward Y. Chang
  • Affiliations: UC Santa Barbara; NEC Labs America, Cupertino; UC Santa Barbara
  • Venue: CIKM '06: Proceedings of the 15th ACM International Conference on Information and Knowledge Management
  • Year: 2006

Abstract

In recent years, weblogs, or blogs for short, have become an important form of online content. The personal nature of blogs, the online interactions between bloggers, and the temporal nature of blog entries differentiate blogs from other kinds of Web content. Bloggers interact by linking to each other's posts, thus forming online communities. Within these communities, bloggers discuss particular issues through entries in their blogs. Because these discussions are often initiated in response to online or offline events, a discussion typically lasts for a limited time. We wish to extract such temporal discussions, or stories, occurring within blogger communities, based on given query keywords. We propose a Content-Community-Time model that leverages the content of entries, their timestamps, and the community structure of the blogs to automatically discover stories. Doing so also allows us to discover hot stories. We demonstrate the effectiveness of our model through several case studies using real-world data collected from the blogosphere.
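
To make the idea of combining content, community, and time concrete, here is a minimal toy sketch. It is not the Content-Community-Time model from the paper, whose details the abstract does not give. Everything in it is an illustrative assumption: the function names (`find_stories`, `components`, `jaccard`), the use of connected components of the blog link graph as a stand-in for communities, Jaccard word overlap as the content similarity, an exponential decay with a hypothetical time scale `tau` for temporal closeness, and the `threshold` value.

```python
import math
from collections import defaultdict


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def components(nodes, edges):
    """Map every node to a representative of its connected component."""
    parent = {n: n for n in nodes}
    for a, b in edges:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb
    return {n: find(parent, n) for n in nodes}


def jaccard(a, b):
    # Word-set overlap; an assumed stand-in for content similarity.
    return len(a & b) / len(a | b) if a | b else 0.0


def find_stories(entries, blog_links, query, tau=3.0, threshold=0.2):
    """entries: list of (blog, day, text). Returns stories as lists of entry indices."""
    # Content: keep only entries that mention the query keyword.
    hits = [i for i, (_, _, text) in enumerate(entries)
            if query in text.lower().split()]
    tokens = {i: set(entries[i][2].lower().split()) for i in hits}

    # Community: connected components of the blog link graph (assumption).
    blogs = {b for b, _, _ in entries} | {b for link in blog_links for b in link}
    community = components(blogs, blog_links)

    # Time: discount content similarity by the gap in days between entries.
    edges = []
    for x in range(len(hits)):
        for y in range(x + 1, len(hits)):
            i, j = hits[x], hits[y]
            if community[entries[i][0]] != community[entries[j][0]]:
                continue  # entries from different communities never merge
            gap = abs(entries[i][1] - entries[j][1])
            score = jaccard(tokens[i], tokens[j]) * math.exp(-gap / tau)
            if score >= threshold:
                edges.append((i, j))

    # A story is a connected group of sufficiently similar entries.
    root = components(hits, edges)
    stories = defaultdict(list)
    for i in hits:
        stories[root[i]].append(i)
    return sorted(stories.values())


# Made-up example data: blogs "a" and "b" link to each other, "c" is isolated.
entries = [
    ("a", 1, "election debate tonight"),
    ("b", 2, "election debate recap"),
    ("a", 30, "election results announced"),
    ("c", 1, "election debate tonight"),
]
stories = find_stories(entries, blog_links=[("a", "b")], query="election")
print(stories)  # [[0, 1], [2], [3]]
```

In this toy data, entries 0 and 1 form one story because they come from linked blogs, share most of their words, and are one day apart. Entry 2 is from the same community but four weeks later, and entry 3 has identical text to entry 0 but comes from an unlinked blog, so each stays separate. Under the same assumptions, ranking stories by size would be one crude way to surface "hot" ones.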