Extracting evolution of web communities from a series of web archives

  • Authors:
  • Masashi Toyoda;Masaru Kitsuregawa

  • Affiliations:
  • University of Tokyo, Tokyo, Japan;University of Tokyo, Tokyo, Japan

  • Venue:
  • Proceedings of the fourteenth ACM conference on Hypertext and hypermedia
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recent advances in storage technology make it possible to store a series of large Web archives. It is now an exciting challenge for us to observe evolution of the Web. In this paper, we propose a method for observing evolution of web communities. A web community is a set of web pages created by individuals or associations with a common interest on a topic. So far, various link analysis techniques have been developed to extract web communities. We analyze evolution of web communities by comparing four Japanese web archives crawled from 1999 to 2002. Statistics of these archives and community evolution are examined, and the global behavior of evolution is described. Several metrics are introduced to measure the degree of web community evolution, such as growth rate, novelty, and stability. We developed a system for extracting detailed evolution of communities using these metrics. It allows us to understand when and how communities emerged and evolved. Some evolution examples are shown using our system.