Mining communities on the web using a max-flow and a site-oriented framework

  • Authors:
  • Yasuhito Asano;Takao Nishizeki;Masashi Toyoda

  • Affiliations:
  • Graduate School of Information Sciences, Tohoku University, Sendai, Japan;Graduate School of Information Sciences, Tohoku University, Sendai, Japan;Institute of Industrial Science, the University of Tokyo, Tokyo, Japan

  • Venue:
  • WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
  • Year:
  • 2005

Abstract

Several methods mine communities on the Web using hyperlinks. One of the best known is the max-flow based method proposed by Flake et al. It adopts a page-oriented framework, that is, it uses a page on the Web as the unit of information, as do other methods such as HITS and trawling. Recently, Asano et al. built a site-oriented framework, which uses a site as the unit of information, and showed experimentally that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is also effective for mining communities with the max-flow based method. In this paper, we first point out several problems of the max-flow based method, mainly owing to the page-oriented framework, and then propose solutions that exploit several advantages of the site-oriented framework. Computational experiments show that our max-flow based method on the site-oriented framework mines communities related to the topics of the given pages significantly more effectively than the original max-flow based method on the page-oriented framework.
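
To make the two ideas in the abstract concrete, here is a minimal Python sketch of a max-flow style community extraction run on a site-level graph. It is an illustration under stated assumptions, not the authors' implementation. The function names `site_graph` and `max_flow_community` are hypothetical. A "site" is approximated by the URL hostname, which is cruder than the site definition of Asano et al. Links are treated as undirected with capacity equal to the number of seeds, a virtual source is attached to the seeds, and every non-seed node is attached to a virtual sink with capacity 1. The community is taken to be the set of nodes still reachable from the source once no augmenting path remains.

```python
from collections import defaultdict, deque
from urllib.parse import urlsplit


def site_graph(page_links):
    """Collapse page-level links into site-level links.

    Assumption: a site is identified by its hostname. Links inside one
    site are dropped, and parallel links between two sites are merged.
    """
    edges = set()
    for src, dst in page_links:
        a, b = urlsplit(src).hostname, urlsplit(dst).hostname
        if a and b and a != b:
            edges.add((a, b))
    return edges


def max_flow_community(edges, seeds, edge_cap=None):
    """Return the source side of a minimum cut around the seed nodes."""
    seeds = set(seeds)
    k = edge_cap if edge_cap is not None else len(seeds)
    inf = float("inf")
    source, sink = ("virtual", "source"), ("virtual", "sink")

    # Residual capacities, stored as a dict of dicts.
    cap = defaultdict(lambda: defaultdict(int))
    nodes = set(seeds)
    for u, v in edges:
        cap[u][v] = k  # links are treated as undirected (assumption)
        cap[v][u] = k
        nodes.update((u, v))
    for s in seeds:
        cap[source][s] = inf
    for v in nodes - seeds:
        cap[v][sink] = 1

    def reachable():
        """BFS over edges with positive residual capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: augment along shortest paths until none is left.
    while True:
        parent = reachable()
        if sink not in parent:
            break
        bottleneck, v = inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Nodes still reachable from the source form the community.
    return set(parent) - {source}
```

On a toy graph with two fully connected clusters of four sites joined by a single link, seeding with two sites of one cluster returns that whole cluster, because cutting the single bridge link is cheaper than any cut through the cluster. Building the graph from sites rather than pages also merges the many parallel page-to-page links between two sites into one edge, which changes where the minimum cut falls.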