Discovery of user communities based on terms of web log data

  • Authors:
  • Tsuyoshi Murata

  • Affiliations:
  • Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Ookayama, Meguro, Tokyo, Japan

  • Venue:
  • New Generation Computing
  • Year:
  • 2007

Abstract

The Web is a huge network composed of Web pages and hyperlinks. It is often reported that related Web pages are densely linked with each other. Finding groups of such related pages, called Web communities, is important for information retrieval from the Web. Several methods have been proposed for discovering Web communities, such as Kumar's trawling and Flake's method. In addition to communities of related Web pages, there are communities of users who share common interests. Finding the latter, which we call user communities in this paper, is also important for clarifying the behavior of Web users. The characteristics of user communities on the Web are expected to correspond to those of real human communities. This paper describes a method for discovering user communities. Client-level log data (Web audience measurement data) is used as a record of users' Web viewing behavior. Maximal complete bipartite graphs are searched for in a term-user graph obtained from the log data, without analyzing the contents of Web pages. Experimental results show that our method succeeds in discovering many interesting user communities, with labels that characterize them.
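
To make the central idea concrete, the following is a minimal sketch of finding maximal complete bipartite subgraphs in a small term-user graph. It is not the paper's algorithm, which the abstract does not specify; it is a simple closure-based enumeration suited only to toy inputs. The function name `maximal_bicliques`, the thresholds `min_users` and `min_terms`, and the sample data `sample_log` are all hypothetical and chosen purely for illustration. How terms are derived from the log data is likewise not covered here.

```python
def maximal_bicliques(user_terms, min_users=2, min_terms=2):
    """Enumerate maximal complete bipartite subgraphs of a term-user graph.

    user_terms maps each user to the set of terms linked to that user.
    Returns a list of (users, terms) pairs as frozensets. Illustrative
    sketch only: it can take exponential time on large graphs.
    """
    # Step 1: collect every non-empty term set that can be obtained by
    # intersecting the term sets of one or more users.
    closed = set()
    frontier = {frozenset(terms) for terms in user_terms.values() if terms}
    while frontier:
        closed |= frontier
        new_sets = set()
        for a in frontier:
            for b in closed:
                common = a & b
                if common and common not in closed:
                    new_sets.add(common)
        frontier = new_sets

    # Step 2: pair each such term set with all users that contain it.
    # Because the term set is an intersection of user term sets, the
    # resulting (users, terms) pair cannot be extended on either side.
    results = []
    for terms in closed:
        users = frozenset(u for u, t in user_terms.items() if terms <= set(t))
        if len(users) >= min_users and len(terms) >= min_terms:
            results.append((users, terms))
    return results


# Hypothetical sample data: which terms are associated with which user.
sample_log = {
    "u1": {"soccer", "worldcup", "ticket"},
    "u2": {"soccer", "worldcup", "hotel"},
    "u3": {"soccer", "worldcup", "ticket"},
    "u4": {"recipe", "hotel"},
}

if __name__ == "__main__":
    for users, terms in maximal_bicliques(sample_log):
        print(sorted(users), sorted(terms))
```

In this sketch each returned pair can be read as a candidate user community (the user side) together with the shared terms that could serve as its label (the term side). The size thresholds are an assumption used to discard trivial pairs, such as a single user or a single shared term.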