Exploring the community structure of newsgroups

  • Authors: Christian Borgs; Jennifer Chayes; Mohammad Mahdian; Amin Saberi
  • Affiliations: Microsoft Research, Redmond, WA; Microsoft Research, Redmond, WA; MIT, Cambridge, MA; Georgia Institute of Technology, Atlanta, GA
  • Venue: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year: 2004

Abstract

We propose to use the community structure of Usenet for organizing and retrieving the information stored in newsgroups. In particular, we study the network formed by cross-posts, messages that are posted to two or more newsgroups simultaneously. We present what is, to our knowledge, by far the most detailed data that has been collected on Usenet cross-postings. We analyze this network to show that it is a small-world network with significant clustering. We also present a spectral algorithm which clusters newsgroups based on the cross-post matrix. The result of our clustering provides a topical classification of newsgroups. Our clustering gives many examples of significant relationships that would be missed by semantic clustering methods.
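To make the idea of clustering newsgroups from a cross-post matrix concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details the abstract does not give. It assumes a generic spectral bipartition instead: build a symmetric matrix counting messages shared by each pair of newsgroups, then split the groups by the sign of the second eigenvector of the normalized graph Laplacian. The function names (cross_post_matrix, spectral_bipartition) and the toy message list are hypothetical and made up purely for illustration.

```python
import numpy as np
from itertools import combinations


def cross_post_matrix(messages):
    """Return (names, W), where W[i, j] counts messages posted to both group i and group j."""
    names = sorted({g for msg in messages for g in msg})
    index = {g: i for i, g in enumerate(names)}
    W = np.zeros((len(names), len(names)))
    for msg in messages:
        # Every pair of newsgroups sharing this message gets one cross-post.
        for a, b in combinations(sorted(set(msg)), 2):
            W[index[a], index[b]] += 1
            W[index[b], index[a]] += 1
    return names, W


def spectral_bipartition(W):
    """Split nodes by the sign of the second eigenvector of the normalized Laplacian.

    Assumption: this generic technique stands in for the paper's spectral algorithm.
    """
    deg = W.sum(axis=1)
    # Guard against isolated newsgroups (degree 0) to avoid division by zero.
    d_inv_sqrt = 1.0 / np.sqrt(np.where(deg > 0, deg, 1.0))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]
    return fiedler >= 0


# Toy data (made up): each set lists the newsgroups one message was posted to.
messages = [
    {"comp.lang.c", "comp.lang.c++"},
    {"comp.lang.c", "comp.programming"},
    {"comp.lang.c++", "comp.programming"},
    {"rec.food.cooking", "rec.food.recipes"},
    {"rec.food.cooking", "rec.food.baking"},
    {"rec.food.recipes", "rec.food.baking"},
    {"comp.programming", "rec.food.cooking"},  # a single cross-topic post
]

names, W = cross_post_matrix(messages)
side = spectral_bipartition(W)
for name, s in zip(names, side):
    print(f"{name}: cluster {int(s)}")
```

On this toy input the two densely cross-posted triples land in different clusters, although which triple is labeled 0 and which 1 depends on the arbitrary sign of the eigenvector. Applying the same split recursively to each part is one common way to obtain more than two clusters. Note that the grouping here comes only from posting behavior, not from the text of the messages.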