Improving the classification of newsgroup messages through social network analysis

  • Authors:
  • Blaz Fortuna;Eduarda Mendes Rodrigues;Natasa Milic-Frayling

  • Affiliations:
  • Institute Jožef Stefan, Ljubljana, Slovenia;Microsoft Research Ltd., Cambridge, United Kingdom;Microsoft Research Ltd., Cambridge, United Kingdom

  • Venue:
  • Proceedings of the sixteenth ACM conference on information and knowledge management
  • Year:
  • 2007

Abstract

In this paper, we focus on the automatic classification of message replies into several types. To represent messages, we consider rich feature sets that combine the standard author reply-to network properties with features derived from four additional structures identified in the data: 1) a network of authors who participate in the same threads, 2) a network of authors who post similar content, 3) a network of threads sharing common authors, and 4) a network of content-related threads. For selected newsgroups, we train linear SVM classifiers to identify agreement and disagreement with the original message, as well as question and answer patterns in the threads. We show that the newly defined features substantially improve message classification compared with an SVM model based only on the standard reply-to network.
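
As an illustration only, two of the structures listed in the abstract (the network of authors who participate in the same threads, and the network of threads sharing common authors) could be built along the following lines. The abstract does not give the paper's exact feature definitions, so the function names, the choice of edge weights (shared-thread and shared-author counts), the weighted-degree feature, and the toy `threads` data are all assumptions made for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def co_participation_network(threads):
    """Link two authors if they posted in at least one common thread.

    Assumed edge weight: the number of threads the two authors share.
    """
    weights = defaultdict(int)
    for authors in threads.values():
        # Deduplicate and sort so each pair gets one canonical key per thread.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def shared_author_thread_network(threads):
    """Link two threads if they have at least one author in common.

    Assumed edge weight: the number of authors the two threads share.
    """
    weights = {}
    for (t1, a1), (t2, a2) in combinations(sorted(threads.items()), 2):
        shared = len(set(a1) & set(a2))
        if shared:
            weights[(t1, t2)] = shared
    return weights


def weighted_degree(edges):
    """Sum of incident edge weights per node: one simple candidate feature."""
    degree = defaultdict(int)
    for (u, v), w in edges.items():
        degree[u] += w
        degree[v] += w
    return dict(degree)


# Hypothetical toy data: thread id -> authors who posted in that thread.
threads = {
    "t1": ["alice", "bob", "carol"],
    "t2": ["alice", "bob"],
    "t3": ["dave", "carol"],
}

author_net = co_participation_network(threads)
thread_net = shared_author_thread_network(threads)
author_degree = weighted_degree(author_net)
```

In a pipeline of the kind the abstract describes, per-author and per-thread values such as `author_degree` would be attached to each message alongside the reply-to network features, and the resulting vectors passed to a linear SVM; that step is left out here because the abstract does not specify it.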