Applying Link-Based Classification to Label Blogs

  • Authors: Smriti Bhagat, Graham Cormode, Irina Rozenbaum

  • Affiliations: Rutgers University, USA (all authors)

  • Venue: Advances in Web Mining and Web Usage Analysis
  • Year: 2009

Abstract

When analyzing data from social and communication networks, we face the problem of classifying objects that are connected by an explicit link structure. We study how to infer the classification of all objects from a labeled subset, using only the links between objects. We abstract this as a labeling problem on multigraphs with weighted edges and present two classes of algorithms, one based on local similarity and one on global similarity. We then focus on multigraphs induced by blog data and apply our general algorithms to infer labels such as age, gender, and location for each blog, based only on the link structure among the blogs. A comprehensive set of experiments on real, large-scale blog data sets shows that significant accuracy is possible with little or no non-link information, and that our methods scale to millions of nodes and edges.
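
To make the local, link-only setting concrete, here is a minimal Python sketch of one simple possibility: each unlabeled node repeatedly adopts the label with the highest total edge weight among its already-labeled neighbors. This is an illustrative assumption, not the authors' algorithm. The function name `local_iterative_labeling`, the `(u, v, weight)` edge format, the undirected treatment of links, and the toy labels are all hypothetical.

```python
from collections import defaultdict

def local_iterative_labeling(edges, seed_labels, max_rounds=10):
    """Weighted-neighbor voting on a multigraph (illustrative sketch only).

    edges: iterable of (u, v, weight); parallel edges are allowed.
    seed_labels: dict mapping the labeled subset of nodes to their labels.
    Returns a dict with a label for every node reachable from a seed
    within max_rounds rounds.
    """
    # Assumption: links are treated as undirected, and parallel edges
    # are collapsed by summing their weights.
    nbrs = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        nbrs[u][v] += w
        nbrs[v][u] += w

    labels = dict(seed_labels)
    for _ in range(max_rounds):
        new_labels = {}
        for node in nbrs:
            if node in labels:
                continue  # seed and previously assigned labels are kept
            votes = defaultdict(float)
            for nb, w in nbrs[node].items():
                if nb in labels:
                    votes[labels[nb]] += w
            if votes:
                # Heaviest label wins; ties go to the smallest label
                # in sorted order so the result is deterministic.
                new_labels[node] = max(sorted(votes), key=votes.get)
        if not new_labels:
            break  # nothing new could be labeled in this round
        labels.update(new_labels)  # synchronous update after each round
    return labels

# Toy example: two parallel links a-b, and a heavy link c-d.
edges = [("a", "b", 1.0), ("a", "b", 1.0), ("b", "c", 1.0),
         ("c", "d", 3.0), ("d", "e", 1.0)]
result = local_iterative_labeling(edges, {"a": "teen", "e": "adult"})
```

In this toy graph, `b` and `d` are labeled in the first round from their seed neighbors, and `c` is labeled in the second round, where the heavier link to `d` outweighs the link to `b`. A global-similarity method would instead compare nodes by their overall link patterns, which this sketch does not attempt.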