An analysis of topical proximity in the Twitter social graph

  • Authors:
  • Markus Schaal;John O'Donovan;Barry Smyth

  • Affiliations:
  • University College Dublin, Belfield, Dublin, Ireland;University of California, Santa Barbara;University College Dublin, Belfield, Dublin, Ireland

  • Venue:
  • SocInfo'12 Proceedings of the 4th international conference on Social Informatics
  • Year:
  • 2012

Abstract

Standard approaches to information retrieval are increasingly complemented by social search, even for rational information needs. Twitter, as a popular source of real-time information, plays an important role in this respect, since both the follower-followee graph and the many relationships among users provide a rich set of information about the social network. However, many hidden factors must be considered if social data are to successfully support the search for high-quality information. Here we focus on one of these factors, namely the relationship between content similarity and social distance in the social network. We compared two methods for computing content similarity among Twitter users in a one-document-per-user collection: one based on standard term frequency vectors, the other based on topic associations obtained by Latent Dirichlet Allocation (LDA). By comparing these metrics at different hop distances in the social graph, we investigated the utility of prominent features such as Retweets and Hashtags as predictors of similarity, and demonstrated the advantages of topical proximity over textual similarity for friend recommendations.
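
As an illustration of the kind of comparison the abstract describes, the following minimal sketch pairs a term-frequency similarity with a hop distance in a follow graph. It is not the authors' implementation. The abstract does not name the similarity measure, so cosine similarity is an assumption here, as are the whitespace tokenisation, the directed-edge reading of the follow graph, and the toy users and documents, which are hypothetical. The function names `tf_vector`, `cosine`, and `hop_distance` are introduced only for this example.

```python
from collections import Counter, deque
from math import sqrt


def tf_vector(text):
    """Term-frequency vector for one user's document (lowercased whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like objects."""
    dot = sum(weight * v.get(term, 0) for term, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hop_distance(graph, source, target):
    """Breadth-first search hop count along follow edges; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical toy data: one document per user, and who follows whom.
docs = {
    "alice": "python data science python",
    "bob": "python data visualisation",
    "carol": "football match tonight",
}
follows = {"alice": ["bob"], "bob": ["carol"], "carol": []}

vectors = {user: tf_vector(text) for user, text in docs.items()}
for other in ("bob", "carol"):
    # Print each user's hop distance from alice next to their textual similarity.
    print(other,
          hop_distance(follows, "alice", other),
          round(cosine(vectors["alice"], vectors[other]), 3))
```

The same `cosine` function could in principle be applied to per-user topic proportions produced by an LDA model instead of raw term counts, which would give a topical rather than textual similarity; fitting such a model requires a separate library and is left out of this sketch.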