TweetLDA: supervised topic classification and link prediction in Twitter

  • Authors:
  • Daniele Quercia; Harry Askham; Jon Crowcroft

  • Affiliations:
  • University of Cambridge, United Kingdom; University of Cambridge, United Kingdom; University of Cambridge, United Kingdom

  • Venue:
  • Proceedings of the 3rd Annual ACM Web Science Conference
  • Year:
  • 2012

Abstract

L-LDA is a new supervised topic model for assigning "topics" to a collection of documents (e.g., Twitter profiles). User studies have shown that L-LDA effectively performs a variety of tasks in Twitter, including not only assigning topics to profiles but also re-ranking feeds and suggesting new users to follow. Building upon these promising qualitative results, we run an extensive quantitative evaluation of L-LDA. We test the extent to which, compared to the competitive baseline of Support Vector Machines (SVM), L-LDA is effective at two tasks: 1) assigning the correct topics to profiles; and 2) measuring the similarity of a profile pair. We find that L-LDA generally performs as well as SVM and clearly outperforms it when training data is limited, making it an ideal classification technique for infrequent topics and for the (short) profiles of moderately active users. We have also built a web application that uses L-LDA to classify any given profile and to graphically map the predominant topics in specific geographic regions.