TweetLDA: supervised topic classification and link prediction in Twitter

Authors:
Daniele Quercia;Harry Askham;Jon Crowcroft
Affiliations:
University of Cambridge, United Kingdom;University of Cambridge, United Kingdom;University of Cambridge, United Kingdom
Venue:
Proceedings of the 3rd Annual ACM Web Science Conference
Year:
2012

Abstract

Labeled LDA (L-LDA) is a new supervised topic model for assigning "topics" to a collection of documents (e.g., Twitter profiles). User studies have shown that L-LDA effectively performs a variety of tasks in Twitter: not only assigning topics to profiles, but also re-ranking feeds and suggesting new users to follow. Building upon these promising qualitative results, we run an extensive quantitative evaluation of L-LDA. We test how effective L-LDA is, compared to the competitive baseline of Support Vector Machines (SVM), at two tasks: 1) assigning the correct topics to profiles; and 2) measuring the similarity of a profile pair. We find that L-LDA generally performs as well as SVM and clearly outperforms it when training data is limited, making it an ideal classification technique for infrequent topics and for (short) profiles of moderately active users. We have also built a web application that uses L-LDA to classify any given profile and to graphically map predominant topics in specific geographic regions.
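
The second task, measuring the similarity of a profile pair, can be illustrated with a minimal sketch. It assumes each profile has already been reduced to a topic distribution (by L-LDA or any other model) and compares two distributions with cosine similarity. The choice of cosine similarity is an assumption made for illustration, since the abstract does not say which measure the authors use, and the profile names, topic labels, and weights below are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic distributions.

    Each argument is a dict mapping a topic label to a non-negative weight.
    Topics missing from a dict are treated as having weight 0.
    """
    shared = set(p) & set(q)
    dot = sum(p[t] * q[t] for t in shared)
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        # An empty profile has no direction, so define similarity as 0.
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical per-profile topic distributions (illustrative values only).
alice = {"technology": 0.6, "politics": 0.3, "sports": 0.1}
bob = {"technology": 0.5, "music": 0.5}
carol = {"sports": 0.9, "politics": 0.1}

sim_ab = cosine_similarity(alice, bob)    # overlap on "technology"
sim_ac = cosine_similarity(alice, carol)  # weak overlap on "politics", "sports"
print(f"alice~bob: {sim_ab:.3f}, alice~carol: {sim_ac:.3f}")
```

In a link-prediction setting, a score like this could be thresholded or ranked to suggest which profile pairs are likely to be connected; how the paper actually turns similarity into predictions is not stated in the abstract.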