Predicting semantic annotations on the real-time web

  • Authors:
  • Elham Khabiri; James Caverlee; Krishna Y. Kamath

  • Affiliations:
  • Texas A&M University, College Station, TX, USA; Texas A&M University, College Station, USA; Texas A&M University, College Station, USA

  • Venue:
  • Proceedings of the 23rd ACM conference on Hypertext and social media
  • Year:
  • 2012

Abstract

The explosion of the real-time web has spurred a growing need for new methods to organize, monitor, and distill relevant information from these large-scale social streams. One especially encouraging development is the self-curation of the real-time web via user-driven linking, in which users annotate their own status updates with lightweight semantic annotations, or hashtags. Unfortunately, there is evidence that hashtag growth is not keeping pace with the growth of the overall real-time web: in a random sample of 3 million tweets, we find that only 10.2% contain at least one hashtag. Hence, in this paper we explore the possibility of predicting hashtags for unannotated status updates. Toward this end, we propose and evaluate a graph-based prediction framework. Three distinguishing features of the approach are: (i) a path aggregation technique for scoring the closeness of terms and hashtags in the graph; (ii) pivot term selection, for identifying high-value terms in status updates; and (iii) a dynamic sliding window for recommending hashtags that reflect the current state of the real-time web. Experimentally, we find encouraging results in comparison with Bayesian and data mining-based approaches.
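
To make the three ingredients named in the abstract more concrete, the following is a minimal Python sketch of how they might fit together. It is not the authors' implementation: the abstract does not specify the graph construction, the scoring formula, or the pivot criterion, so every detail here is an assumption made for illustration. In particular, the class name `HashtagRecommender`, the co-occurrence edge weights, the "rarest terms first" pivot rule, the restriction to paths of length one and two, and the `decay` parameter are all hypothetical.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


class HashtagRecommender:
    """Toy co-occurrence graph of terms and hashtags over a sliding window.

    Illustrative only: all scoring choices below are assumptions.
    """

    def __init__(self, window_size=1000):
        self.window_size = window_size
        self.window = deque()               # annotated updates currently in the window
        self.edges = defaultdict(Counter)   # node -> neighbor -> co-occurrence count
        self.term_freq = Counter()          # term -> number of windowed updates using it

    @staticmethod
    def _parse(text):
        tokens = text.lower().split()
        tags = {t for t in tokens if t.startswith("#") and len(t) > 1}
        terms = {t for t in tokens if not t.startswith("#")}
        return terms, tags

    def _update(self, terms, tags, delta):
        # Add (delta=+1) or retract (delta=-1) one update's contribution.
        for t in terms:
            self.term_freq[t] += delta
        for a, b in combinations(sorted(terms | tags), 2):
            self.edges[a][b] += delta
            self.edges[b][a] += delta

    def add(self, text):
        """Ingest one status update; only hashtagged updates shape the graph."""
        terms, tags = self._parse(text)
        if not tags:
            return
        self.window.append((terms, tags))
        self._update(terms, tags, +1)
        if len(self.window) > self.window_size:
            # Dynamic sliding window: forget the oldest update.
            self._update(*self.window.popleft(), -1)

    def _norm(self, node):
        # Outgoing edge weights of `node`, normalized to sum to 1.
        live = {n: c for n, c in self.edges.get(node, {}).items() if c > 0}
        total = sum(live.values())
        return {n: c / total for n, c in live.items()} if total else {}

    def recommend(self, text, k_pivots=3, top_n=3, decay=0.5):
        terms, _ = self._parse(text)
        known = [t for t in terms if self.term_freq[t] > 0]
        # Pivot selection (assumed rule): prefer the rarest known terms.
        pivots = sorted(known, key=lambda t: (self.term_freq[t], t))[:k_pivots]
        scores = Counter()
        for p in pivots:
            for n1, w1 in self._norm(p).items():
                if n1.startswith("#"):
                    scores[n1] += w1                      # path of length 1
                    continue
                for n2, w2 in self._norm(n1).items():
                    if n2.startswith("#"):
                        scores[n2] += decay * w1 * w2     # path of length 2
        return [tag for tag, _ in scores.most_common(top_n)]
```

A recommender built this way would be fed hashtagged updates through `add` and queried with `recommend` on an unannotated update. Because expired updates are retracted from the graph, the suggestions track only what is currently in the window, which is the intuition behind the sliding-window feature described above.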