Streaming trend detection in Twitter

  • Authors:
  • James Benhardus; Jugal Kalita

  • Affiliations:
  • Centre for Cognitive Science, University of Minnesota, Minneapolis, MN 55455, USA; Department of Computer Science, University of Colorado, Colorado Springs, CO 80918, USA

  • Venue:
  • International Journal of Web Based Communities
  • Year:
  • 2013

Abstract

As social media continue to grow, the zeitgeist of society is increasingly found not in the headlines of traditional media institutions but in the activity of ordinary individuals. Identifying trending topics on social media such as Twitter provides an overview of the topics and issues that are currently popular within the online community. In this paper, we outline methodologies for detecting and identifying trending topics in streaming data. Data from Twitter's streaming API were collected and grouped into documents of equal duration, using collection procedures that allow analysis over multiple timespans, including those not currently associated with Twitter-identified trending topics. Term frequency-inverse document frequency (TF-IDF) analysis and relative normalised term frequency analysis were performed on the documents to identify the trending topics. Relative normalised term frequency analysis identified unigrams, bigrams, and trigrams as trending topics, while TF-IDF analysis identified unigrams as trending topics. Applying these methodologies to streaming data resulted in F-measures ranging from 0.1468 to 0.7508.
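
To make the two scoring approaches named in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the code assumes a textbook TF-IDF (term frequency times the log of inverse document frequency) and treats relative normalised term frequency as the ratio of a term's length-normalised frequency in the current window to its smoothed, length-normalised frequency in all other windows. The function names, the whitespace tokeniser, the `window_seconds` parameter, and the add-one smoothing are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def bucket_by_window(tweets, window_seconds):
    """Group (timestamp, text) pairs into equal-duration documents."""
    buckets = defaultdict(list)
    for timestamp, text in tweets:
        buckets[int(timestamp // window_seconds)].append(text)
    return [buckets[key] for key in sorted(buckets)]


def ngram_counts(texts, max_n=1):
    """Count all n-grams up to max_n, built within each tweet."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()  # assumed tokeniser: whitespace only
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts


def token_total(texts):
    """Number of unigram tokens in a document, used for normalisation."""
    return sum(len(text.split()) for text in texts)


def tfidf_scores(documents, target_index):
    """Textbook TF-IDF over unigrams for one document in the collection."""
    counts = [ngram_counts(doc, max_n=1) for doc in documents]
    target = counts[target_index]
    total = sum(target.values())
    scores = {}
    for term, count in target.items():
        doc_freq = sum(1 for c in counts if term in c)
        scores[term] = (count / total) * math.log(len(documents) / doc_freq)
    return scores


def relative_ntf_scores(documents, target_index, max_n=3, smoothing=1.0):
    """Ratio of normalised frequency in the target window to the baseline.

    The baseline is every other document pooled together; add-one style
    smoothing (an assumption) keeps unseen baseline terms from dividing by 0.
    """
    target_docs = documents[target_index]
    baseline_docs = [
        text
        for i, doc in enumerate(documents)
        if i != target_index
        for text in doc
    ]
    target = ngram_counts(target_docs, max_n)
    baseline = ngram_counts(baseline_docs, max_n)
    target_total = token_total(target_docs)
    baseline_total = token_total(baseline_docs)
    scores = {}
    for term, count in target.items():
        target_ntf = count / target_total
        baseline_ntf = (baseline[term] + smoothing) / (baseline_total + smoothing)
        scores[term] = target_ntf / baseline_ntf
    return scores
```

Under these assumptions, a term that occurs in every window receives a TF-IDF score of zero, while the ratio-based score can also rank bigrams and trigrams, which is in keeping with the difference between the two methods reported in the abstract.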