See what's enBlogue: real-time emergent topic identification in social media

  • Authors:
  • Foteini Alvanaki;Sebastian Michel;Krithi Ramamritham;Gerhard Weikum

  • Affiliations:
  • Saarland University, Germany;Saarland University, Germany;IIT Bombay, India;Max-Planck Institute Informatics, Germany

  • Venue:
  • Proceedings of the 15th International Conference on Extending Database Technology
  • Year:
  • 2012

Abstract

With the increasing popularity of Web 2.0 streams, people are becoming overwhelmed by the available information. This is partly countered by tagging blog posts and tweets, so that users can filter messages according to their tags. However, this is insufficient for detecting newly emerging topics that are not reflected by a single tag but are instead expressed by unusual tag combinations. This paper presents enBlogue, an approach for automatically detecting such emergent topics. EnBlogue uses a sliding time window to compute statistics about tags and tag-pairs. These statistics are then used to identify unusual shifts in correlations, which are most often caused by real-world events. We analyze the strength of these shifts and measure the degree of unpredictability they include; this measure is used to rank tag-pairs expressing emergent topics. Additionally, this "indicator of surprise" is carried over to subsequent time points, as user interests do not abruptly vanish from one moment to the next. To avoid monitoring all tag-pairs, we can also select a subset of tags, e.g., the most popular or volatile ones, to be used as seed-tags for subsequent pairwise correlation computations. The system is fully implemented and publicly available on the Web, processing live Twitter data. We present experimental studies based on real-world datasets, demonstrating both the prediction quality, by means of a user study, and the efficiency of enBlogue.
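The pipeline described in the abstract (windowed tag and tag-pair statistics, correlation shifts, a carried-over surprise score, optional seed-tags) can be illustrated with a minimal Python sketch. This is not the enBlogue implementation: the abstract does not specify the correlation measure, the shift formula, or the carry-over rule, so the Jaccard overlap, the "positive increase since the previous time point" shift, the multiplicative `decay`, and all names (`EmergentPairTracker`, `add_slice`, `window_size`) are assumptions made for illustration only.

```python
from collections import Counter, deque
from itertools import combinations


class EmergentPairTracker:
    """Toy tracker for tag-pair correlation shifts (illustrative assumptions only)."""

    def __init__(self, window_size=3, decay=0.5, seed_tags=None):
        # The sliding window holds one (tag_counts, pair_counts) entry per time slice.
        self.window = deque(maxlen=window_size)
        self.decay = decay            # assumed carry-over factor for the surprise score
        self.seed_tags = set(seed_tags) if seed_tags else None
        self.prev_corr = {}           # correlations at the previous time point
        self.scores = {}              # current surprise score per tag pair

    def _correlations(self):
        # Aggregate document counts over all slices currently in the window.
        tags, pairs = Counter(), Counter()
        for slice_tags, slice_pairs in self.window:
            tags.update(slice_tags)
            pairs.update(slice_pairs)
        # Assumed measure: Jaccard overlap of the documents carrying each tag.
        return {
            (a, b): n_ab / (tags[a] + tags[b] - n_ab)
            for (a, b), n_ab in pairs.items()
        }

    def add_slice(self, documents):
        """Ingest one time slice (a list of tag lists) and return ranked pairs."""
        had_baseline = len(self.window) > 0
        tags, pairs = Counter(), Counter()
        for doc_tags in documents:
            unique = sorted(set(doc_tags))
            tags.update(unique)
            for a, b in combinations(unique, 2):
                # With seed-tags, only pairs touching a seed tag are monitored.
                if self.seed_tags is None or a in self.seed_tags or b in self.seed_tags:
                    pairs[(a, b)] += 1
        self.window.append((tags, pairs))

        corr = self._correlations()
        new_scores = {}
        for pair in set(corr) | set(self.scores):
            # Assumed shift: positive increase in correlation since the last
            # time point; the very first slice has no baseline, so no shift.
            shift = 0.0
            if had_baseline:
                shift = max(0.0, corr.get(pair, 0.0) - self.prev_corr.get(pair, 0.0))
            # Carry over a decayed share of the previous score, so a topic
            # does not vanish from the ranking from one time point to the next.
            new_scores[pair] = max(shift, self.decay * self.scores.get(pair, 0.0))
        self.scores = new_scores
        self.prev_corr = corr
        return sorted(self.scores.items(), key=lambda kv: kv[1], reverse=True)
```

Feeding the tracker a quiet slice followed by a slice in which two previously unrelated tags suddenly co-occur should move that pair to the top of the returned ranking. Passing `seed_tags` restricts the monitored pairs to those containing at least one seed tag, which mirrors the idea of avoiding computation over all tag-pairs.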