MJ no more: using concurrent wikipedia edit spikes with social network plausibility checks for breaking news detection

  • Authors:
  • Thomas Steiner; Seth van Hooland; Ed Summers

  • Affiliations:
  • Google Germany GmbH, Hamburg, Germany; Université Libre de Bruxelles, Brussels, Belgium; Library of Congress, Washington, DC, USA

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web companion
  • Year:
  • 2013

Abstract

We have developed an application called Wikipedia Live Monitor that monitors article edits on different language versions of Wikipedia as they happen in real time. Wikipedia articles in different languages are highly interlinked. For example, the English article "en:2013_Russian_meteor_event", on the February 15 meteoroid that exploded over the region of Chelyabinsk Oblast, Russia, is interlinked with "ru:Падение_метеорита_на_Урале_в_2013_году", the Russian article on the same topic. Because we monitor multiple language versions of Wikipedia in parallel, we can exploit this interlinking to detect concurrent edit spikes on Wikipedia articles covering the same topic, both within a single language and across languages. We treat such concurrent edit spikes as signals of potential breaking news events, whose plausibility we then check with full-text cross-language searches on multiple social networks. Unlike the reverse approach of monitoring social networks first and potentially checking plausibility on Wikipedia second, the approach proposed in this paper has the advantage of being less prone to false-positive alerts while being equally sensitive to true-positive events, at only a fraction of the processing cost. A live demo of our application is available online at http://wikipedia-irc.herokuapp.com/, and the source code is available under the terms of the Apache 2.0 license at https://github.com/tomayac/wikipedia-irc.
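
The idea of a concurrent edit spike across interlinked language versions can be illustrated with a minimal sketch. It is not the application's actual code: the names `Edit` and `SpikeDetector`, the `interlinks` mapping, and the threshold values (time window, minimum number of edits, minimum number of distinct editors) are all assumptions made for illustration, and the abstract does not state the criteria the authors use. The sketch also assumes that edits arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    timestamp: float  # seconds, e.g. since the epoch
    article: str      # language-prefixed title, e.g. "en:2013_Russian_meteor_event"
    editor: str


class SpikeDetector:
    """Hypothetical sliding-window detector; all thresholds are illustrative."""

    def __init__(self, interlinks, window=240.0, min_edits=5, min_editors=2):
        # interlinks maps a language-specific article to a shared topic key.
        self.interlinks = interlinks
        self.window = window
        self.min_edits = min_edits
        self.min_editors = min_editors
        self.recent = defaultdict(deque)  # topic key -> edits inside the window

    def topic_of(self, article):
        # Articles without a known interlink form their own topic.
        return self.interlinks.get(article, article)

    def observe(self, edit):
        """Record an edit; return True if its topic now looks like a spike."""
        edits = self.recent[self.topic_of(edit.article)]
        edits.append(edit)
        # Evict edits that are older than the window relative to this edit.
        while edit.timestamp - edits[0].timestamp > self.window:
            edits.popleft()
        editors = {e.editor for e in edits}
        return len(edits) >= self.min_edits and len(editors) >= self.min_editors
```

A caller would pass each incoming edit to `observe` and treat a `True` result only as a candidate event, which would then still need the plausibility check against social networks described in the abstract.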