High performance mining of social media data

  • Authors:
  • Judith Gelernter;Gang Wu

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
  • Year:
  • 2012

Abstract

News and disaster-related applications may benefit from real-time processing of large-volume, up-to-the-minute social media data. Our geo-mining algorithm finds local place references (streets, buildings, toponyms and place abbreviations) in Twitter messages so that those messages can be placed on a map. The ability to map is significant because a map can present a timely overview of a situation. Our prototype desktop algorithm geo-locates Twitter messages with an F statistic of 0.90 for location identification, and our current research demonstrates that it will be viable on a large scale and in real time for actual applications. We present methods of managing external resources, threading the algorithm and balancing the data load that allow us to scale up the application without significantly rewriting the code.
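
The abstract does not describe the implementation, so the following is only a minimal illustrative sketch of the general idea of threading a per-message geo-location step over a shared external resource. It is not the authors' algorithm. The gazetteer contents, the placeholder coordinates, and the function names `geolocate` and `geolocate_stream` are all hypothetical. The sketch loads the lookup table once, shares it read-only across worker threads, and lets a thread pool hand each idle worker the next message, which is one simple form of load balancing.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical gazetteer with made-up placeholder coordinates.
# It stands in for an external resource that is loaded once and then
# shared read-only by every worker thread.
GAZETTEER = {
    "main street": (40.1, -80.1),
    "city hall": (40.0, -80.0),
}

# Pre-compile one whole-phrase pattern per place name so workers do not
# rebuild the patterns for every message.
PATTERNS = {
    name: re.compile(r"\b" + re.escape(name) + r"\b")
    for name in GAZETTEER
}


def geolocate(message):
    """Return (message, hits), where hits lists (place, coords) pairs found."""
    text = message.lower()
    hits = [
        (name, GAZETTEER[name])
        for name, pattern in PATTERNS.items()
        if pattern.search(text)
    ]
    return message, hits


def geolocate_stream(messages, workers=4):
    """Geo-locate messages on a thread pool, preserving input order.

    The pool keeps an internal work queue, so a worker that finishes a
    short message immediately takes the next one.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(geolocate, messages))


if __name__ == "__main__":
    sample = [
        "Fire near City Hall now",
        "nothing to map here",
        "Flooding on Main Street by city hall",
    ]
    for message, hits in geolocate_stream(sample):
        print(message, "->", hits)
```

Because the existing per-message function is passed to the pool unchanged, this style of threading leaves the core logic untouched. Note that in CPython, threads mainly help when the per-message work waits on I/O such as network or database lookups; CPU-bound matching would need processes or a different runtime to run in parallel.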