A framework for summarizing and analyzing twitter feeds

  • Authors:
  • Xintian Yang;Amol Ghoting;Yiye Ruan;Srinivasan Parthasarathy

  • Affiliations:
  • The Ohio State University, Columbus, OH, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;The Ohio State University, Columbus, OH, USA;The Ohio State University, Columbus, OH, USA

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. Real-time analytics on such data is challenging with most current efforts largely focusing on the efficient querying and retrieval of data produced recently. In this paper, we present a dynamic pattern driven approach to summarize data produced by Twitter feeds. We develop a novel approach to maintain an in-memory summary while retaining sufficient information to facilitate a range of user-specific and topic-specific temporal analytics. We empirically compare our approach with several state-of-the-art pattern summarization approaches along the axes of storage cost, query accuracy, query flexibility, and efficiency using real data from Twitter. We find that the proposed approach is not only scalable but also outperforms existing approaches by a large margin.