Processing and visualizing the data in tweets

Authors:
Adam Marcus; Michael S. Bernstein; Osama Badar; David R. Karger; Samuel Madden; Robert C. Miller
Affiliations:
MIT CSAIL; MIT CSAIL; MIT CSAIL; MIT CSAIL; MIT CSAIL; MIT CSAIL
Venue:
ACM SIGMOD Record
Year:
2012

Citing 7

Eddies: continuously adaptive query processing
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

WSQ/DSQ: a practical approach for combined querying of databases and the Web
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

Partial results for online query processing
Proceedings of the 2002 ACM SIGMOD international conference on Management of data

Aurora: a new model and architecture for data stream management
The VLDB Journal — The International Journal on Very Large Data Bases

Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Twitinfo: aggregating and visualizing microblogs for event exploration
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Tweets from Justin Bieber's heart: the dynamics of the location field in user profiles
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Cited 3

An in-browser microblog ranking engine
ER'12 Proceedings of the 2012 international conference on Advances in Conceptual Modeling

uTrack: track yourself! monitoring information on online social media
Proceedings of the 22nd international conference on World Wide Web companion

An algorithm for local geoparsing of microtext
Geoinformatica

Abstract

Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad-hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.