Theory of data stream computing: where to go

  • Authors:
  • S. Muthukrishnan

  • Affiliations:
  • Rutgers University, Piscataway, NJ, USA

  • Venue:
  • Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
  • Year:
  • 2011

Abstract

Computing power has been growing steadily, as have communication rates and memory sizes. Simultaneously, our ability to create data has been growing phenomenally, and with it the need to analyze that data. We now have examples of massive data streams that are created at a far higher rate than we can capture and store in memory economically, are gathered in far greater quantity than can be transported to central databases without overwhelming the communication infrastructure, and arrive far faster than we can compute with them in any sophisticated way. This phenomenon has challenged how we store, communicate and compute with data. Theories developed over the past 50 years have relied on full capture, storage and communication of data. Instead, what we need for managing modern massive data streams are new methods built around working with less. The past 10 years have seen new theories emerge in computing (data stream algorithms), communication (compressed sensing), databases (data stream management systems) and other areas to address the challenges of massive data streams. Still, much remains open, and new applications of massive data streams have emerged recently. We present an overview of these challenges.
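
To make the "working with less" theme concrete, below is a minimal, illustrative Python sketch of the Count-Min sketch of Cormode and Muthukrishnan (2005), a canonical data stream algorithm that answers approximate frequency queries using memory independent of the stream's length. The parameter defaults and the tuple-hashing scheme here are simplifying assumptions for demonstration, not taken from the paper.

```python
import random


class CountMinSketch:
    """Approximate frequency counts over a stream in O(width * depth)
    memory, independent of stream length: an instance of 'working with
    less' rather than storing the full stream."""

    def __init__(self, width=272, depth=5, seed=0):
        # Rough guide: width ~ ceil(e / epsilon), depth ~ ceil(ln(1 / delta)).
        # These defaults correspond to roughly epsilon = delta = 0.01.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]
        rng = random.Random(seed)
        # One hash seed per row, standing in for pairwise-independent hashes.
        self.seeds = [rng.randrange(1 << 31) for _ in range(depth)]

    def _buckets(self, item):
        # Map the item to one counter per row.
        for row, s in enumerate(self.seeds):
            yield row, hash((s, item)) % self.width

    def update(self, item, count=1):
        # Process one stream element; the element itself is never stored.
        for row, col in self._buckets(item):
            self.table[row][col] += count

    def query(self, item):
        # Never underestimates; overestimates by at most epsilon * (stream
        # length) with probability at least 1 - delta.
        return min(self.table[row][col] for row, col in self._buckets(item))


if __name__ == "__main__":
    cms = CountMinSketch()
    stream = ["a"] * 1000 + ["b"] * 10 + ["c"]
    random.shuffle(stream)
    for x in stream:
        cms.update(x)
    print(cms.query("a"))  # close to 1000, and never below it
```

The design choice that makes this a streaming method is that each element is touched once, in constant time per row, and then discarded; only the small counter table persists, regardless of how long the stream runs.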