Theory of data stream computing: where to go

  • Authors:
  • S. Muthukrishnan

  • Affiliations:
  • Rutgers University, Piscataway, NJ, USA

  • Venue:
  • Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
  • Year:
  • 2011

Abstract

Computing power has been growing steadily, as have communication rates and memory sizes. Simultaneously, our ability to create data has been growing phenomenally, and with it the need to analyze that data. We now have examples of massive data streams that are created at a far higher rate than we can capture and store in memory economically, are gathered in far greater quantity than can be transported to central databases without overwhelming the communication infrastructure, and arrive far faster than we can compute with them in any sophisticated way. This phenomenon has challenged how we store, communicate and compute with data. Theories developed over the past 50 years have relied on full capture, storage and communication of data. Instead, what we need for managing modern massive data streams are new methods built around working with less. The past 10 years have seen new theories emerge in computing (data stream algorithms), communication (compressed sensing), databases (data stream management systems) and other areas to address the challenges of massive data streams. Still, much remains open, and new applications of massive data streams have emerged recently. We present an overview of these challenges.
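
To make the "working with less" theme concrete, below is a minimal, illustrative Python sketch of the Count-Min sketch of Cormode and Muthukrishnan (2005), a canonical data stream algorithm that answers approximate frequency queries using memory independent of the stream's length. The parameter defaults and the tuple-hashing scheme here are simplifying assumptions for demonstration, not taken from the paper.

```python
import random


class CountMinSketch:
    """Approximate frequency counts over a stream in O(width * depth)
    memory, independent of stream length: an instance of 'working with
    less' rather than storing the full stream."""

    def __init__(self, width=272, depth=5, seed=0):
        # Rough guide: width ~ ceil(e / epsilon), depth ~ ceil(ln(1 / delta)).
        # These defaults correspond to roughly epsilon = delta = 0.01.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]
        rng = random.Random(seed)
        # One hash seed per row, standing in for pairwise-independent hashes.
        self.seeds = [rng.randrange(1 << 31) for _ in range(depth)]

    def _buckets(self, item):
        # Map the item to one counter per row.
        for row, s in enumerate(self.seeds):
            yield row, hash((s, item)) % self.width

    def update(self, item, count=1):
        # Process one stream element; the element itself is never stored.
        for row, col in self._buckets(item):
            self.table[row][col] += count

    def query(self, item):
        # Never underestimates; overestimates by at most epsilon * (stream
        # length) with probability at least 1 - delta.
        return min(self.table[row][col] for row, col in self._buckets(item))


if __name__ == "__main__":
    cms = CountMinSketch()
    stream = ["a"] * 1000 + ["b"] * 10 + ["c"]
    random.shuffle(stream)
    for x in stream:
        cms.update(x)
    print(cms.query("a"))  # close to 1000, and never below it
```

The design choice that makes this a streaming method is that each element is touched once, in constant time per row, and then discarded; only the small counter table persists, regardless of how long the stream runs.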