Wide-area streaming analytics: distributing the data cube

Authors:
Benjamin Heintz;Abhishek Chandra;Ramesh K. Sitaraman
Affiliations:
University of Minnesota;University of Minnesota;University of Massachusetts & Akamai Technologies
Venue:
Proceedings of the 4th annual Symposium on Cloud Computing
Year:
2013

References:
[1] Implementing data cubes efficiently. SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data.
[2] Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals. Data Mining and Knowledge Discovery.
[3] ROLAP implementations of the data cube. ACM Computing Surveys (CSUR).
[4] The Akamai network: a platform for high-performance internet applications. ACM SIGOPS Operating Systems Review.
[5] Making every bit count in wide-area analytics. HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems.

Abstract

To date, much research in data-intensive computing has focused on batch computation. Increasingly, however, it is necessary to derive knowledge from big data streams. As a motivating example, consider a content delivery network (CDN) such as Akamai [4], comprising thousands of servers in hundreds of globally distributed locations. Each of these servers produces a stream of log data, recording, for example, every user it serves, along with each video stream they access, when they play and pause streams, and more. Each server also records network- and system-level data such as TCP connection statistics. In aggregate, the servers produce billions of lines of log data from over a thousand locations daily.