Aggregation and degradation in JetStream: streaming analytics in the wide area
NSDI'14 Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation
Many data sets, such as system logs, are generated from widely distributed locations. Current distributed systems often discard this data because they lack the ability to backhaul it efficiently, or to do anything meaningful with it at the distributed sites. This leads to lost functionality, efficiency, and business opportunities. The problem with traditional backhaul approaches is that they are slow and costly, and require analysts to define the data they are interested in up-front. We propose a new architecture that stores data at the edge (i.e., near where it is generated) and supports rich real-time and historical queries on this data, while adjusting data quality to cope with the vagaries of wide-area bandwidth. In essence, this design transforms a distributed data collection system into a distributed data analysis system, where decisions about collection do not preclude decisions about analysis.
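The adaptive-quality idea above can be sketched in code: an edge node buffers records and, before each upload, coarsens its aggregation granularity until the summary fits the currently available wide-area bandwidth. The class below is a minimal illustration of this pattern, not the paper's actual implementation; the granularity ladder, byte costing, and `flush` policy are all assumed for the example.

```python
# Hypothetical sketch of edge-side quality degradation: buffer raw
# records locally, then coarsen time-bucket granularity until the
# aggregated summary fits a per-window bandwidth budget.
# (Illustrative only; not the system's real interface.)

from collections import defaultdict


class EdgeAggregator:
    """Buffers (timestamp, value) records at an edge site and degrades
    summary resolution to match available wide-area bandwidth."""

    # Candidate bucket widths in seconds, finest first (assumed ladder).
    GRANULARITIES = [1, 10, 60, 600]

    def __init__(self, bytes_per_bucket=16):
        self.records = []
        self.bytes_per_bucket = bytes_per_bucket  # assumed wire cost per bucket

    def insert(self, timestamp, value):
        # Raw data is stored at the edge; nothing is discarded at ingest.
        self.records.append((timestamp, value))

    def aggregate(self, bucket_secs):
        # Sum values into fixed-width time buckets.
        buckets = defaultdict(float)
        for ts, v in self.records:
            buckets[ts - ts % bucket_secs] += v
        return dict(buckets)

    def flush(self, budget_bytes):
        """Pick the finest granularity whose summary fits the budget;
        fall back to the coarsest if none do. Returns (width, summary)."""
        for g in self.GRANULARITIES:
            summary = self.aggregate(g)
            if len(summary) * self.bytes_per_bucket <= budget_bytes:
                break
        self.records.clear()  # raw history could instead be kept for later queries
        return g, summary
```

Under a generous budget the node ships fine-grained, per-second buckets; when bandwidth tightens, the same stored data is shipped at coarser resolution, so the collection decision never precludes later analysis.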