Incremental maintenance of views with duplicates
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Maintenance of data cubes and summary tables in a warehouse
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Data Integration using Self-Maintainable Views
EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Performance Issues in Incremental Warehouse Maintenance
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Stateful bulk processing for incremental analytics
Proceedings of the 1st ACM symposium on Cloud computing
DryadInc: reusing work in large-scale computations
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
Large-scale incremental processing using distributed transactions and notifications
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Distributed cube materialization on holistic measures
ICDE '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
i2MapReduce: incremental iterative MapReduce
Proceedings of the 2nd International Workshop on Cloud Intelligence
This paper explores the application of view maintenance techniques in a MapReduce environment. Abstractly, a MapReduce program can be seen as a view definition and its computed result as a materialized view. To date, MapReduce programs must be re-executed to obtain up-to-date results after the base data has changed; that is, the view is recomputed from scratch. We present a case study based on typical MapReduce programs mentioned in Google's original MapReduce paper. By adapting view maintenance techniques, we were able to update results incrementally, which was considerably more efficient than full recomputation. Based on the case study, we develop a general solution for the incremental maintenance of the class of MapReduce programs that compute self-maintainable aggregates.
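To make the idea of a self-maintainable aggregate concrete, the following is a minimal single-process sketch in Python. It is not the paper's implementation. It assumes a word-count job as the "view", and the function names (map_words, reduce_counts, full_recompute, apply_delta) are hypothetical and chosen only for illustration. Inserted input contributes +1 per word and deleted input contributes -1, so the stored result can be updated from the delta alone, without rereading the unchanged base data.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def map_words(doc: str, sign: int = 1) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, +1) for inserted text or (word, -1) for deleted text."""
    for word in doc.split():
        yield word, sign


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Reduce step: sum the signed contributions for each key."""
    totals: Counter = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def full_recompute(docs: Iterable[str]) -> Counter:
    """Baseline: rebuild the whole view from all base data."""
    return reduce_counts(pair for doc in docs for pair in map_words(doc))


def apply_delta(view: Counter, inserted: Iterable[str], deleted: Iterable[str]) -> Counter:
    """Incremental maintenance: run map/reduce over the delta only, then merge."""
    delta = reduce_counts(
        [pair for doc in inserted for pair in map_words(doc, +1)]
        + [pair for doc in deleted for pair in map_words(doc, -1)]
    )
    updated = Counter(view)
    for key, change in delta.items():
        updated[key] += change
        if updated[key] == 0:
            del updated[key]  # drop keys whose count fell to zero
    return updated


if __name__ == "__main__":
    base = ["a b a", "b c"]
    view = full_recompute(base)
    # Replace the document "a b a" with "c d": one deletion, one insertion.
    updated = apply_delta(view, inserted=["c d"], deleted=["a b a"])
    print(dict(updated) == dict(full_recompute(["b c", "c d"])))
```

The merge works because COUNT and SUM can be maintained from the old result and the delta alone. An aggregate such as MIN or MAX would not fit this sketch as written, since deleting the current extreme value would require going back to the base data.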