ACM Computing Surveys (CSUR)
Eddies: continuously adaptive query processing
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Database Management Systems
ACM Transactions on Computer Systems (TOCS)
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
Data Mining and Knowledge Discovery
Automated Selection of Materialized Views and Indexes in SQL Databases
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Data placement in shared-nothing parallel database systems
The VLDB Journal — The International Journal on Very Large Data Bases
Interpreting the data: Parallel analysis with Sawzall
Scientific Programming - Dynamic Grids and Worldwide Computing
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Dryad: distributed data-parallel programs from sequential building blocks
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Foundations and Trends in Databases
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Diamond: a storage architecture for early discard in interactive search
FAST'04 Proceedings of the 3rd USENIX conference on File and storage technologies
MapReduce optimization using regulated dynamic prioritization
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Comet: batched stream processing for data intensive distributed computing
Proceedings of the 1st ACM symposium on Cloud computing
Nephele/PACTs: a programming model and execution framework for web-scale analytical processing
Proceedings of the 1st ACM symposium on Cloud computing
Towards automatic optimization of MapReduce programs
Proceedings of the 1st ACM symposium on Cloud computing
A comparison of join algorithms for log processing in MaPreduce
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
HotOS'09 Proceedings of the 12th conference on Hot topics in operating systems
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Scripting the cloud with skywriting
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
MRShare: sharing across multiple queries in MapReduce
Proceedings of the VLDB Endowment
Nectar: automatic management of data and computation in datacenters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Column-oriented storage techniques for MapReduce
Proceedings of the VLDB Endowment
Processing theta-joins using MapReduce
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Llama: leveraging columnar storage for scalable join processing in the MapReduce framework
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
CoHadoop: flexible data placement and its exploitation in Hadoop
Proceedings of the VLDB Endowment
Adaptive MapReduce using situation-aware mappers
Proceedings of the 15th International Conference on Extending Database Technology
Stubby: a transformation-based optimizer for MapReduce workflows
Proceedings of the VLDB Endowment
Spotting code optimizations in data-parallel pipelines through PeriSCOPE
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Just-in-time data distribution for analytical query processing
ADBIS'12 Proceedings of the 16th East European conference on Advances in Databases and Information Systems
Modeling and optimizing large-scale data flows
Future Generation Computer Systems
Hi-index | 0.00 |
Large-scale parallel dataflow systems, e.g., Dryad and Map-Reduce, have attracted significant attention recently. High-level dataflow languages such as Pig Latin and Sawzall are being layered on top of these systems, to enable faster program development and more maintainable code. These languages engender greater transparency in program structure, and open up opportunities for automatic optimization. This paper proposes a set of optimization strategies for this context, drawing on and extending techniques from the database community.