Spotting code optimizations in data-parallel pipelines through PeriSCOPE

Authors:
Zhenyu Guo;Xuepeng Fan;Rishan Chen;Jiaxing Zhang;Hucheng Zhou;Sean McDirmid;Chang Liu;Wei Lin;Jingren Zhou;Lidong Zhou
Affiliations:
Microsoft Research Asia;Microsoft Research Asia and Huazhong University of Science and Technology;Microsoft Research Asia and Peking University;Microsoft Research Asia;Microsoft Research Asia;Microsoft Research Asia;Microsoft Research Asia and Shanghai Jiao Tong University;Microsoft Bing;Microsoft Bing;Microsoft Research Asia
Venue:
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Year:
2012

Citing 34
Cited 2

MULTILISP: a language for concurrent symbolic computation

ACM Transactions on Programming Languages and Systems (TOPLAS)
Compilers: principles, techniques, and tools

Compilers: principles, techniques, and tools
Efficiently computing static single assignment form and the control dependence graph

ACM Transactions on Programming Languages and Systems (TOPLAS)
Abstract interpretation

ACM Computing Surveys (CSUR)
Advanced compiler design and implementation

Advanced compiler design and implementation
Experimental study of minimum cut algorithms

SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
Optimizing compilers for modern architectures: a dependence-based approach

Optimizing compilers for modern architectures: a dependence-based approach
Conversion of control dependence to data dependence

POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Program slicing

ICSE '81 Proceedings of the 5th international conference on Software engineering
LINQ: reconciling object, relations and XML in the .NET framework

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Interpreting the data: Parallel analysis with Sawzall

Scientific Programming - Dynamic Grids and Worldwide Computing
Map-reduce-merge: simplified relational data processing on large clusters

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
MapReduce: simplified data processing on large clusters

OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Dryad: distributed data-parallel programs from sequential building blocks

Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Pig latin: a not-so-foreign language for data processing

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Automatic optimization of parallel dataflow programs

ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
SCOPE: easy and efficient parallel processing of massive data sets

Proceedings of the VLDB Endowment
Inferring Dataflow Properties of User Defined Table Processors

SAS '09 Proceedings of the 16th International Symposium on Static Analysis
Distributed aggregation for data-parallel computing: interfaces and implementations

Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Building a high-level dataflow system on top of Map-Reduce: the Pig experience

Proceedings of the VLDB Endowment
Hive: a warehousing solution over a map-reduce framework

Proceedings of the VLDB Endowment
HadoopToSQL: a mapReduce query optimizer

Proceedings of the 5th European conference on Computer systems
On the relative completeness of bytecode analysis versus source code analysis

CC'08/ETAPS'08 Proceedings of the Joint European Conferences on Theory and Practice of Software 17th international conference on Compiler construction
FlumeJava: easy, efficient data-parallel pipelines

PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
MapReduce online

NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
DryadLINQ: a system for general-purpose distributed data-parallel computing using a high-level language

OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
HaLoop: efficient iterative data processing on large clusters

Proceedings of the VLDB Endowment
Automatic optimization for MapReduce programs

Proceedings of the VLDB Endowment
Nova: continuous Pig/Hadoop workflows

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Optimizing data partitioning for data-parallel computing

HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Steno: automatic optimization of declarative queries

Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Re-optimizing data-parallel computing

NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Optimizing data shuffling in data-parallel computation by understanding user-defined functions

NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
SCOPE: parallel databases meet MapReduce

The VLDB Journal — The International Journal on Very Large Data Bases

Coflow: a networking abstraction for cluster applications

Proceedings of the 11th ACM Workshop on Hot Topics in Networks
Leveraging endpoint flexibility in data-intensive clusters

Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM

Quantified Score

Hi-index	0.00

Visualization

Abstract

To minimize the amount of data-shuffling I/O that occurs between the pipeline stages of a distributed data-parallel program, its procedural code must be optimized with full awareness of the pipeline that it executes in. Unfortunately, neither pipeline optimizers nor traditional compilers examine both the pipeline and procedural code of a data-parallel program so programmers must either hand-optimize their program across pipeline stages or live with poor performance. To resolve this tension between performance and programmability, this paper describes PeriSCOPE, which automatically optimizes a data-parallel program's procedural code in the context of data flow that is reconstructed from the program's pipeline topology. Such optimizations eliminate unnecessary code and data, perform early data filtering, and calculate small derived values (e.g., predicates) earlier in the pipeline, so that less data--sometimes much less data--is transferred between pipeline stages. We describe how PeriSCOPE is implemented and evaluate its effectiveness on real production jobs.