Compiler supported high-level abstractions for sparse disk-resident datasets
ICS '02 Proceedings of the 16th international conference on Supercomputing
Time and space optimization for processing groups of multi-dimensional scientific queries
Proceedings of the 18th annual international conference on Supercomputing
Abstract: Processing and analyzing large volumes of data play an increasingly important role in many domains of scientific research. We are developing a compiler that processes data-intensive applications written in a dialect of Java and compiles them for efficient execution on distributed-memory parallel machines. In this paper, we focus on the problem of generating correct and efficient communication for data-intensive applications. We present static analysis techniques for 1) extracting a global reduction function from a data-parallel loop, and 2) determining whether a subscript function is monotonic. We also present a runtime technique for reducing the volume of communication during the global reduction phase. We have experimented with two data-intensive applications to evaluate the efficacy of our techniques. Our results show that 1) our techniques for extracting global reduction functions and establishing monotonicity of subscript functions can successfully handle these applications, 2) our runtime analysis technique achieves a significant reduction in communication volume and execution time, and 3) runtime communication analysis is critical for achieving speedups on parallel configurations.
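To make the two notions in the abstract concrete, the following is a minimal Java sketch, not the compiler's actual output or analysis. It assumes a histogram-style reduction in which each node folds its own partition into a private array (the local phase) and the per-node arrays are then summed element-wise (the global reduction function). The class name `ReductionSketch`, its method names, and the sample data are all hypothetical. The monotonicity check shown is a brute-force runtime test over a finite range, used here only to illustrate the property; the paper establishes it through static analysis.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

// Hypothetical illustration; names and data are assumptions, not from the paper.
class ReductionSketch {
    // Local phase: one node folds its own partition into a private histogram.
    static int[] localReduce(int[] partition, int bins) {
        int[] hist = new int[bins];
        for (int v : partition) {
            hist[Math.floorMod(v, bins)]++;
        }
        return hist;
    }

    // Global phase: combine the per-node histograms by element-wise addition.
    // Addition is associative and commutative, so the combine order is irrelevant.
    static int[] globalReduce(int[][] locals, int bins) {
        int[] result = new int[bins];
        for (int[] local : locals) {
            for (int b = 0; b < bins; b++) {
                result[b] += local[b];
            }
        }
        return result;
    }

    // Brute-force check that f is non-decreasing on the integers lo..hi.
    // This is a runtime stand-in for the static analysis the abstract describes.
    static boolean isMonotonicNonDecreasing(IntUnaryOperator f, int lo, int hi) {
        for (int i = lo; i < hi; i++) {
            if (f.applyAsInt(i) > f.applyAsInt(i + 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int bins = 3;
        // Three partitions stand in for the data held by three nodes.
        int[][] partitions = { {0, 1, 2, 5}, {1, 1, 4}, {3, 5, 5} };
        int[][] locals = new int[partitions.length][];
        for (int p = 0; p < partitions.length; p++) {
            locals[p] = localReduce(partitions[p], bins);
        }
        System.out.println(Arrays.toString(globalReduce(locals, bins)));
        System.out.println(isMonotonicNonDecreasing(i -> 2 * i + 1, 0, 100));
    }
}
```

In this sketch only the small per-node arrays would cross the network, never the raw partitions. On a real distributed-memory machine the element-wise sum would be carried out by a collective communication step rather than a local loop.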