Compiling Fortran D for MIMD distributed-memory machines
Communications of the ACM
SIGSOFT '94 Proceedings of the 2nd ACM SIGSOFT symposium on Foundations of software engineering
A model and compilation strategy for out-of-core data parallel programs
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Input/output characteristics of scalable parallel applications
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
The Vesta parallel file system
ACM Transactions on Computer Systems (TOCS)
Automatic compiler-inserted I/O prefetching for out-of-core applications
OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
The virtual microscope
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
Improving the Performance of Out-of-Core Computations
ICPP '97 Proceedings of the international Conference on Parallel Processing
Infrastructure for Building Parallel Database Systems for Multi-Dimensional Data
IPPS '99/SPDP '99 Proceedings of the 13th International Symposium on Parallel Processing and the 10th Symposium on Parallel and Distributed Processing
Compiler support for out-of-core arrays on parallel machines
FRONTIERS '95 Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers'95)
Optimizing array reference checking in Java programs
IBM Systems Journal
Compiler supported high-level abstractions for sparse disk-resident datasets
ICS '02 Proceedings of the 16th international conference on Supercomputing
Processing large-scale multi-dimensional data in parallel and distributed environments
Parallel Computing - Parallel data-intensive algorithms and applications
Compiler-Directed I/O Optimization
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Compiling Data Intensive Applications with Spatial Coordinates
LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
Compiler support for efficient processing of XML datasets
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Time and space optimization for processing groups of multi-dimensional scientific queries
Proceedings of the 18th annual international conference on Supercomputing
IEEE Transactions on Knowledge and Data Engineering
Supporting SQL-3 aggregations on grid-based data repositories
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Hi-index | 0.00 |
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. High-level language and compiler support for developing applications that analyze and process such datasets has, however, been lacking so far.In this paper, we present a set of language extensions and a prototype compiler for supporting high-level object-oriented programming of data intensive reduction operations over multidimensional data. We have chosen a dialect of Java with data-parallel extensions for specifying collection of objects, a parallel for loop, and reduction variables as our source high-level language. Our compiler analyzes parallel loops and optimizes the processing of datasets through the use of an existing run-time system, called Active Data Repository (ADR). We show how loop fission followed by interprocedural static program slicing can be used by the compiler to extract required information for the run-time system. We present the design of a compiler/n-time interface which allows the compiler to effectively utilize the existing run-time system.A prototype compiler incorporating these techniques has been developed using the Titanium front-end from Berkeley. We have evaluated this compiler by comparing the performance of compiler generated code with hand customized ADR code for three templates, from the areas of digital microscopy and scientific simulations. Our experimental results show that the performance of compiler generated versions is, on the average 21% lower, and in all cases within a factor of two, of the performance of hand coded versions.