A production-quality C* compiler for Hypercube multicomputers
PPOPP '91: Proceedings of the Third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
A data parallel language such as C* has a number of advantages over conventional hypercube programming languages. The algorithm design process is simpler, because (1) message passing is invisible, (2) race conditions are nonexistent, and (3) the data can be put into a one-to-one correspondence with the virtual processors. Because data are mapped to virtual processors rather than physical processors, it is also easier to port an algorithm implemented on a hypercube of one size to a larger or smaller system.

We outline the design of a C* compiler for a hypercube multicomputer. Our design goals are to minimize the time spent synchronizing, to limit the number of interprocessor communications, and to make each physical processor's emulation of a set of virtual processors as efficient as possible.

We have hand-translated three benchmark programs and compared their performance with that of ordinary C programs. All three programs (matrix multiplication, LU decomposition, and hyperquicksort) achieve reasonable speedup on a commercial hypercube, even when solving problems of modest size. On a 64-processor NCUBE/7, the C* matrix multiplication program achieves a speedup of 27 when multiplying two 64 × 64 matrices, the hyperquicksort program achieves a speedup of 10 when sorting 16,384 integers, and the LU decomposition program attains a speedup of 7 when decomposing a 256 × 256 system of linear equations. We believe the degradation in performance resulting from the use of a data parallel language will be more than offset by the increase in programmer productivity.