There is a need for compiler technology that, given the source program, will generate efficient parallel code for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet the use of parallel computation is limited by the high cost of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable, architecture-independent software for scientific computation, based on our experience with the equational programming language (EPL). Our approach rests on program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.
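To make the wavefront-determination optimization concrete, the following is a minimal sketch (not the authors' implementation) of hyperplane scheduling for a 2-D loop nest. It assumes the common dependence pattern in which iteration (i, j) depends on (i-1, j) and (i, j-1); under that assumption the schedule t(i, j) = i + j groups iterations into anti-diagonal wavefronts whose members are mutually independent and can execute in parallel.

```python
def wavefronts(n, m):
    """Group iterations of an n x m loop nest by the hyperplane i + j.

    Assumes dependence vectors (1, 0) and (0, 1), so every iteration on
    the anti-diagonal i + j = t depends only on iterations with a
    strictly smaller hyperplane value.
    """
    fronts = {}
    for i in range(n):
        for j in range(m):
            fronts.setdefault(i + j, []).append((i, j))
    # Wavefronts must be executed in increasing hyperplane order;
    # iterations within one front may run concurrently.
    return [fronts[t] for t in sorted(fronts)]

# Example: a 3 x 3 nest yields 5 wavefronts of sizes 1, 2, 3, 2, 1.
for front in wavefronts(3, 3):
    print(front)
```

A compiler applying this transformation would emit the outer loop over wavefronts sequentially and the inner loop over each front as a parallel (e.g. DOALL) loop.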