Modern dialects of Fortran enjoy wide use and good support on high-performance computers as performance-oriented programming languages. By providing the ability to express nested data parallelism, modern Fortran dialects enable irregular computations to be incorporated into existing applications with minimal rewriting and without sacrificing performance within the regular portions of the application. Because the performance of nested data-parallel computation is unpredictable and often poor with current compilers, we investigate threading and flattening, two source-to-source transformation techniques that can improve both performance and performance stability. To validate these techniques experimentally, we explore nested data-parallel implementations of the sparse matrix-vector product and the Barnes-Hut $n$-body algorithm. We hand-code thread-based (using OpenMP directives) and flattening-based versions of these algorithms and evaluate their performance on two shared-memory machines, an SGI Origin 2000 and an NEC SX-4.
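To make the flattening idea concrete, the following minimal sketch contrasts a nested formulation of the sparse matrix-vector product with a flattened one. It is written in Python for brevity and is not the Fortran code evaluated in the paper. The compressed-row layout (`row_ptr`, `col_idx`, `vals`), the function names, and the use of a prefix sum to realize the segmented sum are all assumptions made for illustration.

```python
from itertools import accumulate


def spmv_nested(row_ptr, col_idx, vals, x):
    """Nested form: an outer loop over rows, an inner reduction per row.

    Rows may have very different lengths, which is the source of the
    irregularity in this computation.
    """
    return [
        sum(vals[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def spmv_flattened(row_ptr, col_idx, vals, x):
    """Flattened form: one flat elementwise product, then a segmented sum.

    Hypothetical illustration: the segmented sum is realized here with an
    inclusive prefix sum and differences taken at the row boundaries.
    """
    # Step 1: a single flat data-parallel multiply over all nonzeros.
    products = [v * x[c] for v, c in zip(vals, col_idx)]
    # Step 2: prefix sum over the flat product vector (leading 0.0 so that
    # prefix[k] is the sum of the first k products).
    prefix = [0.0] + list(accumulate(products))
    # Step 3: each row's result is the difference across its segment.
    return [prefix[row_ptr[i + 1]] - prefix[row_ptr[i]]
            for i in range(len(row_ptr) - 1)]


# Example 4x3 matrix with an empty row (assumed data):
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 0],
#  [4, 5, 6]]
ROW_PTR = [0, 2, 2, 3, 6]
COL_IDX = [0, 2, 1, 0, 1, 2]
VALS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X = [1.0, 2.0, 3.0]
```

In the nested form, the work per outer iteration depends on the row length, so a thread-based version would parallelize the outer loop and rely on the scheduler to balance uneven rows. In the flattened form, both steps operate on flat vectors whose length is the number of nonzeros, regardless of how those nonzeros are distributed across rows. Note that a prefix-sum-based segmented sum can round differently from per-row summation in floating point; the example data uses small integer values so both forms agree exactly.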