We describe an implementation of a sizable subset of OpenMP on networks of workstations (NOWs). By extending the availability of OpenMP to NOWs, we overcome one of its primary drawbacks compared to MPI, namely its lack of portability to environments other than hardware shared memory machines. To support OpenMP execution on NOWs, our compiler targets a software distributed shared memory (DSM) system, which provides multi-threaded execution and memory consistency.

This paper presents two contributions. First, we identify two aspects of the current OpenMP standard that make an implementation on NOWs hard, and suggest simple modifications to the standard that remedy the situation. These problems reflect differences in memory architecture between software and hardware shared memory and the high cost of synchronization on NOWs. Second, we present performance results of a prototype implementation of an OpenMP subset on a NOW, and compare them with hand-coded software DSM and MPI results for the same applications on the same platform. We use five applications (ASCI Sweep3d, NAS 3D-FFT, SPLASH-2 Water, QSORT, and TSP) exhibiting various styles of parallelization, including pipelined execution, data parallelism, coarse-grained parallelism, and task queues. The measurements show little difference between OpenMP and hand-coded software DSM, but both still lag behind MPI. Further work will concentrate on compiler optimization to reduce these differences.