Communications of the ACM - Special issue on parallelism
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Runtime compilation techniques for data partitioning and communication schedule reuse
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Journal of Parallel and Distributed Computing - Special issue on scalability of parallel algorithms and architectures
The Stanford FLASH multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tempest and typhoon: user-level shared memory
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Where is time spent in message-passing and shared-memory programs?
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Fine-grain access control for distributed shared memory
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A parallel software infrastructure for structured adaptive mesh methods
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Interprocedural compilation of irregular applications for distributed memory machines
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Architectural mechanisms for explicit communication in shared memory multiprocessors
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Teapot: language support for writing memory coherence protocols
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Run-time and compile-time support for adaptive irregular problems
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Application-specific protocols for user-level shared memory
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Eliminating Barrier Synchronization for Compiler-Parallelized Codes on Software DSMs
International Journal of Parallel Programming
Enhancing Software DSM for Compiler-Parallelized Applications
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Compile-time Synchronization Optimizations for Software DSMs
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Switch Design to Enable Predictive Multiplexed Switching in Multiprocessor Networks
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Combined compile-time and runtime-driven, pro-active data movement in software DSM systems
LCR '04 Proceedings of the 7th workshop on Workshop on languages, compilers, and run-time support for scalable systems
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
Many scientific applications are iterative and specify repetitive communication patterns. This paper shows how a parallel-language compiler and custom cache-coherence protocols in a distributed shared memory system together can implement shared-memory communication efficiently for applications with unpredictable but repetitive communication patterns. The compiler uses data-flow analysis to identify program points where repetitive communication occurs. At runtime, the custom protocol builds communication schedules in one iteration and uses it to pre-send data in following iterations. This paper contains measurements on three iterative applications (including adaptive programs with unstructured data accesses) to show that custom protocols increase the number of shared-data requests satisfied locally, thus reducing the amount of time spent waiting for remote data.