Multicore designs have emerged as the dominant organization for future high-performance microprocessors. Communication in such designs is often enabled by Networks-on-Chip (NoCs). A recent trend in such architectures is to map the Message Passing Interface (MPI) programming model onto NoCs to achieve optimal parallel application performance. A key issue in designing MPI over NoCs is the communication protocol, which has not been explored in previous research. This article advocates a hardware-supported communication mechanism that uses a protocol-adaptive approach to adjust to varying NoC configurations (e.g., number of buffers) and workload behavior (e.g., number of messages). We propose the ADaptive Communication Mechanism (ADCM), a hybrid protocol that behaves like buffered communication when sufficient buffer space is available at the receiver and like a synchronous protocol when receiver buffers are limited. ADCM adapts dynamically by choosing the communication protocol on a per-request basis, using a local estimate of recent buffer utilization. ADCM attempts to combine the advantages of the buffered and synchronous communication modes to achieve enhanced throughput and performance. Simulations of various workloads show that the proposed communication mechanism can be effectively used in future NoC designs.
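
The per-request decision described above can be illustrated with a minimal software sketch. It is not the hardware mechanism proposed in the article. The abstract does not say how the local estimate is computed, so the exponential moving average, the 0.75 threshold, the 0.5 smoothing weight, and every name below (AdaptiveSender, observe, choose_protocol) are assumptions made only for illustration.

```python
from dataclasses import dataclass

BUFFERED = "buffered"
SYNCHRONOUS = "synchronous"


@dataclass
class AdaptiveSender:
    """Hypothetical sender-side protocol selector (illustrative only)."""

    threshold: float = 0.75  # assumed cutoff on estimated utilization
    alpha: float = 0.5       # assumed weight given to the newest sample
    estimate: float = 0.0    # local estimate of recent receiver buffer utilization

    def observe(self, used: int, capacity: int) -> None:
        """Fold one utilization sample into the running estimate."""
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        sample = used / capacity
        # Exponential moving average: an assumed estimator, not the paper's.
        self.estimate = self.alpha * sample + (1.0 - self.alpha) * self.estimate

    def choose_protocol(self) -> str:
        """Pick the protocol for the next request from the current estimate."""
        # Low estimated utilization: receiver likely has room, so send buffered.
        # High estimated utilization: fall back to a synchronous handshake.
        return BUFFERED if self.estimate < self.threshold else SYNCHRONOUS
```

In this sketch a sender would call observe whenever it learns the receiver's buffer occupancy and call choose_protocol before each send, so the choice can switch between modes from one request to the next as the estimate rises or falls.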