Heterogeneous computing systems, in which accelerators such as FPGAs, GPUs, and manycore processors are coupled with standard microprocessors, are becoming an increasingly popular choice for future computing systems due to their higher performance and energy efficiency. Although programming languages and tools are evolving to simplify device-level design, programming such systems remains difficult and time-consuming, largely due to system-wide challenges involving communication between heterogeneous devices, which currently require ad hoc solutions. Most communication frameworks and APIs that have dominated parallel application development for decades were designed for homogeneous systems and hence cannot be directly employed in hybrid systems. To address this problem, this article presents the System Coordination Framework (SCF), which employs message passing to transparently enable communication between tasks described using different programming tools (and languages) and running on heterogeneous processing devices in systems ranging from embedded platforms to high-performance computing (HPC) systems. By hiding the low-level architectural details of the underlying communication from the application designer, SCF improves application-development productivity, provides greater application portability, and enables rapid design-space exploration of different task/device mappings. In addition, SCF supports custom communication synthesis that exploits mechanisms specific to particular devices and platforms, which can yield performance improvements over the generic solutions employed previously. Our results show performance improvements of 28× and 682× from employing FPGA devices for the two applications presented in this article, while simultaneously improving developer productivity by approximately 2.5 to 5 times through the use of SCF.
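To make the coordination idea concrete, the following is a minimal, hypothetical sketch (not the actual SCF API, whose interface is not given in the abstract) of the style of transparency it describes: tasks communicate only through named channels, and a framework-side dispatch function chooses a transport based on where each task is mapped, so task code is unchanged when the task/device mapping changes. All names (`Channel`, `select_transport`, the device labels) are invented for illustration, and the "remote" transport is stubbed with an in-memory queue standing in for a device-specific mechanism such as DMA or a network link.

```python
# Hypothetical sketch of SCF-style coordination; names and API are invented.
from queue import Queue

class Channel:
    """A named link between two tasks; the transport is chosen by the framework,
    not by the task code."""
    def __init__(self, name, transport):
        self.name = name
        self._transport = transport  # e.g., shared memory, PCIe DMA, Ethernet
    def send(self, msg):
        self._transport.put(msg)
    def recv(self):
        return self._transport.get()

def select_transport(src_device, dst_device):
    # A mapping file would decide task placement; here we model only the
    # dispatch. Same device -> fast in-memory path; otherwise a stand-in
    # for a device-specific remote transport (DMA, MPI, sockets, ...).
    if src_device == dst_device:
        return Queue()  # intra-device path
    return Queue()      # stub for an inter-device path

# Task code is mapping-agnostic: it sees only Channel.send/recv.
def producer(ch):
    for i in range(3):
        ch.send(i * i)

def consumer(ch, n):
    return [ch.recv() for _ in range(n)]

ch = Channel("squares", select_transport("cpu0", "fpga0"))
producer(ch)
print(consumer(ch, 3))  # [0, 1, 4]
```

Remapping the consumer from "fpga0" to "cpu0" changes only the argument to `select_transport`, not the producer or consumer code, which is the portability and design-space-exploration property the abstract claims for SCF.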