Application accelerators can include GPUs, Cell processors, FPGAs and other devices based on custom application-specific integrated circuits (ASICs). A number of challenges arise when these devices must be integrated into a single computing environment, relating both to the diversity of the devices and to the programming models they support. One key challenge we consider here is selecting the most appropriate device for accelerating a particular application. Our approach uses a broker-based matchmaking system that compares the capability of a device with the requirements of one or more application kernels, and relies on the CometCloud tuple space-based coordination mechanism to facilitate the matchmaking process. We describe the architecture of our system and how it uses performance prediction to select devices for particular application kernels. We demonstrate that, within a highly dynamic HPC system, our approach can increase application performance by using code porting techniques to target the most suitable device found, while also (a) allowing new devices to be added to the system dynamically, and (b) allowing applications to fall back to the best alternative device available if the preferred device cannot be found or is unavailable.
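The selection behaviour described above can be illustrated with a minimal sketch. It is not the system's implementation and does not use the CometCloud API: a plain in-memory dictionary stands in for the tuple space, and the names `Device`, `Broker`, `register` and `match`, as well as the predicted runtimes, are hypothetical and chosen only for illustration. The sketch assumes that performance prediction yields one estimated runtime per device kind for a given kernel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Device:
    """A hypothetical accelerator offer held by the broker."""
    name: str
    kind: str              # e.g. "gpu", "fpga", "cell"
    available: bool = True


class Broker:
    """Toy matchmaker; a dict stands in for the tuple space (assumption)."""

    def __init__(self) -> None:
        self.devices: Dict[str, Device] = {}

    def register(self, device: Device) -> None:
        # Devices may be registered at any time (dynamic addition).
        self.devices[device.name] = device

    def rank(self, predicted_seconds: Dict[str, float]) -> List[Device]:
        # Keep only available devices for which a prediction exists,
        # then order them by predicted runtime, fastest first.
        candidates = [
            d for d in self.devices.values()
            if d.available and d.kind in predicted_seconds
        ]
        return sorted(candidates, key=lambda d: predicted_seconds[d.kind])

    def match(self, predicted_seconds: Dict[str, float]) -> Optional[Device]:
        # Best available device, or None if nothing suitable is registered.
        ranked = self.rank(predicted_seconds)
        return ranked[0] if ranked else None


if __name__ == "__main__":
    broker = Broker()
    gpu = Device("gpu0", "gpu")
    broker.register(gpu)
    broker.register(Device("fpga0", "fpga"))

    # Hypothetical predicted runtimes (seconds) for one kernel.
    prediction = {"gpu": 1.2, "fpga": 3.5, "cell": 2.0}

    print(broker.match(prediction).name)   # preferred device
    gpu.available = False
    print(broker.match(prediction).name)   # fallback device
    broker.register(Device("cell0", "cell"))
    print(broker.match(prediction).name)   # newly added device is considered
```

Ranking every available device, rather than storing a single preferred one, is what makes the fallback case fall out naturally: when the preferred device disappears, the next entry in the ranking is used, and a newly registered device is considered on the next match.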