The first part of this paper surveys co-processor approaches for commodity-based clusters in general, considering not only raw performance but also system integration and power consumption. We then extend previous work on a small GPU cluster by exploring the heterogeneous hardware approach on a large-scale system with up to 160 nodes. Starting with a conventional commodity-based cluster, we leverage the high bandwidth of graphics processing units (GPUs) to increase the overall system bandwidth, which is the decisive performance factor in this scenario. As a result, even the addition of low-end, out-of-date GPUs leads to improvements in both performance- and power-related metrics.