Multi-core platforms have proven able to accelerate numerous HPC applications, but programming data-intensive applications on such platforms remains a hard and unsolved problem. Modern processors not only favor compute-intensive code, they also have diverse architectures and incompatible programming models. Even after the difficult choice of a platform, extensive programming effort must be invested with an uncertain performance outcome. By working through an irregular, data-intensive application, we evaluate three platform types: the generic multi-core CPU, the STI Cell/B.E., and the GPU. We compare these platforms in terms of application performance, programming effort, and cost. Although we do not select a clear winner, we provide a list of guidelines to assist in platform choice and in the development of similar data-intensive applications.