Applications exhibit ever-increasing computational demands, while a large variety of parallel processor systems is available. In this paper we focus on exploiting fine-grain parallelism in three applications with distinct characteristics: a bioinformatics application (MrBayes), a molecular dynamics application (NAMD), and a database application (TPC-H). We assess, side by side, the performance of the three applications on general-purpose multi-core processors, the Cell Broadband Engine (Cell/BE), and Graphics Processing Units (GPUs). Our results indicate that application performance depends on the characteristics of the parallel architecture and on the computational requirements of each application's core functions. For MrBayes, the best overall performance is achieved on general-purpose multi-core processors; for NAMD, on the Cell/BE; and for TPC-H, on GPUs.