As multicore computers become mainstream and the demand for parallel software increases, software developers need to know which approaches to parallelism work. A case study in which four teams competitively parallelized the Bzip2 compression algorithm illustrates the difficulties that arise when working with a nonnumeric, real application. The sequential code needed significant restructuring before parallelization could begin, and this restructuring consumed most of the development time. Parallelization at a high level resulted in significant speedups, whereas low-level, inner-loop parallelizations performed poorly. The case study also yielded several other lessons.