The emergence of inexpensive parallel computers powered by multicore chips, combined with stagnating clock rates, raises new challenges for software engineering. As future performance improvements will not come "for free" from increased clock rates, performance-critical applications will need to be parallelized. However, little is known about the engineering principles for parallel general-purpose applications. This paper presents an experience report with four diverse case studies on multicore software development for general-purpose applications. The applications were programmed in different languages and benchmarked on several multicore computers. Empirical findings include: 1) Multicore computers deliver: real speedups are achievable, albeit with significant programming effort, and they are typically lower than the number of cores employed; 2) Massive refactoring of sequential programs is required, sometimes at several levels, and special tools for parallelization refactorings appear to be an important area of research; 3) Autotuning is indispensable, as manually tuning thread assignment, number of pipeline stages, size of data partitions, and other parameters is difficult and error-prone; 4) Architectures that encompass several parallel components are poorly understood, and tunable architectural patterns with parallelism at several levels need to be discovered.