Characterizing the impact of using spare-cores on application performance

Authors:
José Carlos Sancho;Darren J. Kerbyson;Michael Lang
Affiliations:
Barcelona Supercomputing Center, Barcelona, Spain;Pacific Northwest National Laboratory, Richland, WA;Los Alamos National Laboratory, Los Alamos, NM
Venue:
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
Year:
2010

Abstract

Increased parallelism on a single processor is driving improvements in peak performance at both the node and system levels. However, achievable performance, in particular from production scientific applications, is not always directly proportional to the core count. Performance is often limited by constraints in the memory hierarchy and also by node interconnectivity. Even on state-of-the-art processors, containing between four and eight cores, many applications cannot take full advantage of the compute performance of all cores. This trend is expected to grow on future processors as the core count per processor increases. In this work we characterize the use of spare-cores, cores that do not provide any improvements in application performance, on current multi-core processors. By using a pulse-width modulation method, we examine the possible performance profile of using a spare-core and quantify under what situations its use will not impact application performance. We show that, for current AMD and Intel multi-core processors, spare-cores can be used for substantial computational tasks but can impact application performance when using shared caches or when significantly accessing main memory.
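
The abstract names a pulse-width modulation method but does not describe how it is implemented. As a rough illustration only, the sketch below assumes the method means running a task on the spare-core for a fixed fraction (the duty cycle) of each period and idling for the rest, so that the load placed on the spare-core can be swept from 0% to 100%. The function names `pwm_schedule` and `run_pwm_load`, the `work` callback, and all parameter values are hypothetical and are not taken from the paper.

```python
import time


def pwm_schedule(period_s, duty_cycle, n_periods):
    """Return a list of (busy_seconds, idle_seconds) pairs, one per period.

    duty_cycle is the assumed fraction of each period spent busy (0.0 to 1.0).
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0.0 and 1.0")
    busy_s = period_s * duty_cycle
    return [(busy_s, period_s - busy_s)] * n_periods


def run_pwm_load(period_s, duty_cycle, n_periods, work=None):
    """Alternate busy and idle phases; return the number of busy-loop iterations.

    `work` is a hypothetical callback standing in for the spare-core task
    (for example, a compute kernel or a memory-streaming loop).
    """
    iterations = 0
    for busy_s, idle_s in pwm_schedule(period_s, duty_cycle, n_periods):
        deadline = time.perf_counter() + busy_s
        # Busy phase: spin (optionally doing work) until the phase ends.
        while time.perf_counter() < deadline:
            if work is not None:
                work()
            iterations += 1
        # Idle phase: release the core for the remainder of the period.
        if idle_s > 0:
            time.sleep(idle_s)
    return iterations


if __name__ == "__main__":
    # Sweep the duty cycle; in a real experiment the application under test
    # would run on the other cores while its runtime is recorded.
    for duty in (0.0, 0.25, 0.5, 0.75, 1.0):
        count = run_pwm_load(period_s=0.01, duty_cycle=duty, n_periods=20)
        print(f"duty={duty:.2f} busy iterations={count}")
```

In an actual measurement, the load generator would presumably be pinned to the spare-core and the application's runtime recorded at each duty cycle. Pinning is omitted here because it is platform specific.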