Energy-efficient execution of dense linear algebra algorithms on multi-core processors

Authors:
Pedro Alonso;Manuel F. Dolz;Rafael Mayo;Enrique S. Quintana-Ortí
Affiliations:
Dep. de Sistemas Informáticos y Computación, Universitat Politècnica de València, Valencia, Spain 46022;Dep. de Ingeniería y Ciencia de los Computadores, Universitat Jaume I, Castellón, Spain 12071;Dep. de Ingeniería y Ciencia de los Computadores, Universitat Jaume I, Castellón, Spain 12071;Dep. de Ingeniería y Ciencia de los Computadores, Universitat Jaume I, Castellón, Spain 12071
Venue:
Cluster Computing
Year:
2013

Abstract

This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. The two strategies considered here, the Slack Reduction Algorithm (SRA) and the Race-to-Idle Algorithm (RIA), adjust the operating frequency of the cores during the execution of a collection of tasks (into which many dense linear algebra algorithms can be decomposed), taking very different approaches to saving energy. The procedures are evaluated using an energy-aware simulator, which schedules and maps these tasks onto the cores, leveraging the dynamic voltage and frequency scaling (DVFS) featured by current technology. Experiments with this tool, together with the practical integration of the RIA strategy into a runtime, show the energy gains for two versions of the QR factorization.
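
As a rough illustration of the trade-off between the two kinds of strategy, the Python sketch below compares the energy of a single task that has some slack before its successor can start. It assumes, as the names suggest, that a race-to-idle policy runs the task at the highest frequency and then idles, while a slack-reduction policy picks the lowest frequency that still fits the available window. The power model (a static term plus a dynamic term proportional to the cube of the frequency), every numeric value, and all function names are assumptions made for this example; none of them come from the paper or its simulator.

```python
def power(f, p_static, c):
    """Toy power model (assumption): static power plus c * f**3 dynamic power."""
    return p_static + c * f ** 3


def race_to_idle_energy(cycles, window, f_max, p_static, c, p_idle):
    """Run at f_max, then sit idle until the end of the window."""
    busy = cycles / f_max
    if busy > window:
        raise ValueError("task does not fit in the window even at f_max")
    return power(f_max, p_static, c) * busy + p_idle * (window - busy)


def slack_reduction_energy(cycles, window, freqs, p_static, c, p_idle):
    """Run at the lowest available frequency that still fits the window."""
    feasible = [f for f in freqs if cycles / f <= window]
    if not feasible:
        raise ValueError("no available frequency fits the window")
    f = min(feasible)
    busy = cycles / f
    energy = power(f, p_static, c) * busy + p_idle * (window - busy)
    return f, energy


# Hypothetical task: 2.0 units of work, 2.0 time units before the successor starts.
freqs = [1.0, 1.5, 2.0]

# Case A: dynamic power dominates, so stretching the task pays off.
ria_a = race_to_idle_energy(2.0, 2.0, max(freqs), p_static=10.0, c=5.0, p_idle=2.0)
f_a, sra_a = slack_reduction_energy(2.0, 2.0, freqs, p_static=10.0, c=5.0, p_idle=2.0)

# Case B: static power dominates, so finishing early and idling pays off.
ria_b = race_to_idle_energy(2.0, 2.0, max(freqs), p_static=40.0, c=1.0, p_idle=2.0)
f_b, sra_b = slack_reduction_energy(2.0, 2.0, freqs, p_static=40.0, c=1.0, p_idle=2.0)

print("case A:", "race-to-idle", ria_a, "slack-reduction", sra_a, "at f =", f_a)
print("case B:", "race-to-idle", ria_b, "slack-reduction", sra_b, "at f =", f_b)
```

Under these assumed parameters neither policy wins in both cases, which is consistent with the abstract's remark that the two strategies save energy in very different ways.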