Runtime Code Parallelization for On-Chip Multiprocessors

Authors:
M. Kandemir;W. Zhang;M. Karakoy
Affiliations:
Penn State University;Penn State University;Imperial College
Venue:
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Year:
2003


Abstract

Chip multiprocessing (or multiprocessor system-on-a-chip) is a technique that combines two or more processor cores on a single piece of silicon to enhance computing performance. An important problem in executing applications on an on-chip multiprocessor is selecting the most suitable number of processors for a given objective function (e.g., minimizing execution time or energy-delay product) under multiple constraints. Previous research proposed a solution based on integer linear programming (ILP) that relies on exhaustively evaluating each loop nest under all possible processor counts. In this paper, we take a different approach and propose a pure runtime strategy for determining the best number of processors to use. This approach is more general than static techniques and is applicable in situations where they are not.
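The abstract does not say how the runtime strategy makes its choice. As an illustration of the selection problem only, the sketch below assumes a trial-based scheme: each candidate processor count is measured once, and the count that minimizes the chosen objective while satisfying an optional energy constraint is kept. Every name here (`pick_processor_count`, `toy_measure`, `max_energy_j`) and every number in the cost model is hypothetical and is not taken from the paper.

```python
from typing import Callable, Iterable, Optional, Tuple

# A measurement is (execution time in seconds, energy in joules).
Measurement = Tuple[float, float]


def execution_time(time_s: float, energy_j: float) -> float:
    """Objective: execution time only."""
    return time_s


def energy_delay_product(time_s: float, energy_j: float) -> float:
    """Objective: energy-delay product (energy multiplied by time)."""
    return energy_j * time_s


def pick_processor_count(
    candidates: Iterable[int],
    measure: Callable[[int], Measurement],
    objective: Callable[[float, float], float],
    max_energy_j: Optional[float] = None,
) -> int:
    """Return the candidate count with the lowest objective value.

    Candidates whose measured energy exceeds max_energy_j are skipped
    (an assumed example of a constraint).
    """
    best_count = None
    best_score = float("inf")
    for p in candidates:
        time_s, energy_j = measure(p)  # one trial run with p processors
        if max_energy_j is not None and energy_j > max_energy_j:
            continue
        score = objective(time_s, energy_j)
        if score < best_score:
            best_score, best_count = score, p
    if best_count is None:
        raise ValueError("no candidate satisfies the constraint")
    return best_count


def toy_measure(p: int) -> Measurement:
    """Hypothetical cost model standing in for real measurements.

    Time has a serial part, a parallel part, and a per-processor
    overhead; each active processor is assumed to draw 1 W.
    """
    time_s = 1.0 + 8.0 / p + 0.1 * (p - 1)
    energy_j = 1.0 * p * time_s
    return time_s, energy_j


if __name__ == "__main__":
    counts = [1, 2, 4, 8]
    print(pick_processor_count(counts, toy_measure, execution_time))
    print(pick_processor_count(counts, toy_measure, energy_delay_product))
    print(pick_processor_count(counts, toy_measure, execution_time, max_energy_j=11.0))
```

Under this toy model the preferred count differs by objective, which is why the choice cannot be fixed without knowing the objective function and constraints. In a real system `measure` would be replaced by timing and energy readings taken during execution.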