Exploiting Instruction- and Data-Level Parallelism

Authors:
Roger Espasa;Mateo Valero
Affiliations:
-;-
Venue:
IEEE Micro
Year:
1997

Abstract

Historically, there have been two different approaches to high-performance computing: instruction-level parallelism (ILP) and data-level parallelism (DLP). The ILP paradigm seeks to execute several instructions each cycle by examining a sequential instruction stream and extracting independent instructions that can be sent to several execution units in parallel. The DLP paradigm, on the other hand, uses vectorization techniques so that a single instruction (a vector instruction) specifies a large number of operations to be performed on independent data. A few of these vector instructions running concurrently can provide a high degree of operation-level parallelism for many consecutive cycles.
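
The contrast drawn in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names and the maximum vector length of 64 elements are assumptions chosen for illustration. The sketch only counts how many instructions would be needed to add two arrays when each instruction handles one element pair (the scalar stream an ILP machine would have to examine) versus when one vector instruction covers a whole group of elements (the DLP view).

```python
def scalar_add(a, b):
    """Scalar view: one add instruction is counted per element pair."""
    out = []
    instructions = 0
    for x, y in zip(a, b):
        out.append(x + y)
        instructions += 1
    return out, instructions


def vector_add(a, b, max_vl=64):
    """Vector view: one instruction is counted per group of up to max_vl pairs.

    max_vl is a hypothetical maximum vector length, not a value from the paper.
    """
    out = []
    instructions = 0
    for start in range(0, len(a), max_vl):
        chunk_a = a[start:start + max_vl]
        chunk_b = b[start:start + max_vl]
        # All element operations in this chunk are independent of each other.
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return out, instructions


if __name__ == "__main__":
    a = list(range(200))
    b = list(range(200))
    scalar_result, scalar_count = scalar_add(a, b)
    vector_result, vector_count = vector_add(a, b)
    print(scalar_result == vector_result, scalar_count, vector_count)
```

Under these assumptions, both functions produce the same result, but the vector version specifies the same work with far fewer instructions, which is the property the DLP paradigm relies on.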