Historically, there have been two different approaches to high-performance computing: instruction-level parallelism (ILP) and data-level parallelism (DLP). The ILP paradigm seeks to execute several instructions each cycle by examining a sequential instruction stream and extracting independent instructions that can be sent to several execution units in parallel. The DLP paradigm, on the other hand, uses vectorization techniques so that a single instruction (a vector instruction) specifies a large number of operations to be performed on independent data. A few of these vector instructions running concurrently can sustain a high degree of operation-level parallelism for many consecutive cycles.
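As a rough illustration of the DLP idea described above, the following sketch counts how many instructions a scalar loop and a vectorized loop would issue for the same element-wise addition. It is a toy model, not a description of any particular machine: the names `scalar_add` and `vector_add` and the maximum vector length `MVL = 64` are assumptions introduced here for illustration only.

```python
MVL = 64  # assumed maximum vector length (elements per vector instruction)


def scalar_add(a, b):
    """Model a scalar loop: one add instruction is issued per element."""
    out = []
    issued = 0
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 1  # one scalar add per element
    return out, issued


def vector_add(a, b, mvl=MVL):
    """Model a vectorized loop: one vector add covers up to `mvl` elements."""
    out = []
    issued = 0
    for start in range(0, len(a), mvl):
        chunk_a = a[start:start + mvl]
        chunk_b = b[start:start + mvl]
        # A single (hypothetical) vector instruction specifies all of these
        # independent element-wise operations at once.
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        issued += 1  # one vector add per chunk
    return out, issued


if __name__ == "__main__":
    a = list(range(200))
    b = list(range(200))
    scalar_result, scalar_issued = scalar_add(a, b)
    vector_result, vector_issued = vector_add(a, b)
    print(scalar_issued, vector_issued)
```

Under these assumptions, both functions compute the same result, but the vector version issues one instruction per chunk of up to `MVL` elements rather than one per element. Each of those instructions carries many independent operations, which is how a few of them can keep the execution units busy over many cycles.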