Billion-Transistor Architectures: There and Back Again

Authors:
Doug Burger;James R. Goodman
Affiliations:
-;-
Venue:
Computer
Year:
2004

Citing 25
Cited 14

A bandwidth-efficient architecture for media processing

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Simultaneous subordinate microthreading (SSMT)

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
DIVA: a reliable substrate for deep submicron microarchitecture design

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Transient fault detection via simultaneous multithreading

Proceedings of the 27th annual international symposium on Computer architecture
A study of slipstream processors

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
The optimum pipeline depth for a microprocessor

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Speculative lock elision: enabling highly concurrent multithreaded execution

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Transactional lock-free execution of lock-based programs

Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Mondrian memory protection

Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Will Physical Scalability Sabotage Performance Gains?

Computer
Walk-Time Techniques: Catalyst for Architectural Change

Computer
How Multimedia Workloads Will Change Processor Design

Computer
Billion-Transistor Architectures

Computer
One Billion Transistors, One Uniprocessor, One Chip

Computer
Superspeculative Microarchitecture for Beyond AD 2000

Computer
Trace Processors: Moving to Fourth-Generation Microarchitectures

Computer
Scalable Processors in the Billion-Transistor Era: IRAM

Computer
A Single-Chip Multiprocessor

Computer
Baring It All to Software: Raw Machines

Computer
Master/slave speculative parallelization

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors

FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture

Proceedings of the 30th annual international symposium on Computer architecture
Speculative Data-Driven Multithreading

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture

Database hash-join algorithms on multithreaded computer architectures

Proceedings of the 3rd conference on Computing frontiers
Silicon CMOS devices beyond scaling

IBM Journal of Research and Development - Advanced silicon technology
A flexible data to L2 cache mapping approach for future multicore processors

Proceedings of the 2006 workshop on Memory system performance and correctness
Managing Distributed, Shared L2 Caches through OS-Level Page Allocation

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Accelerating sequential programs on Chip Multiprocessors via Dynamic Prefetching Thread

Microprocessors & Microsystems
Pipelined hash-join on multithreaded architectures

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Evaluation of bus based interconnect mechanisms in clustered VLIW architectures

International Journal of Parallel Programming
Research on Evaluation of Parallelization on an Embedded Multicore Platform

APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Increasing throughput of a RISC architecture using arithmetic data value speculation

Asilomar'09 Proceedings of the 43rd Asilomar conference on Signals, systems and computers
Product families on-chip: combining the software product family paradigm with run-time reprogrammable hardware technology

COMPSAC-W'05 Proceedings of the 29th annual international conference on Computer software and applications conference
The design space of CMP vs. SMT for high performance embedded processor

ICESS'05 Proceedings of the Second international conference on Embedded Software and Systems
A hybrid hardware/software generated prefetching thread mechanism on chip multiprocessors

Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Exploiting multilevel parallelism within modern microprocessors: DWT as a case study

VECPAR'04 Proceedings of the 6th international conference on High Performance Computing for Computational Science
A memory bandwidth effective cache store miss policy

ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture

Quantified Score

Hi-index	4.10

Abstract

In September 1997, Computer published a special issue on billion-transistor microprocessor architectures. Comparing that issue's predictions about the trends that would drive architectural development with the factors that subsequently emerged shows a greater-than-predicted emphasis on clock speed and an unforeseen importance of power constraints. Of the seven architectural visions proposed in 1997, none has yet emerged as dominant. However, as we approach a microarchitectural bound on clock speed, increased concurrency must become the primary source of improved performance. Future billion-transistor architectures will be judged by how efficiently they support distributed hardware without placing intractable demands on programmers.