In September 1997, Computer published a special issue on billion-transistor microprocessor architectures. Comparing that issue's predictions about the trends that would drive architectural development with the factors that subsequently emerged shows a greater-than-predicted emphasis on clock speed and an unforeseen importance of power constraints. Of the seven architectural visions proposed in 1997, none has yet emerged as dominant. However, as we approach a microarchitectural bound on clock speed, improved performance must come primarily from increased concurrency. Future billion-transistor architectures will be judged by how efficiently they support distributed hardware without placing intractable demands on programmers.