An effective on-chip preloading scheme to reduce data access penalty
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Improving the accuracy of dynamic branch prediction using branch correlation
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Cache write policies and performance
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The SPARC architecture manual (version 9)
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Implementation of precise interrupts in pipelined processors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Selective cache ways: on-demand cache resource allocation
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture
Timestamp snooping: an approach for extending SMPs
ACM SIGPLAN Notices
Timestamp snooping: an approach for extending SMPs
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Execution history guided instruction prefetching
ICS '02 Proceedings of the 16th international conference on Supercomputing
Direct load: dependence-linked dataflow resolution of load address and cache coordinate
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
The sun fireplane system interconnect
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Implementation of the Exponential Function in a Floating-Point Unit
Journal of VLSI Signal Processing Systems
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
The Sun Fireplane Interconnect
IEEE Micro
Access Control Mechanisms in a Distributed, Persistent Memory System
IEEE Transactions on Parallel and Distributed Systems
A Comparison of Locality-Based and Recency-Based Replacement Policies
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
A Statistically Rigorous Approach for Improving Simulation Methodology
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Coupling Noise Analysis for VLSI and ULSI Circuits
ISQED '00 Proceedings of the 1st International Symposium on Quality of Electronic Design
Address-free memory access based on program syntax correlation of loads and stores
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2001 international conference on computer design (ICCD)
Execution History Guided Instruction Prefetching
The Journal of Supercomputing
Managing Wire Delay in Large Chip-Multiprocessor Caches
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Data Centric Cache Measurement on the Intel Itanium 2 Processor
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Architecture-aware classical Taylor shift by 1
Proceedings of the 2005 international symposium on Symbolic and algebraic computation
Improving Computer Architecture Simulation Methodology by Adding Statistical Rigor
IEEE Transactions on Computers
High-performance implementations of the Descartes method
Proceedings of the 2006 international symposium on Symbolic and algebraic computation
Microarchitecture of the Godson-2 processor
Journal of Computer Science and Technology
Cache-conscious radix-decluster projections
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Journal of Computer Science and Technology
Dealing with Traffic-Area Trade-Off in Direct Coherence Protocols for Many-Core CMPs
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
IBM Journal of Research and Development
Finding representative workloads for computer system design
The bandwidth expansion effectiveness of cache levels block prefetch
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Adaptive prefetching for shared cache based chip multiprocessors
Proceedings of the Conference on Design, Automation and Test in Europe
Evaluating OpenMP on chip multithreading platforms
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
Evaluation of low-overhead organizations for the directory in future many-core CMPs
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Architecting and designing a high-performance microprocessor requires many decisions and tradeoffs. This article presents some of the most interesting architecture decisions and design challenges encountered in designing the UltraSPARC-III microprocessor. The UltraSPARC line of microprocessors powers the entire family of Sun Microsystems computer systems, from desktop workstations to large mission-critical servers. As the name implies, UltraSPARC-III is the third-generation microprocessor of the SPARC Version 9 (V9) architecture (1). The V9 architecture was defined as a 64-bit extension to the original 32-bit SPARC architecture, which traces its roots to the Berkeley RISC-I processor.