Eliminating the address translation bottleneck for physical address cache
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Intrinsic MOSFET parameter fluctuations due to random dopant placement
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on low power electronics and design
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Basic Block Distribution Analysis to Find Periodic Behavior and Simulation Points in Applications
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Generating physical addresses directly for saving instruction TLB energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Parameter variations and impact on circuits and microarchitecture
Proceedings of the 40th annual Design Automation Conference
PADded Cache: A New Fault-Tolerance Technique for Cache Memories
VTS '99 Proceedings of the 1999 17TH IEEE VLSI Test Symposium
Picking Statistically Valid and Early Simulation Points
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
A methodology to improve timing yield in the presence of process variations
Proceedings of the 41st annual Design Automation Conference
Wire Delay is Not a Problem for SMT (In the Near Future)
Proceedings of the 31st annual international symposium on Computer architecture
Block-based Static Timing Analysis with Uncertainty
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Statistical Timing Analysis for Intra-Die Process Variations with Spatial Correlations
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
A process-tolerant cache architecture for improved yield in nanoscale technologies
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Correlation-aware statistical timing analysis with non-gaussian delay distributions
Proceedings of the 42nd annual Design Automation Conference
Full-chip analysis of leakage power under process variations, including spatial correlations
Proceedings of the 42nd annual Design Automation Conference
Modeling and Testing of SRAM for New Failure Mechanisms Due to Process Variations in Nanoscale CMOS
VTS '05 Proceedings of the 23rd IEEE Symposium on VLSI Test
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Process and environmental variation impacts on ASIC timing
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Statistical gate sizing for timing yield optimization
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Process variation aware cache leakage management
Proceedings of the 2006 international symposium on Low power electronics and design
Yield-Aware Cache Architectures
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Mitigating the Impact of Process Variations on Processor Register Files and Execution Units
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ReCycle:: pipeline adaptation to tolerate process variation
Proceedings of the 34th annual international symposium on Computer architecture
A Model for Timing Errors in Processors with Parameter Variation
ISQED '07 Proceedings of the 8th International Symposium on Quality Electronic Design
Working with process variation aware caches
Proceedings of the conference on Design, automation and test in Europe
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Process Variation Tolerant 3T1D-Based Cache Architectures
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Reducing Data TLB Power via Compiler-Directed Address Generation
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
As technology moves towards finer process geometries, it is becoming extremely difficult to control critical physical parameters such as channel length, gate oxide thickness, and dopant ion concentration. Variations in these parameters lead to dramatic variations in access latencies in Static Random Access Memory (SRAM) devices. This means that different lines of the same cache may have different access latencies. A simple solution to this problem is to adopt the worst-case latency paradigm. While this egalitarian cache management is simple, it may introduce significant performance overhead during instruction fetches when both address translation (instruction Translation Lookaside Buffer (TLB) access) and instruction cache access take place, making this solution infeasible for future high-performance processors. In this study, we first propose some hardware and software enhancements and then, based on those, investigate several techniques to mitigate the effect of process variation on the instruction fetch pipeline stage in modern processors. For address translation, we study an approach that performs the virtual-to-physical page translation once, then stores it in a special register, reusing it as long as the execution remains on the same instruction page. To handle varying access latencies across different instruction cache lines, we annotate the cache access latency of instructions within themselves to give the circuitry a hint about how long to wait for the next instruction to become available.