In Simultaneous Multithreading (SMT) processors, co-scheduled threads share the processor's resources while simultaneously competing for them. A thread that misses in the L2 cache may occupy most of the available resources for a long time, causing other threads to run slower than they otherwise could, or even to stall for lack of resources. As a result, overall SMT processor performance is degraded. In this paper, we propose a novel fetch policy called MFP (Multiple Fetch Priorities) to mitigate the negative effects of L2 cache misses. Under our policy, each thread is assigned one of three fetch priority levels based on its cache behavior, and each cycle MFP fetches instructions from the threads with the highest priority. Results show that our policy outperforms previously proposed fetch policies for all types of workloads, especially memory-bound ones, whether measured by IPC or by the harmonic mean. The results also show varying degrees of improvement over the other fetch policies: the gain over PDG is greatest, averaging 19.2% in throughput and 27.7% in Hmean.
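The priority mechanism described above can be sketched as follows. This is a minimal illustrative model, not the paper's implementation: the demotion rules (demote on an outstanding miss, restore when it resolves), the class and function names, and the use of L1 misses for the middle level are all assumptions made for the sake of the example.

```python
# Hypothetical sketch of an MFP-style fetch policy with three priority
# levels per thread. The exact promotion/demotion rules used by MFP are
# not reproduced here; these are illustrative assumptions.

HIGH, MEDIUM, LOW = 0, 1, 2  # smaller value = higher fetch priority


class Thread:
    def __init__(self, tid):
        self.tid = tid
        self.priority = HIGH

    def update_priority(self, l1_miss_pending, l2_miss_pending):
        # Assumed rule: a thread with an outstanding L2 miss is demoted
        # to the lowest level, an L1 miss to the middle level, and the
        # thread is restored to HIGH once its misses resolve.
        if l2_miss_pending:
            self.priority = LOW
        elif l1_miss_pending:
            self.priority = MEDIUM
        else:
            self.priority = HIGH


def select_fetch_threads(threads):
    """Return the threads sharing the highest (numerically smallest)
    priority; the fetch stage takes instructions only from these
    threads in the current cycle."""
    best = min(t.priority for t in threads)
    return [t for t in threads if t.priority == best]


if __name__ == "__main__":
    threads = [Thread(0), Thread(1), Thread(2)]
    threads[0].update_priority(False, True)   # pending L2 miss -> LOW
    threads[1].update_priority(True, False)   # pending L1 miss -> MEDIUM
    threads[2].update_priority(False, False)  # no misses -> HIGH
    chosen = select_fetch_threads(threads)
    print([t.tid for t in chosen])  # only thread 2 is fetched this cycle
```

The intuition the sketch captures is that a thread stalled on a long-latency L2 miss cannot make progress anyway, so withholding fetch bandwidth from it leaves resources for threads that can.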