Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
A Study of a Simultaneous Multithreaded Processor Implementation
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Out-of-Order Execution may not be Cost-Effective on Processors Featuring Simultaneous Multithreading
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Front-End Policies for Improved Issue Efficiency in SMT Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Computer Architecture: A Quantitative Approach
Computer Architecture: A Quantitative Approach
A Low-Complexity, High-Performance Fetch Unit for Simultaneous Multithreading Processors
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Methods for Modeling Resource Contention on Simultaneous Multithreading Processors
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Thread-Sensitive Instruction Issue for SMT Processors
IEEE Computer Architecture Letters
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
A Memory-Level Parallelism Aware Fetch Policy for SMT Processors
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Optimising long-latency-load-aware fetch policies for SMT processors
International Journal of High Performance Computing and Networking
How to enhance a superscalar processor to provide hard real-time capable in-order SMT
ARCS'10 Proceedings of the 23rd international conference on Architecture of Computing Systems
The Journal of Supercomputing
Hi-index | 0.01 |
Simultaneous Multithreading (SMT) technology enhances instruction throughput by issuing multiple instructions from multiple threads within one clock cycle. For in-order pipeline to each thread, SMT processors can provide large number of issued instructions close to or surpass than using out-of-order pipeline. In this work, we show an efficient issue logic for predicated instruction sequence with the parallel flag in each instruction, where the predicate register based issue control is adopted and the continuous instructions with the parallel flag of '0' are executed in parallel. The flag is pre-defined by a compiler. Instructions from different threads are issued based on the round-robin order. We also introduce an Instruction Queue skip mechanism for thread if the queue is empty. Using this kind of issue logic, we designed a 6 threads, 7-stage, in-order pipeline processor. Based on this processor, we compare round-robin issue policy (RR(T1-Tn)) with other policies: thread one always has the highest priority (PR(T1)) and thread one or thread n has the highest priority in turn (PR(T1-Tn)). The results show that RR(T1-Tn) policy outperforms others and PR(T1-Tn) is almost the same to RR(T1-Tn) from the point of view of the issued instructions per cycle.