Improving 3D geometry transformations on a simultaneous multithreaded SIMD processor
ICS '01 Proceedings of the 15th international conference on Supercomputing
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
SMT Layout Overhead and Scalability
IEEE Transactions on Parallel and Distributed Systems
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
A survey of processors with explicit multithreading
ACM Computing Surveys (CSUR)
Multithreaded Parallel Computer Model with Performance Evaluation
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
A Study of a Simultaneous Multithreaded Processor Implementation
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Architecture optimization for multimedia application exploiting data and thread-level parallelism
Journal of Systems Architecture: the EUROMICRO Journal
Improving SMT performance scheduling processes
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
Hi-index | 0.00 |
In this study we show that fine-grain multithreading is an effective way to increase instruction-level parallelism and hide the latencies of long-latency operations in a superscalar processor. The effects of long-latency operations, such as remote memory references, cache-misses, and multi-cycle floating-point calculations, are detrimental to performance since such operations typically cause a stall. Even superscalar processors, that are capable of performing various operations in parallel, are vulnerable. A fine-grain multithreading paradigm and unique multithreaded superscalar architecture is presented. Simulation results show significant speedup over single-threaded superscalar execution.