The micro-threaded microprocessor is a chip multi-processor that uses a multi-threaded approach in which the threads are obtained from within a single context and exploit both vector and instruction-level parallelism (ILP). The approach employs vertical and horizontal transfer in a simple pipeline. Horizontal transfer refers to the normal scalar pipeline processing used in most microprocessors. Vertical transfer is a context switch, which allows the code to tolerate latency arising from undetermined data and control dependencies. The performance of a single pipeline is critical to the overall performance of the whole processor, which can distribute threads to any of the available processors. Using our simulator, we have measured the influence of three crucial parameters on performance: cache delay, cache miss rate, and number of registers. Even for a long cache delay (1000 processor cycles), we found that the micro-threaded pipeline can still achieve a peak IPC of 0.8, which is some 6 times better than a conventional scalar pipeline. If we further degrade cache performance by using an artificially small cache line size, the conventional scalar pipeline achieves an IPC of only 0.02, whereas with unlimited registers the micro-threaded pipeline still manages to achieve an IPC of 0.8 (a factor of 40 difference in performance).
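The latency-hiding effect of vertical transfer described above can be illustrated with a toy model. The sketch below is not the authors' simulator and does not attempt to reproduce the reported IPC figures. It assumes a single-issue pipeline in which every `miss_every`-th instruction of a thread misses in the cache and suspends that thread for `miss_latency` cycles, while any other ready thread may issue in the meantime. The function name `simulate_ipc` and all parameter values are hypothetical, chosen only for illustration.

```python
def simulate_ipc(num_threads, instrs_per_thread, miss_every, miss_latency):
    """Toy single-issue pipeline (illustrative assumption, not the paper's model).

    Each cycle, the first ready thread with work left issues one instruction
    (horizontal transfer). Every `miss_every`-th instruction of a thread is
    treated as a cache miss: the thread is suspended for `miss_latency` cycles
    and the pipeline switches to another ready thread (vertical transfer).
    Returns instructions per cycle (IPC).
    """
    remaining = [instrs_per_thread] * num_threads  # instructions left per thread
    issued = [0] * num_threads                     # instructions issued per thread
    ready_at = [0] * num_threads                   # cycle at which a thread may issue again
    total = num_threads * instrs_per_thread
    done = 0
    cycle = 0
    while done < total:
        for t in range(num_threads):
            if remaining[t] > 0 and ready_at[t] <= cycle:
                remaining[t] -= 1
                issued[t] += 1
                done += 1
                if issued[t] % miss_every == 0:
                    # Cache miss: suspend this thread until the data returns.
                    ready_at[t] = cycle + 1 + miss_latency
                break  # at most one instruction issues per cycle
        cycle += 1  # the cycle elapses whether or not anything issued
    return done / cycle


if __name__ == "__main__":
    # One thread behaves like a conventional scalar pipeline that stalls on each miss.
    print("1 thread  :", simulate_ipc(1, 100, 10, 90))
    # More threads give the pipeline other work to switch to during a miss.
    print("5 threads :", simulate_ipc(5, 100, 10, 90))
    print("10 threads:", simulate_ipc(10, 100, 10, 90))
```

In this configuration a single thread spends most cycles stalled, whereas ten threads supply enough independent work to keep the pipeline issuing on every cycle. The number of threads that can be resident at once stands in, loosely, for the register-count parameter studied in the paper, since each thread needs registers for its context.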