Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading
ACM Transactions on Computer Systems (TOCS)
Computer
Dynamic Partitioning of Shared Cache Memory
The Journal of Supercomputing
A Self-Tuning Cache Architecture for Embedded Systems
Proceedings of the conference on Design, automation and test in Europe - Volume 1
A way-halting cache for low-energy high-performance systems
Proceedings of the 2004 international symposium on Low power electronics and design
Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
A dynamically reconfigurable cache for multithreaded processors
Journal of Embedded Computing - Issues in embedded single-chip multicore architectures
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Improving error tolerance for multithreaded register files
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Cache partitioning for energy-efficient and interference-free embedded multitasking
ACM Transactions on Embedded Computing Systems (TECS)
Architectural support for thread communications in multi-core processors
Parallel Computing
Survey of scheduling techniques for addressing shared resources in multicore processors
ACM Computing Surveys (CSUR)
Journal of Parallel and Distributed Computing
Hi-index | 0.00 |
Presented in this paper is the thread-associative memory microarchitecture for multicore and multithreaded processor design. Memory contention among concurrent threads in chip multithreaded processing has become a limiting factor for performance improvement. The proposed thread-associative memory addresses this challenge by incorporating thread-specific information explicitly into on-chip memory hardware. The proposed technique can be utilized at different levels of memory hierarchy. Furthermore, it is not just a technique for performance enhancement but also a solution for energy efficiency. Trace-driven simulations on a 32KB L1 data cache demonstrate 36.6% maximum performance improvement and up to 15.1% total energy reduction, with 20.3% dynamic energy reduction and 9.9% leakage energy reduction.