Active messages: a mechanism for integrated communication and computation
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Optimistic active messages: a mechanism for scheduling communication with computation
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Thread scheduling for cache locality
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Practical Pram Programming
Adaptive Load Balancing for MPI Programs
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
A Multithreaded Runtime Environment with Thread Migration for A HPF Data-Parallel Compiler
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Efficient Fine-Grain Thread Migration with Active Threads
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Hardware-modulated parallelism in chip multiprocessors
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Performance implications of single thread migration on a chip multi-core
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Migration in Single Chip Multiprocessors
IEEE Computer Architecture Letters
Carbon: architectural support for fine-grained parallelism on chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
RISC-based moving threads multicore architecture
Proceedings of the 12th International Conference on Computer Systems and Technologies
Hi-index | 0.00 |
Moving threads is a new kind of approach for multicore processor architectures. Traditionally, each thread stays in the core where it is created, and data is moved from the main memory via caches to each core and thread. In the moving threads approach, each core can access only a certain portion of the main memory via its local memory block, and thus extremely lightweight threads are moved between the cores. As a consequence, all kinds of cache coherence problems and need for read reply messages are eliminated. Also Lamport's sequential consistency of shared memory multiprocessor systems is achieved for free. In this paper, we propose a processor architecture (MTPA) for the moving threads paradigm. We describe the overall structure, operation, instruction set, and thread management mechanism as well as evaluate the proposed architecture with different functional unit settings with simulations and give early silicon area and power consumption estimates.