Core fusion: accommodating software diversity in chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Implicitly parallel programming models for thousand-core microprocessors
Proceedings of the 44th annual Design Automation Conference
Federation: repurposing scalar cores for out-of-order instruction issue
Proceedings of the 45th annual Design Automation Conference
Multitasking workload scheduling on flexible core chip multiprocessors
ACM SIGARCH Computer Architecture News
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Multitasking workload scheduling on flexible-core chip multiprocessors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Strategies for mapping dataflow blocks to distributed hardware
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Boosting single-thread performance in multi-core systems through fine-grain multi-threading
Proceedings of the 36th annual international symposium on Computer architecture
Runtime Reconfiguration of Multiprocessors Based on Compile-Time Analysis
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Supervised learning based power management for multicore processors
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Federation: Boosting per-thread performance of throughput-oriented manycore architectures
ACM Transactions on Architecture and Code Optimization (TACO)
CoreSymphony: an efficient reconfigurable multi-core architecture
ACM SIGARCH Computer Architecture News
Bahurupi: A polymorphic heterogeneous multi-core architecture
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
CRQ-based fair scheduling on composable multicore architectures
Proceedings of the 26th ACM international conference on Supercomputing
Harmony: collection and analysis of parallel block vectors
Proceedings of the 39th Annual International Symposium on Computer Architecture
Interactive physical simulation on multicore architectures
EG PGV'09 Proceedings of the 9th Eurographics conference on Parallel Graphics and Visualization
Survey of Low-Energy Techniques for Instruction Memory Organisations in Embedded Systems
Journal of Signal Processing Systems
DRMA: dynamically reconfigurable MPSoC architecture
Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI
Journal of Signal Processing Systems
A hyperscalar dual-core architecture for embedded systems
Microprocessors & Microsystems
Hi-index | 0.00 |
Chip multiprocessors with multiple simpler cores are gaining popularity because they have the potential to drive future performance gains without exacerbating the problems of power dissipation and complexity. Current chip multiprocessors increase throughput by utilizing multiple cores to perform computation in parallel. These designs provide real benefits for server-class applications that are explicitly multi-threaded. However, for desktop and other systems where single-thread applications dominate, multicore systems have yet to offer much benefit. Chip multiprocessors are most efficient at executing coarse-grain threads that have little communication. However, general-purpose applications do not provide many opportunities for identifying such threads, due to frequent use of pointers, recursive data structures, if-then-else branches, small function bodies, and loops with small trip counts. To attack this mismatch, this paper proposes a multicore architecture, referred to as Voltron, that extends traditional multicore systems in two ways. First, it provides a dual-mode scalar operand network to enable efficient inter-core communication and lightweight synchro synchronization. Second, Voltron can organize the cores for execution in either coupled or decoupled mode. In coupled mode, the cores execute multiple instruction streams in lock-step to collectively function as a wide-issue VLIW. In decoupled mode, the cores execute a set of fine-grain communicating threads extracted by the compiler. This paper describes the Voltron architecture and associated compiler support for orchestrating bi-modal execution.