As the push for parallelism continues to increase the number of cores on a chip, system design has grown so complex that optimizing for performance and power efficiency is nearly impossible for the application programmer. To assist the programmer, a variety of techniques for optimizing performance and power at runtime have been developed, but many rely on speculative threads or performance counters. These approaches steal cycles or occupy an extra core, and such penalties can greatly reduce the potential gains. While general-purpose processors have grown larger and more complex, technologies for smaller embedded processors have pushed toward energy efficiency. In this paper, we combine the two trends and introduce the concept of Partner Cores: low-area, low-power cores paired with larger, faster compute cores. A partner core is tightly coupled to each main processing core, allowing it to perform various optimizations and functions that are impossible on a traditional chip multiprocessor. This paper demonstrates that optimization code running on a partner core can increase performance and provide a net improvement in power efficiency.