The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Scaling and Charact rizing Database Workloads: Bridging the Gap between Research and Practice
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Theoretical and practical limits of dynamic voltage scaling
Proceedings of the 41st annual Design Automation Conference
Single-vDD and single-vT super-drowsy techniques for low-leakage high-performance instruction caches
Proceedings of the 2004 international symposium on Low power electronics and design
Characterizing and modeling minimum energy operation for subthreshold circuits
Proceedings of the 2004 international symposium on Low power electronics and design
Analysis and mitigation of variability in subthreshold design
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
The limit of dynamic voltage scaling and insomniac dynamic voltage scaling
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
The M5 Simulator: Modeling Networked Systems
IEEE Micro
A low-power parallel design of discrete wavelet transform using subthreshold voltage technology
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Reconfigurable energy efficient near threshold cache architectures
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Nanometer MOSFET effects on the minimum-energy point of 45nm subthreshold logic
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
Reconfigurable Multicore Server Processors for Low Power Operation
SAMOS '09 Proceedings of the 9th International Workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation
Reducing leakage power with BTB access prediction
Integration, the VLSI Journal
The BubbleWrap many-core: popping cores for sequential acceleration
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Energy-efficient subthreshold processor design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Diet SODA: a power-efficient processor for digital cameras
Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
Automatic synthesis of near-threshold circuits with fine-grained performance tunability
Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
ACM Transactions on Design Automation of Electronic Systems (TODAES)
A limits study of benefits from nanostore-based future data-centric system architectures
Proceedings of the 9th conference on Computing Frontiers
Process variation in near-threshold wide SIMD architectures
Proceedings of the 49th Annual Design Automation Conference
A shared-FPU architecture for ultra-low power MPSoCs
Proceedings of the ACM International Conference on Computing Frontiers
Improving platform energy: chip area trade-off in near-threshold computing environment
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
Subthreshold circuit design has become a popular approach for building energy efficient digital circuits. One drawback is performance degradation due to the exponentially reduced driving current. This had limited subthreshold circuits to relatively low performance applications such as sensor networks. To retain the excellent energy efficiency while reducing performance loss, we propose to apply subthreshold and near-threshold techniques to chip multi-processors. We show that an architecture where several slower cores are clustered together with a shared faster L1 cache is optimal for energy efficiency, because processor cores and memory operate best at different supply and threshold voltages. In particular, SPLASH2 benchmarks show about a 53% energy improvement over the traditional CMP approach (about 70% over a single core machine).