TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors
Proceedings of the 2002 international symposium on Low power electronics and design
In our quest to bring down the power consumption of low-power chip-multiprocessors, we have found that TLB and snoop accesses account for about 40% of the energy wasted by all L1 data-cache accesses. We have investigated the prospects of using virtual caches to bring down the number of TLB accesses. A key observation is that while the energy wasted in the TLBs is cut, the energy associated with snoop accesses becomes higher. We then contribute two techniques to reduce the number of snoop accesses and their energy cost. Virtual caches together with the proposed techniques are shown to reduce the energy wasted in the L1 caches and the TLBs by about 30%.