Stride directed prefetching in scalar processors
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Prefetching using Markov predictors
Proceedings of the 24th annual international symposium on Computer architecture
A performance comparison of contemporary DRAM architectures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ACM Computing Surveys (CSUR)
Eager writeback - a technique for improving bandwidth utilization
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Frequent value locality and value-centric data cache design
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Dead-block prediction & dead-block correlating prefetchers
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
L1 data cache decomposition for energy efficiency
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Timekeeping in the memory system: predicting and optimizing memory behavior
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
A stateless, content-directed data prefetching mechanism
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Asim: A Performance Model Framework
Computer
Energy efficient frequent value data cache design
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Microarchitectural exploration with Liberty
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
TCP: Tag Correlating Prefetchers
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling
Proceedings of the 30th annual international symposium on Computer architecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Data Cache Prefetching Using a Global History Buffer
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
TiNy Threads: A Thread Virtual Machine for the Cyclops64 Cellular Architecture
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 14 - Volume 15
Temporal Streaming of Shared Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
Store-Ordered Streaming of Shared Memory
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
Kilo-instruction processors, runahead and prefetching
Proceedings of the 3rd conference on Computing frontiers
Proceedings of the 33rd annual international symposium on Computer Architecture
Data prefetching in a cache hierarchy with high bandwidth and capacity
MEDEA '06 Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures
A Sampling Method Focusing on Practicality
IEEE Micro
Reducing Cache Pollution via Dynamic Data Prefetch Filtering
IEEE Transactions on Computers
Power-efficient prefetching for embedded processors
ACM Transactions on Embedded Computing Systems (TECS)
Analysis of hardware prefetching across virtual page boundaries
Proceedings of the 4th international conference on Computing frontiers
Data prefetching in a cache hierarchy with high bandwidth and capacity
ACM SIGARCH Computer Architecture News
Data access history cache and associated data prefetching mechanisms
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Wake up and smell the coffee: evaluation methodology for the 21st century
Communications of the ACM - Designing games with a purpose
Low-Cost Adaptive Data Prefetching
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Stream chaining: exploiting multiple levels of correlation in data prefetching
Proceedings of the 36th annual international symposium on Computer architecture
COMPASS: a programmable data prefetcher using idle GPU shaders
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Sunflower: full-system, embedded, microarchitecture evaluation
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Quasi-static voltage scaling for energy minimization with time constraints
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Using runtime activity to dynamically filter out inefficient data prefetches
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
When Prefetching Works, When It Doesn’t, and Why
ACM Transactions on Architecture and Code Optimization (TACO)
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Fast and accurate cache modeling in source-level simulation of embedded software
Proceedings of the Conference on Design, Automation and Test in Europe
S/DC: a storage and energy efficient data prefetcher
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Practical models for energy-efficient prefetching in mobile embedded systems
Microprocessors & Microsystems
Hi-index | 0.01 |
While most research papers on computer architectures include some performance measurements, these performance numbers tend to be distrusted. Up to the point that, after so many research articles on data cache architectures, for instance, few researchers have a clear view of what are the best data cache mechanisms. To illustrate the usefulness of a fair quantitative comparison, we have picked a target architecture component for which lots of optimizations have been proposed (data caches), and we have implemented most of the performance-oriented hardware data cache optimizations published in top conferences in the past 4 years. Beyond the comparison of data cache ideas, our goals are twofold: (1) to clearly and quantitatively evaluate the effect of methodology shortcomings, such as model precision, benchmark selection, trace selection..., on assessing and comparing research ideas, and to outline how strong is the methodology effect in many cases, (2) to outline that the lack of interoperable simulators and not disclosing simulators at publication time make it difficult if not impossible to fairly assess the benefit of research ideas. This study is part of a broader effort, called MicroLib, an open library of modular simulators aimed at promoting the disclosure and sharing of simulator models.