As the gap between processor and memory speeds continues to widen, methods for evaluating memory system designs before they are implemented in hardware are becoming increasingly important. One such method, trace-driven memory simulation, has been the subject of intense interest among researchers and has, as a result, enjoyed rapid development and substantial improvements during the past decade. This article surveys and analyzes these developments by establishing criteria for evaluating trace-driven methods, and then applies these criteria to describe, categorize, and compare over 50 trace-driven simulation tools. We discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease of use, are considered. In a concluding section, we examine fundamental limitations to trace-driven simulation, and survey some recent developments in memory simulation that may overcome these bottlenecks.