The design of efficient storage hierarchies generally involves the repeated running of "typical" program address traces through a simulated storage system while various hierarchy design parameters are adjusted. This paper describes a new and efficient method of determining, in one pass of an address trace, performance measures for a large class of demand-paged, multilevel storage systems utilizing a variety of mapping schemes and replacement algorithms. The technique depends on an algorithm classification, called "stack algorithms," examples of which are "least frequently used," "least recently used," "optimal," and "random replacement" algorithms. The techniques yield the exact access frequency to each storage device, which can be used to estimate the overall performance of actual storage hierarchies.
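The one-pass idea can be illustrated with a minimal sketch for the "least recently used" case only. This is not the paper's own code: the function names `lru_stack_distances` and `hits_by_capacity` and the sample trace are hypothetical, and the sketch assumes a single fully associative memory level with a fixed page size. It relies on the property that, under LRU, the contents of a memory of capacity C are always the top C entries of one recency-ordered stack, so a single pass that records the stack depth of each reference gives the hit count for every capacity at once.

```python
from collections import Counter


def lru_stack_distances(trace):
    """Return the LRU stack depth of each reference (None on first touch)."""
    stack = []  # index 0 holds the most recently used page
    distances = []
    for page in trace:
        if page in stack:
            depth = stack.index(page) + 1  # 1-based position in the stack
            stack.remove(page)
        else:
            depth = None  # page never referenced before
        stack.insert(0, page)  # referenced page moves to the top
        distances.append(depth)
    return distances


def hits_by_capacity(trace, max_capacity):
    """Hit counts for every LRU capacity 1..max_capacity from one trace pass."""
    histogram = Counter(d for d in lru_stack_distances(trace) if d is not None)
    hits = {}
    running = 0
    for capacity in range(1, max_capacity + 1):
        # A reference at depth d hits in any memory holding at least d pages.
        running += histogram.get(capacity, 0)
        hits[capacity] = running
    return hits


if __name__ == "__main__":
    sample_trace = ["a", "b", "a", "c", "b", "a"]  # hypothetical page trace
    print(lru_stack_distances(sample_trace))
    print(hits_by_capacity(sample_trace, 3))
```

The list-based stack costs time proportional to its depth on every reference, which keeps the sketch short. A tree or hashed structure would be the usual choice for long traces.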