Active pages: a computation model for intelligent memory
Proceedings of the 25th annual international symposium on Computer architecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Evaluation of a high performance code compression method
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Exploiting ILP in page-based intelligent memory
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Table size reduction for data value predictors by exploiting narrow width values
Proceedings of the 14th international conference on Supercomputing
HLS: combining statistical and symbolic simulation to guide microprocessor designs
Proceedings of the 27th annual international symposium on Computer architecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Optimizing software performance for IP frame reassembly in an integrated architecture
Proceedings of the 2nd international workshop on Software and performance
FLASH vs. (simulated) FLASH: closing the simulation loop
ACM SIGPLAN Notices
Retargetable compiled simulation of embedded processors using a machine description language
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Effective algorithms for cache-level compression
GLSVLSI '01 Proceedings of the 11th Great Lakes symposium on VLSI
Low power techniques for address encoding and memory allocation
Proceedings of the 2001 Asia and South Pacific Design Automation Conference
Towards effective embedded processors in codesigns: customizable partitioned caches
Proceedings of the ninth international symposium on Hardware/software codesign
Towards a first vertical prototyping of an extremely fine-grained parallel programming approach
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
FLASH vs. (Simulated) FLASH: closing the simulation loop
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Profile guided selection of ARM and thumb instructions
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Using the Alfa-1 simulated processor for educational purposes
Journal on Educational Resources in Computing (JERIC)
Timekeeping in the memory system: predicting and optimizing memory behavior
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A microprocessor survey course for learning advanced computer architecture
SIGCSE '02 Proceedings of the 33rd SIGCSE technical symposium on Computer science education
Experiment-based project in undergraduate computer architecture
SIGCSE '02 Proceedings of the 33rd SIGCSE technical symposium on Computer science education
More enhancements of the simplescalar tool set
ACM SIGARCH Computer Architecture News
The predictability of load address
ACM SIGARCH Computer Architecture News
Sentry tag: an efficient filter scheme for low power cache
CRPIT '02 Proceedings of the seventh Asia-Pacific conference on Computer systems architecture
Contents provider-assisted dynamic voltage scaling for low energy multimedia applications
Proceedings of the 2002 international symposium on Low power electronics and design
Managing leakage for transient data: decay and quasi-static 4T memory cells
Proceedings of the 2002 international symposium on Low power electronics and design
Saving energy with just in time instruction delivery
Proceedings of the 2002 international symposium on Low power electronics and design
A preactivating mechanism for a VT-CMOS cache using address prediction
Proceedings of the 2002 international symposium on Low power electronics and design
Neural methods for dynamic branch prediction
ACM Transactions on Computer Systems (TOCS)
Bit section instruction set extension of ARM for embedded applications
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
On-chip decoupling capacitor optimization using architectural level prediction
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Design and evaluation of compiler algorithms for pre-execution
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Control-Flow Speculation through Value Prediction
IEEE Transactions on Computers
POEMS: End-to-End Performance Design of Large Parallel Adaptive Computational Systems
IEEE Transactions on Software Engineering
Experiences in modeling and simulation of computer architectures in DEVS
Transactions of the Society for Computer Simulation International - Recent advances in DEVS methodology, part II
HLSpower: Hybrid Statistical Modeling of the Superscalar Power-Performance Design Space
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
Influence of Compiler Optimizations on Value Prediction
HPCN Europe 2001 Proceedings of the 9th International Conference on High-Performance Computing and Networking
Architectural Support for Data-intensive Applications
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Ramp Up/Down Functional Unit to Reduce Step Power
PACS '00 Proceedings of the First International Workshop on Power-Aware Computer Systems-Revised Papers
A Comparison of Two Architectural Power Models
PACS '00 Proceedings of the First International Workshop on Power-Aware Computer Systems-Revised Papers
Compiler-Directed Dynamic Frequency and Voltage Scheduling
PACS '00 Proceedings of the First International Workshop on Power-Aware Computer Systems-Revised Papers
TEM²P²EST: A Thermal Enabled Multi-model Power/Performance ESTimator
PACS '00 Proceedings of the First International Workshop on Power-Aware Computer Systems-Revised Papers
Workload Design: Selecting Representative Program-Input Pairs
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Using the Compiler to Improve Cache Replacement Decisions
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
The Case for Speculative Multithreading on SMT Processors
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Low-Cost Value Predictors Using Frequent Value Locality
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Limits and Graph Structure of Available Instruction-Level Parallelism (Research Note)
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Decoupling Recovery Mechanism for Data Speculation from Dynamic Instruction Scheduling Structure
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Reordering Memory Bus Transactions for Reduced Power Consumption
LCTES '00 Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems
Reducing Energy Consumption via Low-Cost Value Prediction
PATMOS '02 Proceedings of the 12th International Workshop on Integrated Circuit Design. Power and Timing Modeling, Optimization and Simulation
Data Compression Transformations for Dynamically Allocated Data Structures
CC '02 Proceedings of the 11th International Conference on Compiler Construction
On Availability of Bit-Narrow Operations in General-Purpose Applications
FPL '00 Proceedings of The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Execution Latency Reduction via Variable Latency Pipeline and Instruction Reuse
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Dynamic hardware/software partitioning: a first approach
Proceedings of the 40th annual Design Automation Conference
Optimizing memory accesses for spatial computation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Modeling methodology for integrated simulation of embedded systems
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Predicting the impact of optimizations for embedded systems
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Enhancing the performance of 16-bit code using augmenting instructions
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
A compiler approach for reducing data cache energy
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Enhancing memory level parallelism via recovery-free value prediction
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Challenges for architectural level power modeling
Power aware computing
Control Techniques to Eliminate Voltage Emergencies in High Performance Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
TCP: Tag Correlating Prefetchers
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Sleipnir-An Instruction-Level Simulator Generator
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
A Study of Channeled DRAM Memory Architectures
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Reducing Cost and Tolerating Defects in Page-based Intelligent Memory
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Instruction Prediction for Step Power Reduction
ISQED '01 Proceedings of the 2nd International Symposium on Quality Electronic Design
Partial Resolution in Data Value Predictors
ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
Executable JVM model for analytical reasoning: a study
Proceedings of the 2003 workshop on Interpreters, virtual machines and emulators
Temperature-aware microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Pipeline damping: a microarchitectural technique to reduce inductive noise in supply voltage
Proceedings of the 30th annual international symposium on Computer architecture
Detecting global stride locality in value streams
Proceedings of the 30th annual international symposium on Computer architecture
ESTIMA: an architectural-level power estimator for multi-ported pipelined register files
Proceedings of the 2003 international symposium on Low power electronics and design
A trace-level value predictor for Contrail processors
ACM SIGARCH Computer Architecture News
Cache Coherence in Intelligent Memory Systems
IEEE Transactions on Computers
Tiling, Block Data Layout, and Memory Hierarchy Performance
IEEE Transactions on Parallel and Distributed Systems
Considering processing cost in network simulations
MoMeTools '03 Proceedings of the ACM SIGCOMM workshop on Models, methods and tools for reproducible network research
Reducing code size with echo instructions
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Performance, energy, and reliability tradeoffs in replicating hot cache lines
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Comparing Program Phase Detection Techniques
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Using Interaction Costs for Microarchitectural Bottleneck Analysis
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Experimental Evaluation of Code Properties for WCET Analysis
RTSS '03 Proceedings of the 24th IEEE International Real-Time Systems Symposium
Multiple-Resource Periodic Scheduling Problem: how much fairness is necessary?
RTSS '03 Proceedings of the 24th IEEE International Real-Time Systems Symposium
Design and analysis of low-power cache using two-level filter scheme
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Energy-efficient issue queue design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Modeling technology impact on cluster microprocessor performance
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Power-Aware Branch Prediction: Characterization and Design
IEEE Transactions on Computers
Constructive timing violation for improving energy efficiency
Compilers and operating systems for low power
A Low Power Strategy for Future Mobile Terminals
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Data Reuse Analysis Technique for Software-Controlled Memory Hierarchies
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Low Static-Power Frequent-Value Data Caches
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Hybrid Architectural Dynamic Thermal Management
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Value-Conscious Cache: Simple Technique for Reducing Cache Access Power
Proceedings of the conference on Design, automation and test in Europe - Volume 1
A Configurable Logic Architecture for Dynamic Hardware/Software Partitioning
Proceedings of the conference on Design, automation and test in Europe - Volume 1
A Crosstalk Aware Interconnect with Variable Cycle Transmission
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Adaptive Prefetching for Multimedia Applications in Embedded Systems
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Instruction Scheduling for Low Power
Journal of VLSI Signal Processing Systems
IEEE Transactions on Computers
Selective stack prefetch method
CompSysTech '03 Proceedings of the 4th international conference on Computer systems and technologies: e-Learning
Temperature-aware microarchitecture: Modeling and implementation
ACM Transactions on Architecture and Code Optimization (TACO)
ARCS: an architectural level communication driven simulator
Proceedings of the 14th ACM Great Lakes symposium on VLSI
NWSLite: a light-weight prediction utility for mobile devices
Proceedings of the 2nd international conference on Mobile systems, applications, and services
Floorplanning optimization with trajectory piecewise-linear model for pipelined interconnects
Proceedings of the 41st annual Design Automation Conference
Implementing branch-predictor decay using quasi-static memory cells
ACM Transactions on Architecture and Code Optimization (TACO)
SEPAS: a highly accurate energy-efficient branch predictor
Proceedings of the 2004 international symposium on Low power electronics and design
Design and validation of a performance and power simulator for PowerPC systems
IBM Journal of Research and Development
A static and dynamic energy reduction technique for I-cache and BTB in embedded processors
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Alloyed branch history: combining global and local branch history for robust performance
International Journal of Parallel Programming
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
On-chip Stack Based Memory Organization for Low Power Embedded Architectures
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
A leakage-energy-reduction technique for highly-associative caches in embedded systems
MEDEA '03 Proceedings of the 2003 workshop on MEmory performance: DEaling with Applications, systems and architecture
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
HIDE: an infrastructure for efficiently protecting information leakage on the address bus
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Zero-aware asymmetric SRAM cell for reducing cache power in writing zero
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Simultaneous Partitioning and Frequency Assignment for On-Chip Bus Architectures
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Increasing Register File Immunity to Transient Errors
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
A Geometric Programming Framework for Optimal Multi-Level Tiling
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Load elimination for low-power embedded processors
GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
Power-Performance Simulation and Design Strategies for Single-Chip Heterogeneous Multiprocessors
IEEE Transactions on Computers
Accelerated warmup for sampled microarchitecture simulation
ACM Transactions on Architecture and Code Optimization (TACO)
On the energy-efficiency of speculative hardware
Proceedings of the 2nd conference on Computing frontiers
Exploiting temporal locality in drowsy cache policies
Proceedings of the 2nd conference on Computing frontiers
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Proceedings of the 42nd annual Design Automation Conference
Programmer specified pointer independence
MSP '04 Proceedings of the 2004 workshop on Memory system performance
Enhancing Memory-Level Parallelism via Recovery-Free Value Prediction
IEEE Transactions on Computers
Coordinated, distributed, formal energy management of chip multiprocessors
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
A multinomial clustering model for fast simulation of computer architecture designs
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Compilation techniques for energy reduction in horizontally partitioned cache architectures
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Reducing latencies of pipelined cache accesses through set prediction
Proceedings of the 19th annual international conference on Supercomputing
Improved automatic testcase synthesis for performance model validation
Proceedings of the 19th annual international conference on Supercomputing
The instruction register file micro-architecture
Future Generation Computer Systems - Special issue: Parallel computing technologies
Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Restrictive Compression Techniques to Increase Level 1 Cache Capacity
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Improving memory system performance with energy-efficient value speculation
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Dynamic Resizing of Superscalar Datapath Components for Energy Efficiency
IEEE Transactions on Computers
Journal of Systems Architecture: the EUROMICRO Journal
Performance characteristics of MAUI: an intelligent memory system architecture
Proceedings of the 2005 workshop on Memory system performance
Hardware/software managed scratchpad memory for embedded system
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Prefetching-aware cache line turnoff for saving leakage energy
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
A novel instruction scratchpad memory optimization method based on concomitance metric
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Lazy BTB: reduce BTB energy consumption using dynamic profiling
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Studying interactions between prefetching and cache line turnoff
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Using loop invariants to fight soft errors in data caches
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
An instruction for direct interpretation of LZ77-compressed programs
Software—Practice & Experience
Optimal partitioned fault-tolerant bus layout for reducing power in nanometer designs
Proceedings of the 2006 international symposium on Physical design
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Automatic insertion of low power annotations in RTL for pipelined microprocessors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Parallel co-simulation using virtual synchronization with redundant host execution
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Decomposing memory performance: data structures and phases
Proceedings of the 5th international symposium on Memory management
Fast and efficient partial code reordering: taking advantage of dynamic recompilation
Proceedings of the 5th international symposium on Memory management
In search of near-optimal optimization phase orderings
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Proceedings of the 33rd annual international symposium on Computer Architecture
Balanced Cache: Reducing Conflict Misses of Direct-Mapped Caches
Proceedings of the 33rd annual international symposium on Computer Architecture
Learning-Based SMT Processor Resource Distribution via Hill-Climbing
Proceedings of the 33rd annual international symposium on Computer Architecture
Reducing Rename Logic Complexity for High-Speed and Low-Power Front-End Architectures
IEEE Transactions on Computers
Performance Models for Network Processor Design
IEEE Transactions on Parallel and Distributed Systems
Branchless cycle prediction for embedded processors
Proceedings of the 2006 ACM symposium on Applied computing
Pattern-driven prefetching for multimedia applications on embedded processors
Journal of Systems Architecture: the EUROMICRO Journal
Dynamic feature selection for hardware prediction
Journal of Systems Architecture: the EUROMICRO Journal
Proceedings of the 41st annual Design Automation Conference
Application of full-system simulation in exploratory system design and development
IBM Journal of Research and Development
IMPRES: integrated monitoring for processor reliability and security
Proceedings of the 43rd annual Design Automation Conference
Systematic temperature sensor allocation and placement for microprocessors
Proceedings of the 43rd annual Design Automation Conference
Performance prediction of paging workloads using lightweight tracing
Future Generation Computer Systems - Systems performance analysis and evaluation
CISL: a class-based machine description language for co-generation of compilers and simulators
International Journal of Parallel Programming - Special issue: The next generation software program
Self-checking instructions: reducing instruction redundancy for concurrent error detection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Power-efficient instruction delivery through trace reuse
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Investigating cache energy and latency break-even points in high performance processors
MEDEA '06 Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures
Estimating critical region parallelism to guide platform retargeting
Proceedings of the 43rd annual Southeast regional conference - Volume 1
Javana: a system for building customized Java program analysis tools
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Adaptive and flexible dictionary code compression for embedded applications
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Power efficient branch prediction through early identification of branch addresses
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Mitigating soft error failures for multimedia applications by selective data protection
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Yet shorter warmup by combining no-state-loss and MRRL for sampled LRU cache simulation
Journal of Systems and Software - Special issue: Quality software
The exigency of benchmark and compiler drift: designing tomorrow's processors with yesterday's tools
Proceedings of the 20th annual international conference on Supercomputing
Evaluating trace cache energy efficiency
ACM Transactions on Architecture and Code Optimization (TACO)
Architectures and APIs: assessing requirements for delivering FPGA performance to applications
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
An efficient single-pass trace compression technique utilizing instruction streams
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Executable JVM model for analytical reasoning: a study
Science of Computer Programming - Special issue on advances in interpreters, virtual machines and emulators (IVME'03)
VISTA: VPO interactive system for tuning applications
ACM Transactions on Embedded Computing Systems (TECS)
Cache-Friendly implementations of transitive closure
Journal of Experimental Algorithmics (JEA)
Iterative compilation for energy reduction
Journal of Embedded Computing - Cache exploitation in embedded systems
Hybrid multi-core architecture for boosting single-threaded performance
ACM SIGARCH Computer Architecture News
Efficient code size reduction without performance loss
Proceedings of the 2007 ACM symposium on Applied computing
Reconfigurable split data caches: a novel scheme for embedded systems
Proceedings of the 2007 ACM symposium on Applied computing
Exploiting program phase behavior for energy reduction on multi-configuration processors
Journal of Systems Architecture: the EUROMICRO Journal
Code reordering on limited branch offset
ACM Transactions on Architecture and Code Optimization (TACO)
Evaluating Heuristic Optimization Phase Order Search Algorithms
Proceedings of the International Symposium on Code Generation and Optimization
Low-power warp processor for power efficient high-performance embedded systems
Proceedings of the conference on Design, automation and test in Europe
Resource prediction for media stream decoding
Proceedings of the conference on Design, automation and test in Europe
Register pointer architecture for efficient embedded processors
Proceedings of the conference on Design, automation and test in Europe
Microarchitecture floorplanning for sub-threshold leakage reduction
Proceedings of the conference on Design, automation and test in Europe
Increasing cache capacity through word filtering
Proceedings of the 21st annual international conference on Supercomputing
SATSim: a superscalar architecture trace simulator using interactive animation
WCAE '00 Proceedings of the 2000 workshop on Computer architecture education
WCAE '06 Proceedings of the 2006 workshop on Computer architecture education: held in conjunction with the 33rd International Symposium on Computer Architecture
Understanding cache hierarchy interactions with a program-driven simulator
WCAE '07 Proceedings of the 2007 workshop on Computer architecture education
An analysis of timing violations due to spatially distributed thermal effects in global wires
Proceedings of the 44th annual Design Automation Conference
Automatic cache tuning for energy-efficiency using local regression modeling
Proceedings of the 44th annual Design Automation Conference
Phase-aware adaptive hardware selection for power-efficient scientific computations
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Thread warping: a framework for dynamic synthesis of thread accelerators
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Interconnect lifetime prediction for reliability-aware systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Investigating cache energy and latency break-even points in high performance processors
ACM SIGARCH Computer Architecture News
Hiding the misprediction penalty of a resource-efficient high-performance processor
ACM Transactions on Architecture and Code Optimization (TACO)
Journal of Parallel and Distributed Computing
Code compression for performance enhancement of variable-length embedded processors
ACM Transactions on Embedded Computing Systems (TECS)
A compiler-in-the-loop framework to explore horizontally partitioned cache architectures
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
Optimal pipeline depth with pipeline stage unification adoption
ACM SIGARCH Computer Architecture News - Special issue: ALPS '07 - advanced low power systems
Preventing timing errors on register writes: mechanisms of detections and recoveries
ACM SIGARCH Computer Architecture News - Special issue: ALPS '07 - advanced low power systems
Communications of the ACM - Web science
A SMT-ARM simulator and performance evaluation
SEPADS'06 Proceedings of the 5th WSEAS International Conference on Software Engineering, Parallel and Distributed Systems
Proceedings of the 18th ACM Great Lakes symposium on VLSI
A low-power phase change memory based hybrid cache architecture
Proceedings of the 18th ACM Great Lakes symposium on VLSI
Low power microarchitecture with instruction reuse
Proceedings of the 5th conference on Computing frontiers
Profiling of symmetric-encryption algorithms for a novel biomedical-implant architecture
Proceedings of the 5th conference on Computing frontiers
Addressing thermal nonuniformity in SMT workloads
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the 45th annual Design Automation Conference
Automated hardware-independent scenario identification
Proceedings of the 45th annual Design Automation Conference
ICESS '07 Proceedings of the 3rd international conference on Embedded Software and Systems
Reducing register pressure in SMT processors through L2-miss-driven early register release
ACM Transactions on Architecture and Code Optimization (TACO)
Thrifty BTB: A comprehensive solution for dynamic power reduction in branch target buffers
Microprocessors & Microsystems
Mitigating the impact of hardware defects on multimedia applications: a cross-layer approach
MM '08 Proceedings of the 16th ACM international conference on Multimedia
SimNP: a flexible platform for the simulation of a network processing system
Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Hill-climbing SMT processor resource distribution
ACM Transactions on Computer Systems (TOCS)
Finding Stress Patterns in Microprocessor Workloads
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Design and implementation of a MicroBlaze-based warp processor
ACM Transactions on Embedded Computing Systems (TECS)
Practical exhaustive optimization phase order exploration and evaluation
ACM Transactions on Architecture and Code Optimization (TACO)
Scalability and parallel execution of warp processing: dynamic hardware/software partitioning
International Journal of Parallel Programming
Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
A Dynamic Control Mechanism for Pipeline Stage Unification by Identifying Program Phases
IEICE - Transactions on Information and Systems
A mechanistic performance model for superscalar out-of-order processors
ACM Transactions on Computer Systems (TOCS)
Compiler-Assisted Memory Encryption for Embedded Processors
Transactions on High-Performance Embedded Architectures and Compilers II
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Two new techniques integrated for energy-efficient TLB design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamic performance tuning for speculative threads
Proceedings of the 36th annual international symposium on Computer architecture
Energy-performance Exploration of a CGA-based SDR Processor
Journal of Signal Processing Systems
Instruction-Level Fault Tolerance Configurability
Journal of Signal Processing Systems
Checkpoint allocation and release
ACM Transactions on Architecture and Code Optimization (TACO)
A Generic Instruction Set Simulator API for Timed and Untimed Simulation and Debug of MP2-SoCs
RSP '09 Proceedings of the 2009 IEEE/IFIP International Symposium on Rapid System Prototyping
A methodology for tuning two-level cache hierarchy considering energy and performance
Proceedings of the 22nd Annual Symposium on Integrated Circuits and System Design: Chip on the Dunes
The Impact of Resource Sharing Control on the Design of Multicore Processors
ICA3PP '09 Proceedings of the 9th International Conference on Algorithms and Architectures for Parallel Processing
Journal of Systems Architecture: the EUROMICRO Journal
SuSeSim: a fast simulation strategy to find optimal L1 cache configuration for embedded systems
CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Analysis of network processing workloads
Journal of Systems Architecture: the EUROMICRO Journal
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Custom floating-point unit generation for embedded systems
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Reducing leakage power with BTB access prediction
Integration, the VLSI Journal
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Partially protected caches to reduce failures due to soft errors in multimedia applications
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hardware acceleration for media/transaction applications in network processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing
Journal of Signal Processing Systems
Improving both the performance benefits and speed of optimization phase sequence searches
Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
Cache vulnerability equations for protecting data in embedded processor caches from soft errors
Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
Compiler-assisted memory encryption for embedded processors
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Sunflower: full-system, embedded, microarchitecture evaluation
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
A multiprocessor cache for massively parallel soc architectures
ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
Cryptographic side-channels from low-power cache memory
Cryptography and Coding'07 Proceedings of the 11th IMA international conference on Cryptography and coding
Folding active list for high performance and low power
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Program phase detection based dynamic control mechanisms for pipeline stage unification adoption
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Performance and energy efficient cache migrationapproach for thermal management in embedded systems
Proceedings of the 20th symposium on Great lakes symposium on VLSI
Studying compiler optimizations on superscalar processors through interval analysis
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Phase complexity surfaces: characterizing time-varying program behavior
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
A power-aware hybrid RAM-CAM renaming mechanism for fast recovery
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
B2P2: bounds based procedure placement for instruction TLB power reduction in embedded systems
Proceedings of the 13th International Workshop on Software & Compilers for Embedded Systems
Branch target buffer design for embedded processors
Microprocessors & Microsystems
Partitioning techniques for partially protected caches in resource-constrained embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Proceedings of the 47th Design Automation Conference
Power-aware BTB for modern processors
Computers and Electrical Engineering
Asilomar'09 Proceedings of the 43rd Asilomar conference on Signals, systems and computers
Increasing throughput of a RISC architecture using arithmetic data value speculation
Asilomar'09 Proceedings of the 43rd Asilomar conference on Signals, systems and computers
A compiler-microarchitecture hybrid approach to soft error reduction for register files
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A Simulation Framework for Rapid Analysis of Reconfigurable Computing Systems
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
DEW: a fast level 1 cache simulation approach for embedded processors with FIFO replacement policy
Proceedings of the Conference on Design, Automation and Test in Europe
Architectural support for low overhead detection of memory violations
Proceedings of the Conference on Design, Automation and Test in Europe
Eliminating false phase interactions to reduce optimization phase order search space
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
On improving performance and energy profiles of sparse scientific applications
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Conjugate gradient sparse solvers: performance-power characteristics
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Analysis of checksum-based execution schemes for pipelined processors
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Source-level timing annotation for fast and accurate TLM computation model generation
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
A precise high-level power consumption model for embedded systems software
EURASIP Journal on Embedded Systems
A combined optimization method for tuning two-level memory hierarchy considering energy consumption
EURASIP Journal on Embedded Systems
An ESL approach for energy consumption analysis of cache memories in SoC platforms
International Journal of Reconfigurable Computing - Special issue on selected papers from the southern programmable logic conference (SPL2010)
Proceedings of the 16th Asia and South Pacific Design Automation Conference
A hybrid hardware--software technique to improve reliability in embedded processors
ACM Transactions on Embedded Computing Systems (TECS)
WCET analysis of modern processors using multi-criteria optimisation
Empirical Software Engineering
Architectural enhancements for network congestion control applications
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Thread Warping: Dynamic and Transparent Synthesis of Thread Accelerators
ACM Transactions on Design Automation of Electronic Systems (TODAES)
The shape of the processor design space and its implications for early stage explorations
ACMOS'05 Proceedings of the 7th WSEAS international conference on Automatic control, modeling and simulation
Parallelism and data movement characterization of contemporary application classes
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Journal of Computer Science and Technology
A multi-granularity power modeling methodology for embedded processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Efficient code compression for embedded processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A framework for correction of multi-bit soft errors in L2 caches based on redundancy
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Pinned to the walls: impact of packaging and application properties on the memory and power walls
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
ACM Transactions on Architecture and Code Optimization (TACO)
TPM-SIM: a framework for performance evaluation of trusted platform modules
Proceedings of the 48th Design Automation Conference
A high-parallelism distributed scheduling mechanism for multi-core instruction-set simulation
Proceedings of the 48th Design Automation Conference
Thermal-aware floorplan schemes for reliable 3D multi-core processors
ICCSA'11 Proceedings of the 2011 international conference on Computational science and its applications - Volume Part II
Static speculation as post-link optimization for the Grid Alu processor
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Smart cache cleaning: energy efficient vulnerability reduction in embedded processors
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Localizing globals and statics to make C programs thread-safe
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Using silent writes in low-power traffic-aware ECC
PATMOS'11 Proceedings of the 21st international conference on Integrated circuit and system design: power and timing modeling, optimization, and simulation
Segmented bitline cache: exploiting non-uniform memory access patterns
HiPC'06 Proceedings of the 13th international conference on High Performance Computing
Exploiting video stream similarity for energy-efficient decoding
MMM'07 Proceedings of the 13th International conference on Multimedia Modeling - Volume Part II
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
The Java Virtual Machine in retargetable, high-performance instruction set simulation
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java
2L-MuRR: a compact register renaming scheme for SMT processors
ISPA'05 Proceedings of the Third international conference on Parallel and Distributed Processing and Applications
Performance and power evaluation of an intelligently adaptive data cache
HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Parallel branch prediction on GPU platform
HPCA'09 Proceedings of the Second international conference on High Performance Computing and Applications
CIPARSim: cache intersection property assisted rapid single-pass FIFO cache simulation technique
Proceedings of the International Conference on Computer-Aided Design
Proceedings of the International Conference on Computer-Aided Design
An offline approach for whole-program paths analysis using suffix arrays
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
A detailed study on phase predictors
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Compiler analysis and supports for leakage power reduction on microprocessors
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Power-aware branch logic: a hardware based technique for filtering access to branch logic
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Offline phase analysis and optimization for multi-configuration processors
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Area-Aware pipeline gating for embedded processors
PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Runtime biased pointer reuse analysis and its application to energy efficiency
PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems
An Extended SystemC Framework for Efficient HW/SW Co-Simulation
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Bit-sliced datapath for energy-efficient high performance microprocessors
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
Heuristic for two-level cache hierarchy exploration considering energy consumption and performance
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
System level multi-bank main memory configuration for energy reduction
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Characterizing time-varying program behavior using phase complexity surfaces
Transactions on High-Performance Embedded Architectures and Compilers IV
Finding extreme behaviors in microprocessor workloads
Transactions on High-Performance Embedded Architectures and Compilers IV
Path-Based reuse distance analysis
CC'06 Proceedings of the 15th international conference on Compiler Construction
Instruction set architectural guidelines for embedded packet-processing engines
Journal of Systems Architecture: the EUROMICRO Journal
Journal of Systems and Software
WCET-centric partial instruction cache locking
Proceedings of the 49th Annual Design Automation Conference
Run-time power-down strategies for real-time SDRAM memory controllers
Proceedings of the 49th Annual Design Automation Conference
Compiler support for value-based indirect branch prediction
CC'12 Proceedings of the 21st international conference on Compiler Construction
SST + gem5 = a scalable simulation infrastructure for high performance computing
Proceedings of the 5th International ICST Conference on Simulation Tools and Techniques
A Model Checking Based Approach to Bounding Worst-Case Execution Time for Multicore Processors
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on CAPA'09, Special Section on WHS'09, and Special Section VCPSS' 09
XIOSim: power-performance modeling of mobile x86 cores
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Enhancing effective throughput for transmission line-based bus
Proceedings of the 39th Annual International Symposium on Computer Architecture
A first-order mechanistic model for architectural vulnerability factor
Proceedings of the 39th Annual International Symposium on Computer Architecture
Power Modeling and Characterization of Computing Devices: A Survey
Foundations and Trends in Electronic Design Automation
Multi-level simultaneous multithreading scheduling to reduce the temperature of register files
Concurrency and Computation: Practice & Experience
Adaptive dynamic frequency scaling for thermal-aware 3d multi-core processors
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part IV
Combining recency of information with selective random and a victim cache in last-level caches
ACM Transactions on Architecture and Code Optimization (TACO)
On abstractions for timing analysis in the K framework
FOPARA'11 Proceedings of the Second international conference on Foundational and Practical Aspects of Resource Analysis
Exploiting input variations for energy reduction
PATMOS'07 Proceedings of the 17th international conference on Integrated Circuit and System Design: power and timing modeling, optimization and simulation
Proceedings of the 2013 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools
A distributed timing synchronization technique for parallel multi-core instruction-set simulation
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
ACM Transactions on Embedded Computing Systems (TECS)
ACM Transactions on Architecture and Code Optimization (TACO)
Towards a multiple-ISA embedded system
Journal of Systems Architecture: the EUROMICRO Journal
Web based multi-platform benchmark program construction in smartphone
Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication
Combining RAM technologies for hard-error recovery in L1 data caches working at very-low power modes
Proceedings of the Conference on Design, Automation and Test in Europe
Bounding SDRAM interference: detailed analysis vs. latency-rate analysis
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the Conference on Design, Automation and Test in Europe
Integrated instruction cache analysis and locking in multitasking real-time systems
Proceedings of the 50th Annual Design Automation Conference
Energy-efficient branch prediction with compiler-guided history stack
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Reli: hardware/software checkpoint and recovery scheme for embedded processors
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
International Journal of High Performance Computing Applications
Enabling energy efficient reliability in embedded systems through smart cache cleaning
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
Exploiting phase inter-dependencies for faster iterative compiler optimization phase order searches
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Performance and power profiling for emulated Android systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Configurable range memory for effective data reuse on programmable accelerators
ACM Transactions on Design Automation of Electronic Systems (TODAES)
This document describes release 2.0 of the SimpleScalar tool set, a suite of free, publicly available simulation tools that offer both detailed and high-performance simulation of modern microprocessors. The new release offers more tools and capabilities, precompiled binaries, cleaner interfaces, better documentation, easier installation, improved portability, and higher performance. This paper contains a complete description of the tool set, including retrieval and installation instructions, a description of how to use the tools, a description of the target SimpleScalar architecture, and many details about the internals of the tools and how to customize them. With this guide, the tool set can be brought up and generating results in under an hour (on supported platforms).