Benchmarking has become one of the most important methods for quantitative performance evaluation of processor and computer system designs. Benchmarking modern multiprocessors such as chip multiprocessors is challenging because of their application domain, scalability, and parallelism requirements. In my thesis, I have developed a methodology for designing effective benchmark suites and demonstrated its effectiveness by developing and deploying a benchmark suite for evaluating multiprocessors. More specifically, this thesis makes four contributions. First, the thesis shows that a new benchmark suite for multiprocessors is needed because the behavior of modern parallel programs differs significantly from that of the programs in SPLASH-2, the most popular parallel benchmark suite, which was developed over ten years ago. Second, the thesis quantitatively describes the requirements and characteristics of a set of multithreaded programs and their underlying technology trends. Third, the thesis presents a systematic approach to scaling and selecting benchmark inputs, with the goal of optimizing benchmarking accuracy subject to a constraint on execution or simulation time. Finally, the thesis describes a parallel benchmark suite called PARSEC for evaluating modern shared-memory multiprocessors. Since its initial release, PARSEC has been adopted by many architecture groups in both research and industry.
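The third contribution, choosing inputs to maximize accuracy under a time limit, can be read as a constrained selection problem. The sketch below is only an illustration of that reading and is not taken from the thesis: the program names, input sizes, time costs, and accuracy scores are all hypothetical, and it assumes that accuracy is additive across programs and that an exhaustive search is affordable.

```python
from itertools import product

# Hypothetical data: for each program, candidate inputs as
# (input name, time cost, accuracy score). None of these values
# come from the thesis; they only illustrate the trade-off.
candidates = {
    "prog_a": [("small", 1, 0.50), ("medium", 4, 0.80), ("large", 10, 0.95)],
    "prog_b": [("small", 2, 0.60), ("medium", 5, 0.85), ("large", 12, 0.97)],
}


def select_inputs(candidates, time_budget):
    """Pick one input per program, maximizing summed accuracy within the budget.

    Returns (total accuracy, total time, {program: input name}),
    or None when no combination fits the budget.
    """
    programs = sorted(candidates)
    best = None
    # Exhaustive search over every combination of one input per program.
    for combo in product(*(candidates[p] for p in programs)):
        total_time = sum(choice[1] for choice in combo)
        if total_time > time_budget:
            continue  # violates the execution or simulation time constraint
        total_accuracy = sum(choice[2] for choice in combo)
        if best is None or total_accuracy > best[0]:
            chosen = {p: choice[0] for p, choice in zip(programs, combo)}
            best = (total_accuracy, total_time, chosen)
    return best


result = select_inputs(candidates, time_budget=10)
print(result)
```

With these made-up numbers and a budget of 10 time units, the search settles on the medium input for both programs, since any combination that includes a large input exceeds the budget. Exhaustive search grows exponentially with the number of programs, so a larger suite would need a dynamic-programming or approximation approach instead.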