Parallel algorithms for VLSI computer-aided design
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Memory system characterization of commercial workloads
Proceedings of the 25th annual international symposium on Computer architecture
Venti: A New Approach to Archival Storage
FAST '02 Proceedings of the Conference on File and Storage Technologies
Variability in Architectural Simulations of Multi-Threaded Workloads
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Particle-based fluid simulation for interactive applications
Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation
Articulated Body Motion Capture by Stochastic Search
International Journal of Computer Vision
Automatic determination of facial muscle activations from sparse motion capture marker data
ACM SIGGRAPH 2005 Papers
Ferret: a toolkit for content-based similarity search of feature-rich data
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
ParallAX: an architecture for real-time physics
Proceedings of the 34th annual international symposium on Computer architecture
Overview of the H.264/AVC video coding standard
IEEE Transactions on Circuits and Systems for Video Technology
Serialization sets: a dynamic dependence-based parallel execution model
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Detecting and tolerating asymmetric races
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
PARSEC: hardware profiling of emerging workloads for CMP design
Proceedings of the 23rd international conference on Supercomputing
Load balancing using work-stealing for pipeline parallelism in emerging applications
Proceedings of the 23rd international conference on Supercomputing
LiteRace: effective sampling for lightweight data-race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Hybrid cache architecture with disparate memory technologies
Proceedings of the 36th annual international symposium on Computer architecture
A case for an interleaving constrained shared-memory multi-processor
Proceedings of the 36th annual international symposium on Computer architecture
SigRace: signature-based data race detection
Proceedings of the 36th annual international symposium on Computer architecture
CPU, SMP and GPU implementations of Nohalo level 1, a fast co-convex antialiasing image resampler
C3S2E '09 Proceedings of the 2nd Canadian Conference on Computer Science and Software Engineering
Frequent itemset mining on graphics processors
Proceedings of the Fifth International Workshop on Data Management on New Hardware
vGreen: a system for energy efficient computing in virtualized environments
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
Best of both worlds: A bus enhanced NoC (BENoC)
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Flow-aware allocation for on-chip networks
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Last Bank: Dealing with Address Reuse in Non-Uniform Cache Architecture for CMPs
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Segment gating for static energy reduction in Networks-on-Chip
Proceedings of the 2nd International Workshop on Network on Chip Architectures
Future scaling of processor-memory interfaces
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
In-network coherence filtering: snoopy coherence without broadcasts
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Preemptive virtual clock: a flexible, efficient, and cost-effective QOS scheme for networks-on-chip
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
A case for dynamic frequency tuning in on-chip networks
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Ordering decoupled metadata accesses in multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Optimizing shared cache behavior of chip multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Finding concurrency bugs with context-aware communication graphs
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Operating system scheduling for efficient online self-test in robust systems
Proceedings of the 2009 International Conference on Computer-Aided Design
Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
AASH: an asymmetry-aware scheduler for hypervisors
Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
CoreDet: a compiler and runtime system for deterministic multithreaded execution
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Speculative parallelization using software multi-threaded transactions
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Respec: efficient online multiprocessor replay via speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Decoupling contention management from scheduling
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Inter-core cooperative TLB for chip multiprocessors
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Applying statistical machine learning to multicore voltage & frequency scaling
Proceedings of the 7th ACM international conference on Computing frontiers
LRU-PEA: a smart replacement policy for non-uniform cache architectures on chip multiprocessors
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Cache topology aware computation mapping for multicores
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Green: a framework for supporting energy-conscious programming using controlled approximation
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
DRFX: a simple and efficient memory model for concurrent programming languages
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Evaluating iterative optimization across 1000 datasets
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Opportunities for concurrent dynamic analysis with explicit inter-core communication
Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
Proceedings of the 7th international conference on Autonomic computing
The auction: optimizing banks usage in Non-Uniform Cache Architectures
Proceedings of the 24th ACM International Conference on Supercomputing
SAMS multi-layout memory: providing multiple views of data to boost SIMD performance
Proceedings of the 24th ACM International Conference on Supercomputing
An approach to resource-aware co-scheduling for CMPs
Proceedings of the 24th ACM International Conference on Supercomputing
Simplifying concurrent algorithms by exploiting hardware transactional memory
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Silicon-photonic network architectures for scalable, power-efficient multi-chip systems
Proceedings of the 37th annual international symposium on Computer architecture
Thread tailor: dynamically weaving threads together for efficient, adaptive parallel applications
Proceedings of the 37th annual international symposium on Computer architecture
A case for FAME: FPGA architecture model execution
Proceedings of the 37th annual international symposium on Computer architecture
Data marshaling for multi-core architectures
Proceedings of the 37th annual international symposium on Computer architecture
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU
Proceedings of the 37th annual international symposium on Computer architecture
Relax: an architectural framework for software recovery of hardware faults
Proceedings of the 37th annual international symposium on Computer architecture
Performance Evaluation of a Multicore System with Optically Connected Memory Modules
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Cost-driven 3D integration with interconnect layers
Proceedings of the 47th Design Automation Conference
Virtual channels vs. multiple physical networks: a comparative analysis
Proceedings of the 47th Design Automation Conference
RAMP gold: an FPGA-based architecture simulator for multiprocessors
Proceedings of the 47th Design Automation Conference
Extensible transactional memory testbed
Journal of Parallel and Distributed Computing
A practical way to extend shared memory support beyond a motherboard at low cost
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Accelerating multicore reuse distance analysis with sampling and parallelization
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Subspace snooping: filtering snoops with operating system support
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Proximity coherence for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Feedback-directed pipeline parallelism
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Scalable hardware support for conditional parallelization
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
An OpenCL framework for heterogeneous multicores with local memory
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
ATAC: a 1000-core cache-coherent processor with on-chip optical network
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Scaling of the PARSEC benchmark inputs
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Distributed systems meet economics: pricing in the cloud
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
Patterns and statistical analysis for understanding reduced resource computing
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Inferring arbitrary distributions for data and computation
Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion
Energy- and endurance-aware design of phase change memory caches
Proceedings of the Conference on Design, Automation and Test in Europe
Power and performance of read-write aware hybrid caches with non-volatile memories
Proceedings of the Conference on Design, Automation and Test in Europe
Balancing memory and performance through selective flushing of software code caches
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Design exploration of hybrid caches with disparate memory technologies
ACM Transactions on Architecture and Code Optimization (TACO)
A trace simplification technique for effective debugging of concurrent programs
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
On-Chip Network Evaluation Framework
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
CPM in CMPs: Coordinated Power Management in Chip-Multiprocessors
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Optimizing power and performance for reliable on-chip networks
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Efficient throughput-guarantees for latency-sensitive networks-on-chip
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
A generic adaptive path-based routing method for MPSoCs
Journal of Systems Architecture: the EUROMICRO Journal
Thread criticality support in on-chip networks
Proceedings of the Third International Workshop on Network on Chip Architectures
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Towards architecture independent metrics for multicore performance analysis
ACM SIGMETRICS Performance Evaluation Review
Proceedings of the 20th ACM SIGPLAN workshop on Partial evaluation and program manipulation
Tolerating Concurrency Bugs Using Transactions as Lifeguards
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Architectural Support for Fair Reader-Writer Locking
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Synergistic TLBs for High Performance Address Translation in Chip Multiprocessors
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Pseudo-Circuit: Accelerating Communication for On-Chip Interconnection Networks
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Virtual Snooping: Filtering Snoops in Virtualized Multi-cores
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Probabilistic Distance-Based Arbitration: Providing Equality of Service for Many-Core CMPs
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Scalable Speculative Parallelization on Commodity Clusters
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Memory Latency Reduction via Thread Throttling
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
The ZCache: Decoupling Ways and Associativity
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Scientific Programming - Exploring Languages for Expressing Medium to Massive On-Chip Parallelism
Communications of the ACM
COREMU: a scalable and portable parallel full-system emulator
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Achieving a single compute device image in OpenCL for multiple GPUs
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
SWEEP: evaluating computer system energy efficiency using synthetic workloads
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Cache equalizer: a placement mechanism for chip multiprocessor distributed shared caches
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
NoC-aware cache design for multithreaded execution on tiled chip multiprocessors
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Efficient processor support for DRFx, a memory model with exceptions
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
RCDC: a relaxed consistency deterministic computer
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Dynamic knobs for responsive power-aware computing
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Memory-efficient frequent-itemset mining
Proceedings of the 14th International Conference on Extending Database Technology
Adaptive timekeeping replacement: Fine-grained capacity management for shared CMP caches
ACM Transactions on Architecture and Code Optimization (TACO)
Parallelization libraries: Characterizing and reducing overheads
ACM Transactions on Architecture and Code Optimization (TACO)
RMS-TM: a comprehensive benchmark suite for transactional memory systems
Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
RAFT: A router architecture with frequency tuning for on-chip networks
Journal of Parallel and Distributed Computing
Characterizing the impact of process variation on 45 nm NoC-based CMPs
Journal of Parallel and Distributed Computing
Array regrouping on CMP with non-uniform cache sharing
LCPC'10 Proceedings of the 23rd international conference on Languages and compilers for parallel computing
Strategies for preparing computer science students for the multicore world
Proceedings of the 2010 ITiCSE working group reports
Run-time energy management of manycore systems through reconfigurable interconnects
Proceedings of the 21st edition of the Great Lakes Symposium on VLSI
Research note: C-AMTE: A location mechanism for flexible cache management in chip multiprocessors
Journal of Parallel and Distributed Computing
LIME: a framework for debugging load imbalance in multi-threaded execution
Proceedings of the 33rd International Conference on Software Engineering
Inflation and deflation of self-adaptive applications
Proceedings of the 6th International Symposium on Software Engineering for Adaptive and Self-Managing Systems
A study of transactional memory vs. locks in practice
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
International Journal of Reconfigurable Computing - Special issue on selected papers from the 17th reconfigurable architectures workshop (RAW2010)
Parallelism orchestration using DoPE: the degree of parallelism executive
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Automatic CPU-GPU communication management and optimization
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
A case for an SC-preserving compiler
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Isolating and understanding concurrency errors using reconstructed execution fragments
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Synchronization via scheduling: techniques for efficiently managing shared state
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Optimizing the datacenter for data-centric workloads
Proceedings of the international conference on Supercomputing
A composite and scalable cache coherence protocol for large scale CMPs
Proceedings of the international conference on Supercomputing
Controlling cache utilization of HPC applications
Proceedings of the international conference on Supercomputing
Exploring partitioning methods for 3D Networks-on-Chip utilizing adaptive routing model
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Two-hop free-space based optical interconnects for chip multiprocessors
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
A distributed and topology-agnostic approach for on-line NoC testing
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Delay analysis of wormhole based heterogeneous NoC
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Reducing Network-on-Chip energy consumption through spatial locality speculation
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
A Comprehensive Networks-on-Chip Simulator for Error Control Explorations
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Virtualizing performance asymmetric multi-core systems
Proceedings of the 38th annual international symposium on Computer architecture
Increasing the effectiveness of directory caches by deactivating coherence for private memory blocks
Proceedings of the 38th annual international symposium on Computer architecture
TLSync: support for multiple fast barriers using on-chip transmission lines
Proceedings of the 38th annual international symposium on Computer architecture
Demand-driven software race detection using hardware performance counters
Proceedings of the 38th annual international symposium on Computer architecture
Sampling + DMR: practical and low-overhead permanent fault detection
Proceedings of the 38th annual international symposium on Computer architecture
An abacus turn model for time/space-efficient reconfigurable routing
Proceedings of the 38th annual international symposium on Computer architecture
A case for globally shared-medium on-chip interconnect
Proceedings of the 38th annual international symposium on Computer architecture
Rapid identification of architectural bottlenecks via precise event counting
Proceedings of the 38th annual international symposium on Computer architecture
Dark silicon and the end of multicore scaling
Proceedings of the 38th annual international symposium on Computer architecture
Moguls: a model to explore the memory hierarchy for bandwidth improvements
Proceedings of the 38th annual international symposium on Computer architecture
A case for heterogeneous on-chip interconnects for CMPs
Proceedings of the 38th annual international symposium on Computer architecture
Kilo-NOC: a heterogeneous network-on-chip architecture for scalability and service guarantees
Proceedings of the 38th annual international symposium on Computer architecture
Scalable power control for many-core architectures running multi-threaded applications
Proceedings of the 38th annual international symposium on Computer architecture
Considerations when evaluating microprocessor platforms
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
RADBench: a concurrency bug benchmark suite
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Deterministic OpenMP for race-free parallelism
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Parallel pattern detection for architectural improvements
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Mobile processors for energy-efficient web search
ACM Transactions on Computer Systems (TOCS)
Pruning hardware evaluation space via correlation-driven application similarity analysis
Proceedings of the 8th ACM International Conference on Computing Frontiers
A design space exploration of transmission-line links for on-chip interconnect
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
NoC frequency scaling with flexible-pipeline routers
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
A helper thread based dynamic cache partitioning scheme for multithreaded applications
Proceedings of the 48th Design Automation Conference
MARSS: a full system simulator for multicore x86 CPUs
Proceedings of the 48th Design Automation Conference
Managing performance vs. accuracy trade-offs with loop perforation
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
Programming heterogeneous multicore systems using threading building blocks
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Dynamic, multi-core cache coherence architecture for power-sensitive mobile processors
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Optimal memory controller placement for chip multiprocessor
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Probabilistically accurate program transformations
SAS'11 Proceedings of the 18th international conference on Static analysis
A read-write aware replacement policy for phase change memory
APPT'11 Proceedings of the 9th international conference on Advanced parallel processing technologies
A semi-automatic scratchpad memory management framework for CMP
APPT'11 Proceedings of the 9th international conference on Advanced parallel processing technologies
Energy efficient many-core processor for recognition and mining using spin-based memory
NANOARCH '11 Proceedings of the 2011 IEEE/ACM International Symposium on Nanoscale Architectures
System implications of memory reliability in exascale computing
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
A study of 3D Network-on-Chip design for data parallel H.264 coding
Microprocessors & Microsystems
A minimal average accessing time scheduler for multicore processors
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part II
Bahurupi: A polymorphic heterogeneous multi-core architecture
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
FlexSig: Implementing flexible hardware signatures
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
The migration prefetcher: Anticipating data promotion in dynamic NUCA caches
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Thread Tranquilizer: Dynamically reducing performance variation
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
VSim: Simulating multi-server setups at near native hardware speed
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
A highly robust distributed fault-tolerant routing algorithm for NoCs with localized rerouting
Proceedings of the 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop
Bandwidth-aware reconfigurable cache design with hybrid memory technologies
Proceedings of the International Conference on Computer-Aided Design
Co-design of channel buffers and crossbar organizations in NoCs architectures
Proceedings of the International Conference on Computer-Aided Design
Improving System Energy Efficiency with Memory Rank Subsetting
ACM Transactions on Architecture and Code Optimization (TACO)
OpenCL as a unified programming model for heterogeneous CPU/GPU clusters
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Clearing the clouds: a study of emerging scale-out workloads on modern hardware
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Iterative optimization for the data center
This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs). Previously available benchmarks for multiprocessors have focused on high-performance computing applications and used a limited number of synchronization methods. PARSEC includes emerging applications in recognition, mining and synthesis (RMS) as well as systems applications that mimic large-scale multithreaded commercial programs. Our characterization shows that the benchmark suite covers a wide spectrum of working sets, locality, data sharing, synchronization and off-chip traffic. The benchmark suite has been made available to the public.