Using the SimOS machine simulator to study complex computer systems
ACM Transactions on Modeling and Computer Simulation (TOMACS)
A virtual machine emulator for performance evaluation
Communications of the ACM
Communications of the ACM
A Design for Efficient Simulation of a Multiprocessor
MASCOTS '93 Proceedings of the International Workshop on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems
Using complete machine simulation to understand computer system behavior
Full-system timing-first simulation
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
High-level modeling and FPGA prototyping of microprocessors
FPGA '03 Proceedings of the 2003 ACM/SIGDA eleventh international symposium on Field programmable gate arrays
SMP system interconnect instrumentation for performance analysis
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Quantifying instruction criticality for shared memory multiprocessors
Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures
Memory System Behavior of Java-Based Middleware
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Variability in Architectural Simulations of Multi-Threaded Workloads
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
Token coherence: decoupling performance and correctness
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Collecting whole-system reference traces of multiprogrammed and multithreaded workloads
WOSP '04 Proceedings of the 4th international workshop on Software and performance
Locality-Based Online Trace Compression
IEEE Transactions on Computers
Cycle-accurate power analysis for multiprocessor systems-on-a-chip
Proceedings of the 14th ACM Great Lakes symposium on VLSI
The design and implementation of FIT: a flexible instrumentation toolkit
Proceedings of the 5th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
Adaptive Cache Compression for High-Performance Processors
Proceedings of the 31st annual international symposium on Computer architecture
Fingerprinting: bounding soft-error detection latency and bandwidth
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Performance directed energy management for main memory and disks
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Compiler Estimation of Load Imbalance Overhead in Speculative Parallelization
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Automatic Synthesis of High-Speed Processor Simulators
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Managing Wire Delay in Large Chip-Multiprocessor Caches
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Memory Controller Optimizations for Web Servers
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Instruction Scheduling for Dynamic Hardware Configurations
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Prediction model for evaluation of reconfigurable interconnects in distributed shared-memory systems
Proceedings of the 2005 international workshop on System level interconnect prediction
Exploiting Barriers to Optimize Power Consumption of CMPs
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Traffic Temporal Analysis for Reconfigurable Interconnects in Shared-Memory Systems
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04
ACM SIGMETRICS Performance Evaluation Review - Special issue on tools for computer architecture research
Memory coherence activity prediction in commercial workloads
WMPI '04 Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture
A flexible simulation framework for graphics architectures
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Skewed caches from a low-power perspective
Proceedings of the 2nd conference on Computing frontiers
A serializability violation detector for shared-memory server programs
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Temporal Streaming of Shared Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging
Proceedings of the 32nd annual international symposium on Computer Architecture
Optimizing Replication, Communication, and Capacity Allocation in CMPs
Proceedings of the 32nd annual international symposium on Computer Architecture
Dynamic Verification of Sequential Consistency
Proceedings of the 32nd annual international symposium on Computer Architecture
MPARM: Exploring the Multi-Processor SoC Design Space with SystemC
Journal of VLSI Signal Processing Systems
Evaluating IA-32 web servers through Simics: a practical experience
Journal of Systems Architecture: the EUROMICRO Journal
Energy reduction in multiprocessor systems using transactional memory
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Communication Benchmarking and Performance Modelling of MPI Programs on Cluster Computers
The Journal of Supercomputing
Performance directed energy management for main memory and disks
ACM Transactions on Storage (TOS)
Profiling soft-core processor applications for hardware/software partitioning
Journal of Systems Architecture: the EUROMICRO Journal
Maximizing CMP Throughput with Mediocre Cores
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Store-Ordered Streaming of Shared Memory
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Performance Analysis of System Overheads in TCP/IP Workloads
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Mondrix: memory isolation for Linux using Mondriaan memory protection
Proceedings of the twentieth ACM symposium on Operating systems principles
Detecting past and present intrusions through vulnerability-specific predicates
Proceedings of the twentieth ACM symposium on Operating systems principles
Simulating Commercial Java Throughput Workloads: A Case Study
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Evaluating the impact of the simulation environment on experimentation results
Performance Evaluation
The RASE (Rapid, Accurate Simulation Environment) for chip multiprocessors
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
A chip prototyping substrate: the flexible architecture for simulation and testing (FAST)
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
Congestion modeling for reconfigurable inter-processor networks
Proceedings of the 2006 international workshop on System-level interconnect prediction
Profiling of parallel processing programs on shared memory multiprocessors using Simics
ACM SIGARCH Computer Architecture News - Special issue on the 2005 workshop on binary instrumentation and application
Compiler-driven FPGA-area allocation for reconfigurable computing
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Design and Management of 3D Chip Multiprocessors Using Network-in-Memory
Proceedings of the 33rd annual international symposium on Computer Architecture
Cooperative Caching for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Interconnect-Aware Coherence Protocols for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Spin Detection Hardware for Improved Management of Multithreaded Systems
IEEE Transactions on Parallel and Distributed Systems
Automatic logging of operating system effects to guide application-level architecture simulation
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
Optimizing locality and scalability of embedded Runge-Kutta solvers using block-based pipelining
Journal of Parallel and Distributed Computing
Optimizing bus energy consumption of on-chip multiprocessors using frequent values
Journal of Systems Architecture: the EUROMICRO Journal - Special issue: Parallel, distributed and network-based processing
The design and utility of the ML-RSIM system simulator
Journal of Systems Architecture: the EUROMICRO Journal
Architectural support for operating system-driven CMP cache management
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Hardware support for spin management in overcommitted virtual machines
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
The M5 Simulator: Modeling Networked Systems
IEEE Micro
Reducing power through compiler-directed barrier synchronization elimination
Proceedings of the 2006 international symposium on Low power electronics and design
Improving instruction cache performance in OLTP
ACM Transactions on Database Systems (TODS)
A regulated transitive reduction (RTR) for longer memory race recording
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Recording shared memory dependencies using strata
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Computation spreading: employing hardware migration to specialize CMP cores on-the-fly
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Unbounded page-based transactional memory
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Supporting nested transactional memory in LogTM
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Hardware transactional memory support for lightweight dynamic language evolution
Companion to the 21st ACM SIGPLAN symposium on Object-oriented programming systems, languages, and applications
Software—Practice & Experience
Multigrid and Gauss-Seidel smoothers revisited: parallelization on chip multiprocessors
Proceedings of the 20th annual international conference on Supercomputing
TMA: a trap-based memory architecture
Proceedings of the 20th annual international conference on Supercomputing
Architecture of a Self-Checkpointing Microprocessor that Incorporates Nanomagnetic Devices
IEEE Transactions on Computers
Coherence Ordering for Ring-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
The Molen compiler for reconfigurable processors
ACM Transactions on Embedded Computing Systems (TECS)
SMP-SoC is the answer if you ask the right questions
SAICSIT '06 Proceedings of the 2006 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries
An FPGA-based Pentium® in a complete desktop system
Proceedings of the 2007 ACM/SIGDA 15th international symposium on Field programmable gate arrays
TCP offload through connection handoff
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Synthetic traffic generation as a tool for dynamic interconnect evaluation
Proceedings of the 2007 international workshop on System level interconnect prediction
CMP cache performance projection: accessibility vs. capacity
ACM SIGARCH Computer Architecture News
Starvation-free commit arbitration policies for transactional memory systems
ACM SIGARCH Computer Architecture News
Advanced firmware verification using a code simulator for the IBM System z9
IBM Journal of Research and Development
Unichos: a full system simulator for thin client platform
Proceedings of the 2007 ACM symposium on Applied computing
Debugging operating systems with time-traveling virtual machines
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Predicting reconfigurable interconnect performance in distributed shared-memory systems
Integration, the VLSI Journal
Making the fast case common and the uncommon case simple in unbounded transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
Performance pathologies in hardware transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
Rotary router: an efficient architecture for CMP interconnection networks
Proceedings of the 34th annual international symposium on Computer architecture
A novel dimensionally-decomposed router for on-chip communication in 3D architectures
Proceedings of the 34th annual international symposium on Computer architecture
ParallAX: an architecture for real-time physics
Proceedings of the 34th annual international symposium on Computer architecture
Architectural implications of brick and mortar silicon manufacturing
Proceedings of the 34th annual international symposium on Computer architecture
VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization
Proceedings of the 34th annual international symposium on Computer architecture
A compiler cost model for speculative parallelization
ACM Transactions on Architecture and Code Optimization (TACO)
Profile-driven energy reduction in network-on-chips
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Establishing the genuinity of remote computer systems
SSYM'03 Proceedings of the 12th conference on USENIX Security Symposium - Volume 12
Understanding data lifetime via whole system simulation
SSYM'04 Proceedings of the 13th conference on USENIX Security Symposium - Volume 13
I/O processing in a virtualized platform: a simulation-driven approach
Proceedings of the 3rd international conference on Virtual execution environments
PinOS: a programmable framework for whole-system dynamic instrumentation
Proceedings of the 3rd international conference on Virtual execution environments
Managing energy-performance tradeoffs for multithreaded applications on multiprocessor architectures
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
HW-SW emulation framework for temperature-aware design in MPSoCs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
The Power of Priority: NoC Based Distributed Cache Coherency
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
Concepts and components of full-system simulation of distributed memory parallel computers
Proceedings of the 16th international symposium on High performance distributed computing
Cooperative cache partitioning for chip multiprocessors
Proceedings of the 21st annual international conference on Supercomputing
RiceNIC: a reconfigurable network interface for experimental research and education
Proceedings of the 2007 workshop on Experimental computer science
A detailed performance analysis of UDP/IP, TCP/IP, and M-VIA network protocols using Linux/SimOS
Journal of High Speed Networks
TxLinux: using and managing hardware transactional memory in an operating system
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Connection handoff policies for TCP offload network interfaces
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Applying Statistical Sampling for Fast and Efficient Simulation of Commercial Workloads
IEEE Transactions on Computers
The impact of wrong-path memory references in cache-coherent multiprocessor systems
Journal of Parallel and Distributed Computing
Steps towards cache-resident transaction processing
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Characterization of Apache web server with SPECweb2005
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
A Desktop Computer with a Reconfigurable Pentium®
ACM Transactions on Reconfigurable Technology and Systems (TRETS) - Special edition on the 15th international symposium on FPGAs
Parallelization of IBM Mambo system simulator in functional modes
ACM SIGOPS Operating Systems Review
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Hardbound: architectural support for spatial safety of the C programming language
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Adaptive set pinning: managing shared caches in chip multiprocessors
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Adapting to intermittent faults in multicore systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
A superscalar simulation employing Poisson distributed stalls
Computers and Electrical Engineering
Rent's rule and parallel programs: characterizing network traffic behavior
Proceedings of the 2008 international workshop on System level interconnect prediction
Is the optimism in optimistic concurrency warranted?
HOTOS'07 Proceedings of the 11th USENIX workshop on Hot topics in operating systems
A case for low-complexity MP architectures
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Power management of variation aware chip multiprocessors
Proceedings of the 18th ACM Great Lakes symposium on VLSI
ILP-Based energy minimization techniques for banked memories
ACM Transactions on Design Automation of Electronic Systems (TODAES)
HMTT: a platform independent full-system memory trace monitoring system
SIGMETRICS '08 Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Software-directed combined CPU/link voltage scaling for NoC-based CMPs
SIGMETRICS '08 Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Implementation and evaluation of a migration-based NUCA design for chip multiprocessors
SIGMETRICS '08 Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
FaCSim: a fast and cycle-accurate architecture simulator for embedded systems
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Utilizing shared data in chip multiprocessors with the Nahalal architecture
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
MIRA: A Multi-layered On-Chip Interconnect Router Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Reducing the Interconnection Network Cost of Chip Multiprocessors
NOCS '08 Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip
SP-NUCA: a cost effective dynamic non-uniform cache architecture
ACM SIGARCH Computer Architecture News
Hardware monitors for dynamic page migration
Journal of Parallel and Distributed Computing
A novel migration-based NUCA design for chip multiprocessors
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Extending CC-NUMA systems to support write update optimizations
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A framework for end-to-end simulation of high-performance computing systems
Proceedings of the 1st international conference on Simulation tools and techniques for communications, networks and systems & workshops
Synchronized network emulation: matching prototypes with complex simulations
ACM SIGMETRICS Performance Evaluation Review
Characterizing and modeling the behavior of context switch misses
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Distributed cooperative caching
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Improving support for locality and fine-grain sharing in chip multiprocessors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Protocol offload analysis by simulation
Journal of Systems Architecture: the EUROMICRO Journal
A compiler-directed data prefetching scheme for chip multiprocessors
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Communication Based Proactive Link Power Management
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
In-Network Caching for Chip Multiprocessors
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Maximum benefit from a minimal HTM
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Mixed-mode multicore reliability
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Version management alternatives for hardware transactional memory
Proceedings of the 9th workshop on MEmory performance: DEaling with Applications, systems and architecture
Hybrid-compiled simulation: An efficient technique for instruction-set architecture simulation
ACM Transactions on Embedded Computing Systems (TECS)
MC-Sim: an efficient simulation tool for MPSoC designs
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Integrated code and data placement in two-dimensional mesh based chip multiprocessors
Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Frequent value compression in packet-based NoC architectures
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Token tenure: PATCHing token counting using directory-based cache coherence
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
SHARK: Architectural support for autonomic protection against stealth by rootkit exploits
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Testudo: Heavyweight security analysis via statistical sampling
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Adaptive data compression for high-performance low-power on-chip networks
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Notary: Hardware techniques to enhance signatures
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Dependence-aware transactional memory for increased concurrency
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Power reduction of CMP communication networks via RF-interconnects
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Extending concurrency of transactional memory programs by using value prediction
Proceedings of the 6th ACM conference on Computing frontiers
Dynamic heterogeneity and the need for multicore virtualization
ACM SIGOPS Operating Systems Review
ProtoFlex: Towards Scalable, Full-System Multiprocessor Simulations Using FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Trace-driven co-simulation of high-performance computing systems using OMNeT++
Proceedings of the 2nd International Conference on Simulation Tools and Techniques
Limited early value communication to improve performance of transactional memory
Proceedings of the 23rd international conference on Supercomputing
Towards device emulation code generation
Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Precise simulation of interrupts using a rollback mechanism
Proceedings of the 12th International Workshop on Software and Compilers for Embedded Systems
A durable and energy efficient main memory using phase change memory technology
Proceedings of the 36th annual international symposium on Computer architecture
Memory mapped ECC: low-cost error protection for last level caches
Proceedings of the 36th annual international symposium on Computer architecture
Flexible reference-counting-based hardware acceleration for garbage collection
Proceedings of the 36th annual international symposium on Computer architecture
Triplet-based topology for on-chip networks
WSEAS Transactions on Computers
Proceedings of the eighteenth international symposium on Software testing and analysis
Improving the Performance of Bandwidth-Demanding Applications by a Distributed Network Interface
IWANN '09 Proceedings of the 10th International Work-Conference on Artificial Neural Networks: Part II: Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living
ACM SIGARCH Computer Architecture News
NZTM: nonblocking zero-indirection transactional memory
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
Exploration of 3D stacked L2 cache design for high performance and efficient thermal control
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
PPPJ '09 Proceedings of the 7th International Conference on Principles and Practice of Programming in Java
Best of both worlds: A bus enhanced NoC (BENoC)
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Flow-aware allocation for on-chip networks
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Dealing with Traffic-Area Trade-Off in Direct Coherence Protocols for Many-Core CMPs
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Hybrid Techniques for Fast Multicore Simulation
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Last Bank: Dealing with Address Reuse in Non-Uniform Cache Architecture for CMPs
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
REPAS: Reliable Execution for Parallel ApplicationS in Tiled-CMPs
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
CheckerCore: enhancing an FPGA soft core to capture worst-case execution times
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
MPTLsim: a simulator for X86 multicore processors
Proceedings of the 46th Annual Design Automation Conference
Dynamic thread and data mapping for NoC based CMPs
Proceedings of the 46th Annual Design Automation Conference
Interconnection network simulation using traces of MPI applications
International Journal of Parallel Programming
Full-system simulation of distributed memory multicomputers
Cluster Computing
A performance evaluation of 2D-mesh, ring, and crossbar interconnects for chip multi-processors
Proceedings of the 2nd International Workshop on Network on Chip Architectures
Reconsidering algorithms for iterative solvers in the multicore era
International Journal of Computational Science and Engineering
A systematic approach to profiling for hardware/software partitioning
Computers and Electrical Engineering
A case for integrated processor-cache partitioning in chip multiprocessors
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Flexible cache error protection using an ECC FIFO
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Using data compression for increasing memory system utilization
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A case for dynamic frequency tuning in on-chip networks
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
SHARP control: controlled shared cache management in chip multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Offline symbolic analysis for multi-processor execution replay
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Energy reduction for STT-RAM using early write termination
Proceedings of the 2009 International Conference on Computer-Aided Design
Intra-application shared cache partitioning for multithreaded applications
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Modeling transactional memory workload performance
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Introspection of a Java™ virtual machine under simulation
Finding representative workloads for computer system design
A cross-layer approach to heterogeneity and reliability
MEMOCODE'09 Proceedings of the 7th IEEE/ACM international conference on Formal Methods and Models for Codesign
MEMOCODE'09 Proceedings of the 7th IEEE/ACM international conference on Formal Methods and Models for Codesign
Accurately evaluating application performance in simulated hybrid multi-tasking systems
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Micro-pages: increasing DRAM efficiency with locality-aware data placement
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Specifying and dynamically verifying address translation-aware memory consistency
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Virtualized and flexible ECC for main memory
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Network interfaces for programmable NICs and multicore platforms
Computer Networks: The International Journal of Computer and Telecommunications Networking
A scalable organization for distributed directories
Journal of Systems Architecture: the EUROMICRO Journal
Software—Practice & Experience
Sunflower: full-system, embedded, microarchitecture evaluation
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Improving chip multiprocessor reliability through code replication
Computers and Electrical Engineering
Configurable virtual platform environment using SID simulator and Eclipse
SEUS'07 Proceedings of the 5th IFIP WG 10.2 international conference on Software technologies for embedded and ubiquitous systems
On-chip communication and synchronization mechanisms with cache-integrated network interfaces
Proceedings of the 7th ACM international conference on Computing frontiers
Exploit temporal locality of shared data in SRC enabled CMP
NPC'07 Proceedings of the 2007 IFIP international conference on Network and parallel computing
NPC'07 Proceedings of the 2007 IFIP international conference on Network and parallel computing
Directory-based conflict detection in hardware transactional memory
HiPC'08 Proceedings of the 15th international conference on High performance computing
Fault-tolerant cache coherence protocols for CMPs: evaluation and trade-offs
HiPC'08 Proceedings of the 15th international conference on High performance computing
LRU-PEA: a smart replacement policy for non-uniform cache architectures on chip multiprocessors
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
A session key caching and prefetching scheme for secure communication in cluster systems
Journal of Parallel and Distributed Computing
Cache topology aware computation mapping for multicores
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
μπ: a scalable and transparent system for simulating MPI programs
Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques
ViPER: a lightweight approach to the simulation of distributed and embedded software
Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques
Data cache-energy and throughput models: design exploration for embedded processors
EURASIP Journal on Embedded Systems - Special issue on design and architectures for signal and image processing
The auction: optimizing banks usage in Non-Uniform Cache Architectures
Proceedings of the 24th ACM International Conference on Supercomputing
WiDGET: Wisconsin decoupled grid execution tiles
Proceedings of the 37th annual international symposium on Computer architecture
Forwardflow: a scalable core for power-constrained CMPs
Proceedings of the 37th annual international symposium on Computer architecture
Timetraveler: exploiting acyclic races for optimizing memory race recording
Proceedings of the 37th annual international symposium on Computer architecture
A case for FAME: FPGA architecture model execution
Proceedings of the 37th annual international symposium on Computer architecture
Proceedings of the 37th annual international symposium on Computer architecture
Ultra Fine-Grained Run-Time Power Gating of On-chip Routers for CMPs
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Performance Evaluation of a Multicore System with Optically Connected Memory Modules
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Cost-driven 3D integration with interconnect layers
Proceedings of the 47th Design Automation Conference
Virtual channels vs. multiple physical networks: a comparative analysis
Proceedings of the 47th Design Automation Conference
Automated modeling and emulation of interconnect designs for many-core chip multiprocessors
Proceedings of the 47th Design Automation Conference
Proceedings of the 47th Design Automation Conference
RAMP gold: an FPGA-based architecture simulator for multiprocessors
Proceedings of the 47th Design Automation Conference
Token tenure and PATCH: A predictive/adaptive token-counting hybrid
ACM Transactions on Architecture and Code Optimization (TACO)
A practical way to extend shared memory support beyond a motherboard at low cost
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Subspace snooping: filtering snoops with operating system support
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Proximity coherence for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
SPACE: sharing pattern-based directory coherence for multicore scalability
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
AKULA: a toolset for experimenting and developing thread placement algorithms on multicore systems
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Handling the problems and opportunities posed by multiple on-chip memory controllers
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
SWEL: hardware cache coherence protocols to map shared data onto shared caches
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Compiler-assisted data distribution for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A new TCB cache to efficiently manage TCP sessions for web servers
Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Group-caching for NoC based multicore cache coherent systems
Proceedings of the Conference on Design, Automation and Test in Europe
Adaptive prefetching for shared cache based chip multiprocessors
Proceedings of the Conference on Design, Automation and Test in Europe
High-level design and validation of the BlueSPARC multithreaded processor
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special section on the ACM IEEE international conference on formal methods and models for codesign (MEMOCODE) 2009
Quality of service shared cache management in chip multiprocessor architecture
ACM Transactions on Architecture and Code Optimization (TACO)
Understanding the behavior and implications of context switch misses
ACM Transactions on Architecture and Code Optimization (TACO)
An adaptive cache coherence protocol for chip multiprocessors
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
LV*: a class of lazy versioning HTMs for low-cost integration of transactional memory systems
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
IDEAL'10 Proceedings of the 11th international conference on Intelligent data engineering and automated learning
On-Chip Network Evaluation Framework
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
CPM in CMPs: Coordinated Power Management in Chip-Multiprocessors
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Power-efficient spilling techniques for chip multiprocessors
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
TPCC-UVa: an open-source TPC-C implementation for parallel and distributed systems
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Enhancing L2 organization for CMPs with a center cell
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Coterminous locality and coterminous group data prefetching on chip-multiprocessors
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Optimizing power and performance for reliable on-chip networks
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Slack redistribution for graceful degradation under voltage overscaling
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Architectural support for thread communications in multi-core processors
Parallel Computing
Efficient dynamic program monitoring on multi-core systems
Journal of Systems Architecture: the EUROMICRO Journal
An analytical network performance model for SIMD processor CSX600 interconnects
Journal of Systems Architecture: the EUROMICRO Journal
Thread criticality support in on-chip networks
Proceedings of the Third International Workshop on Network on Chip Architectures
A variable-pipeline on-chip router optimized to traffic pattern
Proceedings of the Third International Workshop on Network on Chip Architectures
Process scheduling for future multicore processors
Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip
A power-efficient network on-chip topology
Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip
Tolerating Concurrency Bugs Using Transactions as Lifeguards
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Architectural Support for Fair Reader-Writer Locking
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Synergistic TLBs for High Performance Address Translation in Chip Multiprocessors
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Elastic Refresh: Techniques to Mitigate Refresh Penalties in High Density Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Pseudo-Circuit: Accelerating Communication for On-Chip Interconnection Networks
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive Flow Control for Robust Performance and Energy
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Fractal Coherence: Scalably Verifiable Cache Coherence
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Hardware Support for Relaxed Concurrency Control in Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
A Dynamically Adaptable Hardware Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
COREMU: a scalable and portable parallel full-system emulator
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Fast modeling of shared caches in multicore systems
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Virtualizing network-on-chip resources in chip-multiprocessors
Microprocessors & Microsystems
Location cache design and performance analysis for chip multiprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Efficient processor support for DRFx, a memory model with exceptions
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
RAFT: A router architecture with frequency tuning for on-chip networks
Journal of Parallel and Distributed Computing
Emulation-based transient thermal modeling of 2D/3D systems-on-chip with active cooling
Microelectronics Journal
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Exploiting dynamic micro-architecture usage in gate sizing
Microprocessors & Microsystems
Efficient partial roll-backing mechanism for transactional memory systems
Transactions on high-performance embedded architectures and compilers III
CReAMS: an embedded multiprocessor platform
ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications
Towards an adaptable multiple-ISA reconfigurable processor
ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications
Journal of Parallel and Distributed Computing
International Journal of Reconfigurable Computing - Special issue on selected papers from the 17th reconfigurable architectures workshop (RAW2010)
International Journal of Parallel Programming
A case for an SC-preserving compiler
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
METE: meeting end-to-end QoS in multicores through system-wide resource management
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Studying the impact of hardware prefetching and bandwidth partitioning in chip-multiprocessors
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Transactional conflict decoupling and value prediction
Proceedings of the international conference on Supercomputing
Multiset signatures for transactional memory
Proceedings of the international conference on Supercomputing
ZEBRA: a data-centric, hybrid-policy hardware transactional memory design
Proceedings of the international conference on Supercomputing
SecureME: a hardware-software approach to full system security
Proceedings of the international conference on Supercomputing
Predictive coordination of multiple on-chip resources for chip multiprocessors
Proceedings of the international conference on Supercomputing
Karma: scalable deterministic record-replay
Proceedings of the international conference on Supercomputing
A vertical bubble flow network using inductive-coupling for 3-D CMPs
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
A distributed and topology-agnostic approach for on-line NoC testing
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Inferring packet dependencies to improve trace based simulation of on-chip networks
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Delay analysis of wormhole based heterogeneous NoC
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Increasing the effectiveness of directory caches by deactivating coherence for private memory blocks
Proceedings of the 38th annual international symposium on Computer architecture
i-NVMM: a secure non-volatile main memory system with incremental encryption
Proceedings of the 38th annual international symposium on Computer architecture
An abacus turn model for time/space-efficient reconfigurable routing
Proceedings of the 38th annual international symposium on Computer architecture
METE: meeting end-to-end QoS in multicores through system-wide resource management
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
Studying the impact of hardware prefetching and bandwidth partitioning in chip-multiprocessors
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
PADS '10 Proceedings of the 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation
BarrierWatch: characterizing multithreaded workloads across and within program-defined epochs
Proceedings of the 8th ACM International Conference on Computing Frontiers
An energy-efficient adaptive hybrid cache
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Improving energy efficiency of multi-threaded applications using heterogeneous CMOS-TFET multicores
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
NoC frequency scaling with flexible-pipeline routers
Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design
Dynamic access distance driven cache replacement
ACM Transactions on Architecture and Code Optimization (TACO)
Evaluating placement policies for managing capacity sharing in CMP architectures with private caches
ACM Transactions on Architecture and Code Optimization (TACO)
A high-parallelism distributed scheduling mechanism for multi-core instruction-set simulation
Proceedings of the 48th Design Automation Conference
An energy-efficient heterogeneous CMP based on hybrid TFET-CMOS cores
Proceedings of the 48th Design Automation Conference
A helper thread based dynamic cache partitioning scheme for multithreaded applications
Proceedings of the 48th Design Automation Conference
A reuse-aware prefetching scheme for scratchpad memory
Proceedings of the 48th Design Automation Conference
Evaluation of low-overhead organizations for the directory in future many-core CMPs
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Multilayer cache partitioning for multiprogram workloads
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Filtering directory lookups in CMPs with write-through caches
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Bandwidth constrained coordinated HW/SW prefetching for multicores
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Unified locality-sensitive signatures for transactional memory
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Optimal memory controller placement for chip multiprocessor
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Hardware performance monitoring for the rest of us: a position and survey
NPC'11 Proceedings of the 8th IFIP international conference on Network and parallel computing
HPC-Mesh: A Homogeneous Parallel Concentrated Mesh for Fault-Tolerance and Energy Savings
Proceedings of the 2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems
A study of 3D Network-on-Chip design for data parallel H.264 coding
Microprocessors & Microsystems
Compiler support for concurrency synchronization
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part I
A minimal average accessing time scheduler for multicore processors
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part II
Filtering directory lookups in CMPs
Microprocessors & Microsystems
Reconfigurable interconnects in DSM systems: a focus on context switch behavior
ISPA'06 Proceedings of the 2006 international conference on Frontiers of High Performance Computing and Networking
ABS: A low-cost adaptive controller for prefetching in a banked shared last-level cache
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Hardware transactional memory with software-defined conflicts
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
On the simulation of large-scale architectures using multiple application abstraction levels
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
ReNIC: Architectural extension to SR-IOV I/O virtualization for efficient replication
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
The migration prefetcher: Anticipating data promotion in dynamic NUCA caches
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Hardware budget and runtime system for data-driven multithreaded chip multiprocessor
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
The Java Virtual Machine in retargetable, high-performance instruction set simulation
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java
Switch-based packing technique to reduce traffic and latency in token coherence
Journal of Parallel and Distributed Computing
Bandwidth-aware reconfigurable cache design with hybrid memory technologies
Proceedings of the International Conference on Computer-Aided Design
Feedback control based cache reliability enhancement for emerging multicores
Proceedings of the International Conference on Computer-Aided Design
Improving shared cache behavior of multithreaded object-oriented applications in multicores
Proceedings of the International Conference on Computer-Aided Design
Considering network context for efficient simulation of highly parallel network processors
ICCNMC'05 Proceedings of the Third international conference on Networking and Mobile Computing
INSEE: an interconnection network simulation and evaluation environment
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Analyzing advanced PDE solvers through simulation
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Simulation-based analysis of parallel Runge-Kutta solvers
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Efficient memory management of a hierarchical and a hybrid main memory for MN-MATE platform
Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
RAPANUI: rapid prototyping for media processor architecture exploration
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Ultra fast cycle-accurate compiled emulation of inorder pipelined architectures
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Hardware support for OpenMP collective operations
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Minimalist open-page: a DRAM page-mode scheduling policy for the many-core era
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
FeatherWeight: low-cost optical arbitration with QoS support
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
A data layout optimization framework for NUCA-based multicores
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Full system simulation of many-core heterogeneous SoCs using GPU and QEMU semihosting
Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units
An Extended SystemC Framework for Efficient HW/SW Co-Simulation
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Performance/Thermal-Aware Design of 3D-Stacked L2 Caches for CMPs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
A universal parallel front-end for execution driven microarchitecture simulation
Proceedings of the 2012 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools
Communication based proactive link power management
Transactions on High-Performance Embedded Architectures and Compilers IV
Improving performance by reducing aborts in hardware transactional memory
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Efficient transaction nesting in hardware transactional memory
ARCS'10 Proceedings of the 23rd international conference on Architecture of Computing Systems
JetBench: an open source real-time multiprocessor benchmark
ARCS'10 Proceedings of the 23rd international conference on Architecture of Computing Systems
Is reuse distance applicable to data locality analysis on chip multiprocessors?
CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
Extending a multicore multithread simulator to model power-aware hard real-time systems
ICA3PP'10 Proceedings of the 10th international conference on Algorithms and Architectures for Parallel Processing - Volume Part II
First steps towards the certification of an ARM simulator using CompCert
CPP'11 Proceedings of the First international conference on Certified Programs and Proofs
Improving server performance on multi-cores via selective off-loading of OS functionality
ISCA'10 Proceedings of the 2010 international conference on Computer Architecture
New memory organizations for 3D DRAM and PCMs
ARCS'12 Proceedings of the 25th international conference on Architecture of Computing Systems
Neighborhood-aware data locality optimization for NoC-based multicores
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Reliability-aware platform optimization for 3D chip multi-processors
The Journal of Supercomputing
Network-on-Chip virtualization in Chip-Multiprocessor Systems
Journal of Systems Architecture: the EUROMICRO Journal
On the interfacing between QEMU and SystemC for virtual platform construction: Using DMA as a case
Journal of Systems Architecture: the EUROMICRO Journal
Reliability-aware core partitioning in chip multiprocessors
Journal of Systems Architecture: the EUROMICRO Journal
A dual-phase compression mechanism for hybrid DRAM/PCM main memory architectures
Proceedings of the great lakes symposium on VLSI
SnCTM: reducing false transaction aborts by adaptively changing the source of conflict detection
Proceedings of the 9th conference on Computing Frontiers
Reuse distance based performance modeling and workload mapping
Proceedings of the 9th conference on Computing Frontiers
Transformer: a functional-driven cycle-accurate multicore simulator
Proceedings of the 49th Annual Design Automation Conference
Courteous cache sharing: being nice to others in capacity management
Proceedings of the 49th Annual Design Automation Conference
Architecture support for accelerator-rich CMPs
Proceedings of the 49th Annual Design Automation Conference
Exploration of heuristic scheduling algorithms for 3D multicore processors
Proceedings of the 15th International Workshop on Software and Compilers for Embedded Systems
Fast architecture evaluation of heterogeneous MPSoCs by host-compiled simulation
Proceedings of the 15th International Workshop on Software and Compilers for Embedded Systems
A greedy heuristic approximation scheduling algorithm for 3D multicore processors
Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing
Study of hierarchical n-body methods for network-on-chip architectures
Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing - Volume 2
ASCIB: adaptive selection of cache indexing bits for removing conflict misses
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Energy-efficient non-minimal path on-chip interconnection network for heterogeneous systems
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Power-aware performance increase via core/uncore reinforcement control for chip-multiprocessors
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
A software approach for combating asymmetries of non-volatile memories
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
BiN: a buffer-in-NUCA scheme for accelerator-rich CMPs
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Static and dynamic co-optimizations for blocks mapping in hybrid caches
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Designing for dark silicon: a methodological perspective on energy efficient systems
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
CHARM: a composable heterogeneous accelerator-rich microprocessor
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Something old and something new: P-states can borrow microarchitecture techniques too
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Revisiting hardware-assisted page walks for virtualized systems
Proceedings of the 39th Annual International Symposium on Computer Architecture
A new degree of freedom for memory allocation in clusters
Cluster Computing
Instant Multiunit Resource Hardware Deadlock Detection Scheme for System-on-Chips
ACM Transactions on Embedded Computing Systems (TECS)
MCEmu: A Framework for Software Development and Performance Analysis of Multicore Systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Thread vulnerability in parallel applications
Journal of Parallel and Distributed Computing
TACHYON: tandem execution for efficient live patch testing
Security'12 Proceedings of the 21st USENIX conference on Security symposium
Design and implementation of a general-purpose MCU simulation software
ICIC'12 Proceedings of the 8th international conference on Intelligent Computing Theories and Applications
Analyzing performance and power efficiency of network processing over 10 GbE
Journal of Parallel and Distributed Computing
Efficient implementation of globally-aware network flow control
Journal of Parallel and Distributed Computing
PEPON: performance-aware hierarchical power budgeting for NoC based multicores
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
APCR: an adaptive physical channel regulator for on-chip interconnects
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Practically private: enabling high performance CMPs through compiler-assisted data classification
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Complexity-effective multicore coherence
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Multi2Sim: a simulation framework for CPU-GPU computing
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Base-delta-immediate compression: practical data compression for on-chip caches
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Acceleration of bulk memory operations in a heterogeneous multicore architecture
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
When less is more (LIMO): controlled parallelism for improved efficiency
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
A real-time, energy-efficient system software suite for heterogeneous multicore platforms
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
A novel NoC-based design for fault-tolerance of last-level caches in CMPs
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Performance enhancement under power constraints using heterogeneous CMOS-TFET multicores
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Starvation-free transactional memory-system protocols
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Implicit transactional memory in kilo-instruction multiprocessors
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Exploiting reuse locality on inclusive shared last-level caches
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
An integrated pseudo-associativity and relaxed-order approach to hardware transactional memory
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Delta-compressed caching for overcoming the write bandwidth limitation of hybrid main memory
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
SCIN-cache: Fast speculative versioning in multithreaded cores
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Stream arbitration: Towards efficient bandwidth utilization for emerging on-chip interconnects
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
A high-efficiency low-cost heterogeneous 3D network-on-chip design
Proceedings of the Fifth International Workshop on Network on Chip Architectures
CRAW/P: a workload partition method for the efficient parallel simulation of manycores
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Bus and memory protection through chain-generated and tree-verified IV for multiprocessors systems
Future Generation Computer Systems
Simsys: a performance simulation framework
Proceedings of the 2013 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools
A distributed timing synchronization technique for parallel multi-core instruction-set simulation
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
Heracles: a tool for fast RTL-based design space exploration of multicore processors
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Co-simulation framework of SystemC SoC virtual prototype and custom logic (abstract only)
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Energy efficient caching for phase-change memory
MedAlg'12 Proceedings of the First Mediterranean conference on Design and Analysis of Algorithms
Checkpointing SystemC-Based Virtual Platforms
International Journal of Embedded and Real-Time Communication Systems
Towards a multiple-ISA embedded system
Journal of Systems Architecture: the EUROMICRO Journal
A multi-core memory organization for 3-d DRAM as main memory
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
DeNovoND: efficient hardware support for disciplined non-determinism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Cyrus: unintrusive application-level record-replay for replay parallelism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Wait-n-GoTM: improving HTM performance by serializing cyclic dependencies
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
NoRD: Node-Router Decoupling for Effective Power-gating of On-Chip Routers
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Addressing End-to-End Memory Access Latency in NoC-Based Multicores
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Predicting Coherence Communication by Tracking Synchronization Points at Run Time
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Replacement techniques for dynamic NUCA cache designs on CMPs
The Journal of Supercomputing
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
Testing large-scale cloud management
IBM Journal of Research and Development
Proceedings of the ACM International Conference on Computing Frontiers
Boosting instruction set simulator performance with parallel block optimisation and replacement
ACSC '12 Proceedings of the Thirty-fifth Australasian Computer Science Conference - Volume 122
Extracting useful computation from error-prone processors for streaming applications
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the Conference on Design, Automation and Test in Europe
Cache coherence enabled adaptive refresh for volatile STT-RAM
Proceedings of the Conference on Design, Automation and Test in Europe
Modeling and analysis of fault-tolerant distributed memories for networks-on-chip
Proceedings of the Conference on Design, Automation and Test in Europe
Bit mapping for balanced PCM cell programming
Proceedings of the 40th Annual International Symposium on Computer Architecture
A new perspective for efficient virtual-cache coherence
Proceedings of the 40th Annual International Symposium on Computer Architecture
Protozoa: adaptive granularity cache coherence
Proceedings of the 40th Annual International Symposium on Computer Architecture
Proceedings of the 40th Annual International Symposium on Computer Architecture
A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures
A flexible simulation framework for multicore schedulers: work in progress paper
Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Efficiently tolerating timing violations in pipelined microprocessors
Proceedings of the 50th Annual Design Automation Conference
Checkpointing for virtual platforms and SystemC-TLM
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Architecturally homogeneous power-performance heterogeneous multicore systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Exploring the vulnerability of CMPs to soft errors with 3D stacked nonvolatile memory
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Dynamically reconfigurable hybrid cache: an energy-efficient last-level cache design
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Performance-reliability tradeoff analysis for multithreaded applications
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
International Journal of High Performance Computing Applications
MobiSIM: a simulation library for resource prediction of smartphones and wireless sensor networks
Proceedings of the 46th Annual Simulation Symposium
Ordering circuit establishment in multiplane NoCs
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
Optimal placement of vertical connections in 3D Network-on-Chip
Journal of Systems Architecture: the EUROMICRO Journal
Leveraging bandwidth improvements to web servers through enhanced network interfaces
The Journal of Supercomputing
OASIS: on achieving a sanctuary for integrity and secrecy on untrusted platforms
Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security
MMSoC: a multi-layer multi-core storage-on-chip design for systems with high integration
Proceedings of the 14th International Conference on Computer Systems and Technologies
Optimized multicore architectures for data parallel fast Fourier transform
Proceedings of the 14th International Conference on Computer Systems and Technologies
A methodology for testing CPU emulators
ACM Transactions on Software Engineering and Methodology (TOSEM) - Testing, debugging, and error handling, formal methods, lifecycle concerns, evolution and maintenance
Meeting midway: improving CMP performance with memory-side prefetching
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
McRouter: multicast within a router for high performance network-on-chips
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Low-energy volatile STT-RAM cache design using cache-coherence-enabled adaptive refresh
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Direct code execution: revisiting library OS architecture for reproducible network experiments
Proceedings of the ninth ACM conference on Emerging networking experiments and technologies
Linearly compressed pages: a low-complexity, low-latency main memory compression framework
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
The reuse cache: downsizing the shared last-level cache
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Multi-grain coherence directories
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
A circuit-architecture co-optimization framework for exploring nonvolatile memory hierarchies
ACM Transactions on Architecture and Code Optimization (TACO)
Modeling the impact of permanent faults in caches
ACM Transactions on Architecture and Code Optimization (TACO)
High-performance fractal coherence
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
C1C: A configurable, compiler-guided STT-RAM L1 cache
ACM Transactions on Architecture and Code Optimization (TACO)
Palirria: Accurate On-line Parallelism Estimation for Adaptive Work-Stealing
Proceedings of Programming Models and Applications on Multicores and Manycores
Thread-criticality aware dynamic cache reconfiguration in multi-core system
Proceedings of the International Conference on Computer-Aided Design
System performance evaluation by combining RTC and VHDL simulation: A case study on NICs
Journal of Systems Architecture: the EUROMICRO Journal
Efficient execution of speculative threads and transactions with hardware transactional memory
Future Generation Computer Systems
Dual partitioning multicasting for high-performance on-chip networks
Journal of Parallel and Distributed Computing
Performance and power profiling for emulated Android systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
PAIS: Parallelism-aware interconnect scheduling in multicores
ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers
NoC-based fault-tolerant cache design in chip multiprocessors
ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers
System-level impacts of persistent main memory using a search engine
Microelectronics Journal
Dynamic thread mapping of shared memory applications by exploiting cache coherence protocols
Journal of Parallel and Distributed Computing
Supporting faulty banks in NUCA by NoC assisted remapping mechanisms
The Journal of Supercomputing
HMTT: A hybrid hardware/software tracing system for bridging the DRAM access trace's semantic gap
ACM Transactions on Architecture and Code Optimization (TACO)
Removal of Conflicts in Hardware Transactional Memory Systems
International Journal of Parallel Programming
Energy and throughput aware fuzzy logic based reconfiguration for MPSoCs
Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology
Bandwidth Adaptive Cache Coherence Optimizations for Chip Multiprocessors
International Journal of Parallel Programming
Full system simulation seeks to strike a balance between accuracy and performance. Many of its possibilities have been obvious to practitioners in both academia and industry for quite some time, perhaps decades, but Simics supports more of these possibilities within a single framework than other tools do.

Simics is a platform for full system simulation that can run actual firmware and completely unmodified kernel and driver code. It is sufficiently abstract to achieve tolerable performance levels, and it provides both functional accuracy for running commercial workloads and sufficient timing accuracy to interface to detailed hardware models. Simics can also run a heterogeneous network of systems from different vendors within the same framework. It is exceptionally fast, and it makes it easy to add new components and reuse older ones at a practical abstraction level. It offers a platform with a rich API and a powerful scripting environment for use in a broad range of applications.