Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Coherent network interfaces for fine-grain communication
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Eraser: a dynamic data race detector for multithreaded programs
ACM Transactions on Computer Systems (TOCS)
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
A static analyzer for finding dynamic programming errors
Software—Practice & Experience
Dynamically Discovering Likely Program Invariants to Support Program Evolution
IEEE Transactions on Software Engineering - Special issue on 1999 international conference on software engineering
Extended static checking for Java
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
DISE: a programmable macro engine for customizing applications
Proceedings of the 30th annual international symposium on Computer architecture
Secure program execution via dynamic information flow tracking
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Minos: Control Data Attack Prevention Orthogonal to Memory Model
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Efficient and flexible architectural support for dynamic monitoring
ACM Transactions on Architecture and Code Optimization (TACO)
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging
Proceedings of the 32nd annual international symposium on Computer Architecture
Efficient, transparent, and comprehensive runtime code manipulation
Efficient, transparent, and comprehensive runtime code manipulation
HeapMon: a helper-thread approach to programmable, automatic, and low-overhead memory bug detection
IBM Journal of Research and Development
A regulated transitive reduction (RTR) for longer memory race recording
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Recording shared memory dependencies using strata
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Log-based architectures for general-purpose monitoring of deployed code
Proceedings of the 1st workshop on Architectural and system support for improving software dependability
LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Raksha: a flexible information flow architecture for software security
Proceedings of the 34th annual international symposium on Computer architecture
Valgrind: a framework for heavyweight dynamic binary instrumentation
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Checking system rules using system-specific, programmer-written compiler extensions
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
How to shadow every byte of memory used by a program
Proceedings of the 3rd international conference on Virtual execution environments
Replay debugging for distributed applications
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
MemTracker: Efficient and Programmable Support for Memory Access Monitoring and Debugging
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
HARD: Hardware-Assisted Lockset-based Race Detection
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Parallelizing security checks on commodity hardware
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Parallelizing dynamic information flow tracking
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Flexible Hardware Acceleration for Instruction-Grain Program Monitoring
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Architectural support for shadow memory in multiprocessors
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Ordering decoupled metadata accesses in multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Architecting a chunk-based memory race recorder in modern CMPs
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Butterfly analysis: adapting dataflow analysis to dynamic parallel monitoring
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Butterfly analysis: adapting dataflow analysis to dynamic parallel monitoring
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Decoupled lifeguards: enabling path optimizations for dynamic correctness checking tools
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Log-based architectures: using multicore to help software behave correctly
ACM SIGOPS Operating Systems Review
Improving software diagnosability via log enhancement
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
DoublePlay: parallelizing sequential logging and replay
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
DoublePlay: Parallelizing Sequential Logging and Replay
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Improving Software Diagnosability via Log Enhancement
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
SuperCoP: a general, correct, and performance-efficient supervised memory system
Proceedings of the 9th conference on Computing Frontiers
Understanding the interleaving-space overlap across inputs and software versions
HotPar'12 Proceedings of the 4th USENIX conference on Hot Topics in Parallelism
MSEPT'12 Proceedings of the 2012 international conference on Multicore Software Engineering, Performance, and Tools
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Parallelizing data race detection
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
When is multi-version checkpointing needed?
Proceedings of the 3rd Workshop on Fault-tolerance for HPC at extreme scale
Micro-architectural support for metadata coherence in multi-core dynamic information flow tracking
Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy
Efficient concurrency-bug detection across inputs
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
Guardrail: a high fidelity approach to protecting hardware devices from buggy drivers
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Instruction-grain lifeguards monitor the events of a running application at the level of individual instructions in order to identify and help mitigate application bugs and security exploits. Because such lifeguards impose a 10-100X slowdown on existing platforms, previous studies have proposed hardware designs to accelerate lifeguard processing. However, these accelerators are either tailored to a specific class of lifeguards or suitable only for monitoring singlethreaded programs. We present ParaLog, the first design of a system enabling fast online parallel monitoring of multithreaded parallel applications. ParaLog supports a broad class of software-defined lifeguards. We show how three existing accelerators can be enhanced to support online multithreaded monitoring, dramatically reducing lifeguard overheads. We identify and solve several challenges in monitoring parallel applications and/or parallelizing these accelerators, including (i) enforcing inter-thread data dependences, (ii) dealing with inter-thread effects that are not reflected in coherence traffic, (iii) dealing with unmonitored operating system activity, and (iv) ensuring lifeguards can access shared metadata with negligible synchronization overheads. We present our system design for both Sequentially Consistent and Total Store Ordering processors. We implement and evaluate our design on a 16 core simulated CMP, using benchmarks from SPLASH-2 and PARSEC and two lifeguards: a data-flow tracking lifeguard and a memory-access checker lifeguard. Our results show that (i) our parallel accelerators improve performance by 2-9X and 1.13-3.4X for our two lifeguards, respectively, (ii) we are 5-126X faster than the time-slicing approach required by existing techniques, and (iii) our average overheads for applications with eight threads are 51% and 28% for the two lifeguards, respectively.