Many researchers have developed applications using transactional memory (TM) in order to benchmark different implementations and to study whether TM is easy to use. However, comparatively little has been done to provide general-purpose tools for profiling and tuning programs that use transactions. In this paper we introduce a series of profiling techniques for TM applications that provide in-depth and comprehensive information about the wasted work caused by aborting transactions. We explore three directions: (i) techniques to identify multiple potential conflicts from a single program run, (ii) techniques to identify the data structures involved in conflicts by using a symbolic path through the heap, rather than a machine address, and (iii) visualization techniques to summarize how threads spend their time and which of their transactions conflict most frequently. To examine the effectiveness of the profiling techniques, we provide a series of illustrations from the STAMP TM benchmark suite and from the synthetic WormBench workload. We show how to use our profiling techniques to optimize the performance of the Bayes, Labyrinth and Intruder applications. We discuss the design and implementation of our techniques in the Bartok-STM system. Where possible, we process data offline or during garbage collection in order to minimize the probe effect introduced by profiling.
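To make the notion of wasted work concrete, the following is a minimal Python sketch of how a profiler's output might be summarized offline. It is not the Bartok-STM implementation: the `TxRecord` layout, the `summarize` function, the cycle counts, and the symbolic path `root.left.count` are all hypothetical and chosen only for illustration. The sketch assumes each transaction attempt has been logged with its atomic block, its duration, whether it committed, and, for aborts, the conflicting atomic block and a symbolic heap path to the contended object.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TxRecord:
    """One transaction attempt, as a hypothetical profiler might log it."""
    thread: int
    atomic_block: str                     # source-level label of the atomic block
    cycles: int                           # time spent in this attempt
    committed: bool
    conflict_path: Optional[str] = None   # symbolic heap path of the contended object
    aborted_by: Optional[str] = None      # atomic block that won the conflict


def summarize(log):
    """Return (wasted-work fraction per atomic block, conflict counts)."""
    useful = defaultdict(int)
    wasted = defaultdict(int)
    conflicts = Counter()
    for rec in log:
        if rec.committed:
            useful[rec.atomic_block] += rec.cycles
        else:
            # Work done by an aborted attempt is discarded, so it counts as wasted.
            wasted[rec.atomic_block] += rec.cycles
            conflicts[(rec.atomic_block, rec.aborted_by, rec.conflict_path)] += 1
    report = {}
    for block in set(useful) | set(wasted):
        total = useful[block] + wasted[block]
        report[block] = wasted[block] / total if total else 0.0
    return report, conflicts


# Made-up log: two aborted "insert" attempts conflict with "remove" on one field.
LOG = [
    TxRecord(0, "insert", 100, False, "root.left.count", "remove"),
    TxRecord(0, "insert", 100, True),
    TxRecord(1, "remove", 50, True),
    TxRecord(1, "insert", 300, False, "root.left.count", "remove"),
    TxRecord(1, "insert", 100, True),
]

report, conflicts = summarize(LOG)
for block, fraction in sorted(report.items()):
    print(f"{block}: {fraction:.0%} of transactional time wasted")
for (victim, winner, path), count in conflicts.most_common():
    print(f"{victim} aborted by {winner} on {path}: {count} time(s)")
```

Keying the conflict counts on a symbolic path rather than a machine address means that repeated conflicts on the same logical field are grouped together even if the underlying object is moved or reallocated between attempts.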