Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The program structure tree: computing control regions in linear time
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Advanced compiler design and implementation
Advanced compiler design and implementation
Removing unnecessary synchronization in Java
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Ultra-fast aliasing analysis using CLA: a million lines of C code in a second
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Pointer and escape analysis for multithreaded programs
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Dependence Analysis
Static Analysis of Barrier Synchronization in Explicitly Parallel Programs
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Associating synchronization constraints with data in an object-oriented language
Conference record of the 33rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Effective static race detection for Java
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Component-Based Lock Allocation
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Colorama: Architectural Support for Data-Centric Synchronization
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
FastTrack: efficient and precise dynamic race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Stretching transactional memory
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Abstraction-guided synthesis of synchronization
Proceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
AtomTracker: A Comprehensive Approach to Atomic Region Inference and Violation Detection
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Lock inference in the presence of large libraries
ECOOP'12 Proceedings of the 26th European conference on Object-Oriented Programming
Hi-index | 0.00 |
There is a huge body of sequential legacy code that needs to be refactored for multicore processors. Especially for control code for embedded systems it is often easy to split the program into multiple threads. But it is difficult to identify critical sections to avoid data races as the legacy code hides its synchronization in a static schedule, priorities and interrupts. To ease refactoring, this paper presents a new static data-dependence analysis that identifies necessary critical sections in thread-parallel code that does not yet contain any synchronization between threads. A novel optimization pass then breaks up and shrinks the identified critical sections to maximize parallelism while preserving correctness. Our technique proved to be successful in refactoring sequential assembly-like legacy codes in an industry-sponsored project. But as refactoring projects are hard to evaluate quantitatively and as the domain specific low-level language is of limited interest, we use a standard benchmark suite for which the optimum, i.e., the minimal set of the necessary atomic block annotations is known. We removed the annotations and let the compiler attempt to rediscover them. For 5 out of 7 benchmarks, our compiler identified the same critical sections as the original programmers did by hand. For the other two benchmarks, the compiler found slightly larger (but also correct) critical sections. In all cases, the versions of the benchmarks that the compiler annotated achieved the original run-time performance.