Opportunities for concurrent dynamic analysis with explicit inter-core communication

Authors:
Jungwoo Ha;Stephen P. Crago
Affiliations:
University of Southern California, Los Angeles, CA and Information Sciences Institute East, Arlington, VA, USA;University of Southern California, Los Angeles, CA and Information Sciences Institute East, Arlington, VA, USA
Venue:
Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
Year:
2010

Citing 12
Cited 0

Adaptive optimization in the Jalapeño JVM

OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
A framework for reducing the cost of instrumented code

Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
iWatcher: Efficient Architectural Support for Software Debugging

Proceedings of the 31st annual international symposium on Computer architecture
Collecting and Exploiting High-Accuracy Call Graph Profiles in Virtual Machines

Proceedings of the international symposium on Code generation and optimization
Continuous Path and Edge Profiling

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
HeapMon: a helper-thread approach to programmable, automatic, and low-overhead memory bug detection

IBM Journal of Research and Development
AVIO: detecting atomicity violations via access interleaving invariants

Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Improved error reporting for software that uses black-box components

Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
The PARSEC benchmark suite: characterization and architectural implications

Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Laminar: practical fine-grained decentralized information flow control

Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
A concurrent dynamic analysis framework for multicore hardware

Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
PACER: proportional detection of data races

PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation

Quantified Score

Hi-index	0.00

Visualization

Abstract

Multicore is now the dominant processor trend, and the number of cores is rapidly increasing. The paradigm shift to multicore forces the redesign of the software stack, which includes dynamic analysis. Dynamic analyses provide rich features to software in various areas, such as debugging, testing, optimization, and security. However, these techniques often suffer from excessive overhead, which make it less practical. Previously, this overhead has been overcome by improved processor performance as each generation gets faster, but the performance requirements of dynamic analyses in the multicore era cannot be fulfilled without redesigning for parallelism. Scalable design of dynamic analysis is a challenging problem. Not only must the analysis itself must be parallel, but the analysis must also be decoupled from the application and run concurrently. A typical method of decoupling the analysis from the application is to send the analysis data from the application to the core that runs the analysis thread via buffering. However, buffering can perturb application cache performance, and the cache coherence protocol may not be efficient, or even implemented, with large numbers of cores in the future. This paper presents our initial effort to explore the hardware design space and software approach that will alleviate the scalability problem for dynamic analysis on multicore. We choose to make use of explicit inter-core communication that is already available in a real processor, the TILE64 processor and evaluate the opportunity for scalable dynamic analyses. We provide our model and implement concurrent call graph profiling as a case study. Our evaluation shows that pure communication overhead from the application point of view is as low as 1%. We expect that our work will help design scalable dynamic analyses and will influence the design of future many-core processors.