Detection of false sharing using machine learning

Authors:
Sanath Jayasena;Saman Amarasinghe;Asanka Abeyweera;Gayashan Amarasinghe;Himeshi De Silva;Sunimal Rathnayake;Xiaoqiao Meng;Yanbin Liu
Affiliations:
University of Moratuwa, Sri Lanka;Massachusetts Institute of Technology, Cambridge;University of Moratuwa, Sri Lanka;University of Moratuwa, Sri Lanka;University of Moratuwa, Sri Lanka;University of Moratuwa, Sri Lanka;IBM Research, Yorktown Heights, New York;IBM Research, Yorktown Heights, New York
Venue:
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Year:
2013

Abstract

False sharing is a major class of performance bugs in parallel applications. Detecting false sharing is difficult because it does not change program semantics. We introduce an efficient and effective approach for detecting false sharing based on machine learning. We develop a set of mini-programs in which false sharing can be turned on and off. We then run the mini-programs both with and without false sharing, collect a set of hardware performance event counts, and use the collected data to train a classifier. The trained classifier can then analyze data from arbitrary programs to detect false sharing. Experiments with the PARSEC and Phoenix benchmarks show that our approach is effective: we detect published false sharing regions in the benchmarks with zero false positives, and our performance penalty is less than 2%. Thus, we believe that this is a practical method for detecting false sharing.
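
The training step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two event names, the normalized count values (invented toy numbers, not measurements), and the one-level decision tree that stands in for the classifier, which the abstract does not name. The sketch only shows the shape of the workflow: labeled runs of mini-programs go in, a trained rule comes out, and that rule is applied to counts from a program that was not part of training.

```python
from typing import List, Tuple

# Hypothetical event names; the abstract does not list the events used.
FEATURES = ["cache_line_invalidations", "l2_misses"]

# Invented toy data: one row per mini-program run, holding event counts
# normalized per instruction, plus a label saying whether false sharing
# was switched on for that run.
TRAINING_RUNS: List[Tuple[List[float], bool]] = [
    ([0.0004, 0.0020], False),
    ([0.0006, 0.0150], False),
    ([0.0005, 0.0031], False),
    ([0.0210, 0.0040], True),
    ([0.0180, 0.0160], True),
    ([0.0250, 0.0028], True),
]


def train_stump(runs: List[Tuple[List[float], bool]]) -> Tuple[int, float]:
    """Pick the (feature index, threshold) pair with the fewest training errors.

    The resulting rule predicts false sharing when the chosen feature's
    value is greater than the threshold.
    """
    best = None
    for f in range(len(runs[0][0])):
        values = sorted({x[f] for x, _ in runs})
        # Candidate thresholds are midpoints between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((x[f] > t) != label for x, label in runs)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return f, t


def classify(stump: Tuple[int, float], counts: List[float]) -> bool:
    """Apply the trained rule to the event counts of one program run."""
    f, t = stump
    return counts[f] > t


stump = train_stump(TRAINING_RUNS)
unseen_run = [0.0195, 0.0050]  # invented counts for a program not in the training set
print(FEATURES[stump[0]], classify(stump, unseen_run))
```

On this toy data the rule splits on the first feature alone, because the second feature does not separate the two labels. A real setup would collect many more events and runs and would likely use a full decision-tree learner rather than a single split.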