High Performance Computing (HPC) is a field concerned with solving large-scale problems in science and engineering. However, the computational infrastructure of HPC systems can also be misused, as demonstrated by the recent commoditization of cloud computing resources on the black market. As a first step toward addressing such misuse, we introduce a machine learning approach for classifying distributed parallel computations based on the communication patterns between compute nodes. We first provide relevant background on message passing and on computational equivalence classes called dwarfs, and describe our exploratory data analysis using self-organizing maps. We then present our classification results across 29 scientific codes using Bayesian networks and compare their performance against Random Forest classifiers. These models, trained on hundreds of gigabytes of communication logs collected at Lawrence Berkeley National Laboratory, perform well without any a priori information and address several shortcomings of previous approaches.
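To make the idea of classifying a computation from its communication pattern concrete, here is a minimal sketch in Python. It is not the method described above: the log format (source rank, destination rank, bytes), the rank-offset features, the two pattern labels, and the nearest-centroid classifier are all assumptions chosen for illustration. The nearest-centroid step stands in for the Bayesian network and Random Forest models.

```python
import math


def comm_matrix(log, n_ranks):
    """Sum the bytes sent from each source rank to each destination rank.

    `log` is a hypothetical list of (src, dst, nbytes) tuples.
    """
    matrix = [[0.0] * n_ranks for _ in range(n_ranks)]
    for src, dst, nbytes in log:
        matrix[src][dst] += nbytes
    return matrix


def offset_features(matrix):
    """Return the share of total traffic at each rank offset (dst - src) mod n."""
    n = len(matrix)
    feats = [0.0] * n
    total = 0.0
    for src in range(n):
        for dst in range(n):
            feats[(dst - src) % n] += matrix[src][dst]
            total += matrix[src][dst]
    return [f / total for f in feats] if total else feats


def nearest_centroid(train, sample):
    """Return the label whose mean feature vector is closest to `sample`.

    `train` maps a label to a list of feature vectors of equal length.
    """
    best_label, best_dist = None, math.inf
    for label, vectors in train.items():
        centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
        dist = math.dist(centroid, sample)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def ring_log(n, nbytes):
    """Synthetic log: each rank sends to its right-hand neighbour."""
    return [(r, (r + 1) % n, nbytes) for r in range(n)]


def all_to_all_log(n, nbytes):
    """Synthetic log: each rank sends to every other rank."""
    return [(s, d, nbytes) for s in range(n) for d in range(n) if s != d]


# Hypothetical labels; these are not the dwarf classes used in the paper.
N = 4
train = {
    "ring": [offset_features(comm_matrix(ring_log(N, 1024), N))],
    "all_to_all": [offset_features(comm_matrix(all_to_all_log(N, 1024), N))],
}
unknown = offset_features(comm_matrix(ring_log(N, 4096), N))
predicted = nearest_centroid(train, unknown)
```

Because the features are normalized shares of traffic, the unseen ring run with larger messages maps to the same feature vector as the training ring run. A classifier trained on real logs would need features that tolerate differing rank counts and noisier traffic.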