Message Passing Interface (MPI) is an effective programming technique for implementing parallel programs for distributed computation. As these applications run, a number of different types of irregularities can occur, including those that result from intrusions, user misbehavior, corrupted data, deadlocks, or failure of cluster components. In this paper, we perform a comparison of different artificial intelligence (AI) techniques that can be used to implement a lightweight monitoring and detection system for parallel applications on a cluster of Linux workstations. We study the accuracy and performance of deterministic and stochastic algorithms when we observe the flow of function library and OS system calls of parallel programs written with MPI. We demonstrate that monitoring of MPI programs can be achieved with high accuracy and in some cases with a 0% false positive rate in real time, and we show that the added computational load on each node is small. Finally, we demonstrate that simple deterministic methods perform poorly when the program flow grows in size and variety, and that more complex methods are required.
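To make the distinction between deterministic and stochastic detection concrete, the following is a minimal illustrative sketch, not the method evaluated in the paper. It assumes a call trace is available as a list of call names. The deterministic detector flags any fixed-length window of calls that was never seen in training, and the stochastic detector scores a trace with a smoothed first-order Markov chain. All function names, the window length, the smoothing constant, and the example traces are hypothetical.

```python
from collections import defaultdict
import math


def ngram_set(trace, n=3):
    """Deterministic model: the set of all length-n call windows seen in training."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def ngram_mismatch_rate(trace, known, n=3):
    """Fraction of length-n windows in `trace` that never appeared in training."""
    windows = [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]
    if not windows:
        return 0.0
    return sum(w not in known for w in windows) / len(windows)


def train_markov(trace):
    """Stochastic model: counts of transitions between consecutive calls."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(trace, trace[1:]):
        counts[a][b] += 1
    return counts


def markov_avg_nll(trace, counts, vocab_size, alpha=1.0):
    """Average negative log-probability per transition, with add-alpha smoothing."""
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        row = counts.get(a, {})
        p = (row.get(b, 0) + alpha) / (sum(row.values()) + alpha * vocab_size)
        total -= math.log(p)
    return total / len(pairs)


# Hypothetical traces: a normal MPI run and one with an unexpected system call.
normal = ["MPI_Init", "MPI_Send", "MPI_Recv", "MPI_Send", "MPI_Recv", "MPI_Finalize"]
suspect = ["MPI_Init", "MPI_Send", "execve", "MPI_Recv", "MPI_Finalize"]

known = ngram_set(normal)
model = train_markov(normal)
vocab_size = len(set(normal) | set(suspect))

window_score = ngram_mismatch_rate(suspect, known)
normal_nll = markov_avg_nll(normal, model, vocab_size)
suspect_nll = markov_avg_nll(suspect, model, vocab_size)
```

In this toy setting both detectors separate the two traces. The window-based table must store every legitimate window, however, so it grows with the size and variety of the program flow, whereas the probabilistic score degrades more gradually on unseen but plausible sequences. Any alerting threshold on either score would have to be chosen empirically.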