Data Mining Approaches to Software Fault Diagnosis

  • Authors:
  • R. P. Jagadeesh Chandra Bose;S. H. Srinivasan

  • Affiliations:
  • Satyam Computer Services Ltd;Satyam Computer Services Ltd

  • Venue:
  • RIDE '05 Proceedings of the 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Automatic identification of software faults has enormous practical significance. This requires characterizing program execution behavior and the use of appropriate data mining techniques on the chosen representation. In this paper, we use the sequence of system calls to characterize program execution. The data mining tasks addressed are learning to map system call streams to fault labels and automatic identification of fault causes. Spectrum kernels and SVM are used for the former while latent semantic analysis is used for the latter. The techniques are demonstrated for the intrusion dataset containing system call traces. The results show that kernel techniques are as accurate as the best available results but are faster by orders of magnitude. We also show that latent semantic indexing is capable of revealing fault-specific features.