Mining console logs for large-scale system problem detection

  • Authors: Wei Xu; Ling Huang; Armando Fox; David Patterson; Michael Jordan

  • Affiliations: UC Berkeley; Intel Research Berkeley; UC Berkeley; UC Berkeley; UC Berkeley

  • Venue: SysML'08: Proceedings of the Third Conference on Tackling Computer Systems Problems with Machine Learning Techniques
  • Year: 2008

Abstract

The console logs generated by an application contain messages that the application developers believed would be useful in debugging or monitoring the application. Despite the ubiquity and large size of these logs, they are rarely exploited in a systematic way for monitoring and debugging because they are not readily machine-parsable. In this paper, we propose a novel method for mining this rich source of information. First, we combine log parsing and text mining with source code analysis to extract structure from the console logs. Second, we extract features from the structured information in order to detect anomalous patterns in the logs using Principal Component Analysis (PCA). Finally, we use a decision tree to distill the results of PCA-based anomaly detection to a format readily understandable by domain experts (e.g. system operators) who need not be familiar with the anomaly detection algorithms. As a case study, we distill over one million lines of console logs from the Hadoop file system to a simple decision tree that a domain expert can readily understand; the process requires no operator intervention and we detect a large portion of runtime anomalies that are commonly overlooked.
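
The PCA-based detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, hypothetically, that each row of a feature matrix holds per-message-type counts for one unit of work, and it scores rows by their squared prediction error (the squared distance to the top-k principal subspace). The synthetic data, the function names `spe_scores` and `make_counts`, the choice `k=1`, and the percentile threshold are all assumptions made for illustration; a statistical test on the residual could replace the percentile cutoff.

```python
import numpy as np


def spe_scores(X, k):
    """Squared prediction error of each row of X against the top-k principal subspace."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                      # shape: (n_features, k)
    residual = Xc - Xc @ P @ P.T      # part of each row outside the subspace
    return (residual ** 2).sum(axis=1)


def make_counts(rng, n_normal=200):
    """Hypothetical data: normal rows are scaled copies of one count pattern."""
    base = np.array([1.0, 2.0, 2.0, 1.0])
    scale = rng.integers(1, 6, size=(n_normal, 1))
    normal = scale * base + rng.normal(0.0, 0.05, size=(n_normal, 4))
    # One row that breaks the usual ratio: the third message type never appears.
    anomaly = np.array([[3.0, 6.0, 0.0, 3.0]])
    return np.vstack([normal, anomaly])


rng = np.random.default_rng(0)
X = make_counts(rng)
scores = spe_scores(X, k=1)

# Assumed cutoff: flag rows whose score exceeds the 99th percentile.
threshold = np.percentile(scores, 99)
flagged = np.flatnonzero(scores > threshold)
print("flagged rows:", flagged)
print("highest-scoring row:", int(np.argmax(scores)))
```

In this toy setup the injected row (index 200) receives the largest residual score because it departs from the dominant correlation among the counts. Turning such flags into a decision tree, as the abstract describes, would be a separate step that this sketch does not cover.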