Mining invariants from console logs for system problem detection

Authors:
Jian-Guang Lou;Qiang Fu;Shengqi Yang;Ye Xu;Jiang Li
Affiliations:
Microsoft Research Asia, Beijing, P. R. China;Microsoft Research Asia, Beijing, P. R. China;Dept. of Computer Science, Beijing Univ. of Posts and Telecom;Dept. of Computer Science, Nanjing University, P.R. China;Microsoft Research Asia, Beijing, P. R. China
Venue:
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
Year:
2010


Abstract

Detecting execution anomalies is essential to maintaining and monitoring large-scale distributed systems. People often use the console logs produced by distributed systems for troubleshooting and problem diagnosis. However, manually inspecting console logs for anomalies is infeasible given the growing scale and complexity of distributed systems, so there is great demand for automatic anomaly detection techniques based on log analysis. In this paper, we propose an unstructured log analysis technique for anomaly detection, with a novel algorithm that automatically discovers program invariants in logs. First, a log parser converts the unstructured logs into structured logs. Next, the structured log messages are grouped into log message groups according to the relationships among log parameters. Program invariants are then automatically mined from these groups. The mined invariants reveal the inherent linear characteristics of program workflows. With these learned invariants, our technique can automatically detect anomalies in logs. Experiments on Hadoop show that the technique can effectively detect execution anomalies. Compared with the state of the art, our approach not only detects numerous real problems with high accuracy but also provides intuitive insight into the problems.
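
The pipeline described in the abstract (parse, group by parameter, mine linear invariants, check new logs) can be illustrated with a toy sketch. This is not the authors' algorithm. It assumes a hypothetical log format in which the first number in each line is an identifier shared by related messages, and it replaces the paper's mining step with a brute-force search for pairwise invariants of the form a * count(x) == b * count(y) with small integer coefficients. All function names and sample log lines are invented for illustration.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations, product
from math import gcd


def parse(line):
    """Split a raw line into a message signature and its numeric parameters."""
    params = re.findall(r"\d+", line)
    signature = re.sub(r"\d+", "*", line)
    return signature, params


def group_counts(lines):
    """Group messages by their first parameter (assumed to be a shared ID)
    and count how often each signature occurs in each group."""
    groups = defaultdict(Counter)
    for line in lines:
        signature, params = parse(line)
        if params:
            groups[params[0]][signature] += 1
    return groups


def mine_invariants(groups, max_coeff=2):
    """Brute-force search (an assumption, not the paper's method) for
    relations a * count(x) == b * count(y) that hold in every group."""
    signatures = sorted({s for counts in groups.values() for s in counts})
    invariants = []
    for x, y in combinations(signatures, 2):
        for a, b in product(range(1, max_coeff + 1), repeat=2):
            if gcd(a, b) != 1:
                continue  # skip multiples of an already tested relation
            if all(a * c[x] == b * c[y] for c in groups.values()):
                invariants.append((a, x, b, y))
    return invariants


def violations(counts, invariants):
    """Return the invariants that a single group's counts break."""
    return [inv for inv in invariants
            if inv[0] * counts[inv[1]] != inv[2] * counts[inv[3]]]


# Hypothetical training logs from normal runs.
training = [
    "open file 1", "write file 1", "close file 1",
    "open file 2", "write file 2", "write file 2", "close file 2",
]
invariants = mine_invariants(group_counts(training))

# Hypothetical new logs: file 3 is opened but never closed.
new_groups = group_counts(["open file 3", "write file 3"])
for group_id, counts in new_groups.items():
    for a, x, b, y in violations(counts, invariants):
        print(f"group {group_id}: expected {a} * count({x!r}) == {b} * count({y!r})")
```

In this toy data the only relation that survives mining is that every open is matched by a close, so a group with an open and no close is flagged. The violated relation itself names the two message types involved, which is the kind of intuitive insight the abstract refers to.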