Pinpoint: Problem Determination in Large, Dynamic Internet Services
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Splitting-merging model of Chinese word tokenization and segmentation
Natural Language Engineering
Automated System Monitoring and Notification With Swatch
LISA '93 Proceedings of the 7th USENIX conference on System administration
Towards informatic analysis of syslogs
CLUSTER '04 Proceedings of the 2004 IEEE International Conference on Cluster Computing
Learning case-based knowledge for disambiguating Chinese word segmentation: a preliminary study
SIGHAN '02 Proceedings of the first SIGHAN workshop on Chinese language processing - Volume 18
Path-based faliure and evolution management
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
What Supercomputers Say: A Study of Five System Logs
DSN '07 Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
Refactoring support for the C++ development tooling
Companion to the 22nd ACM SIGPLAN conference on Object-oriented programming systems and applications companion
From dirt to shovels: fully automatic tool generation from ad hoc data
Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Parallelism by design: data analysis with sawzall
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Clustering event logs using iterative partitioning
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Detecting large-scale system problems by mining console logs
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
Online System Problem Detection by Mining Patterns of Console Logs
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
SherLog: error diagnosis by connecting clues from run-time logs
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
X-trace: a pervasive network tracing framework
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Leveraging existing instrumentation to automatically infer invariant-constrained models
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
Inferring networked system models from behavior traces
Proceedings of the 2012 ACM conference on CoNEXT student workshop
Unifying FSM-inference algorithms through declarative specification
Proceedings of the 2013 International Conference on Software Engineering
An integrated framework for optimizing automatic monitoring systems in large IT infrastructures
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Hi-index | 0.00 |
We describe our early experience in applying our console log mining techniques [19, 20] to logs from production Google systems with thousands of nodes. This data set is five orders of magnitude in size and contains almost 20 times as many messages types as the Hadoop data set we used in [19]. It also has many properties that are unique to large scale production deployments (e.g., the system stays on for several months and multiple versions of the software can run concurrently). Our early experience shows that our techniques, including source code based log parsing, state and sequence based feature creation and problem detection, work well on this production data set. We also discuss our experience in using our log parser to assist the log sanitization.