System failures in industry are expensive, and increasingly stringent performance and reliability requirements for enterprise systems have made detecting and diagnosing failures both crucial and challenging. Log files generated at runtime are considered to contain representations of failure symptoms, and have thus become one of the most important sources for system monitoring and failure diagnosis. A number of studies suggest that data mining and machine learning can help in dealing with the vast amount of log data produced by a complex enterprise system. Log data abstraction techniques have been proposed, but they have not been well studied for failure detection and problem determination. In this research, we investigate the effects of using an unsupervised log data abstraction method to aid the supervised learning processes of problem determination. Additionally, we compare the efficiency of associative classification methods for failure diagnosis against Bayesian learning and C4.5, both of which have proved effective in document classification and failure diagnosis. Our experimental results show that two associative classification methods outperform Naive Bayes and C4.5 when applied to non-abstracted logs, and that unsupervised log abstraction significantly improves the performance of log-based problem determination in terms of precision, F-measure, and efficiency.
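To make the idea of log abstraction concrete, the following is a minimal sketch of one common approach: masking variable-looking tokens so that raw log lines collapse into a smaller set of event templates. It is an illustration only and not the abstraction method evaluated in this work; the masking rules, the function names abstract_line and abstract_log, and the sample log lines are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical masking rules (an assumption for illustration; the abstract
# does not specify how the studied abstraction method works).
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),   # IPv4-like addresses
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),           # hexadecimal literals
    (re.compile(r"\b\d+\b"), "<NUM>"),                      # remaining integers
]


def abstract_line(line: str) -> str:
    """Replace variable-looking tokens with placeholders to form an event template."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return line


def abstract_log(lines):
    """Count how many raw lines map to each event template."""
    return Counter(abstract_line(raw.strip()) for raw in lines)


# Made-up sample lines, not taken from any real system log.
sample = [
    "Connection from 10.0.0.5 failed after 3 retries",
    "Connection from 10.0.0.17 failed after 12 retries",
    "Disk error at block 0x1f3a",
]
templates = abstract_log(sample)
for template, count in templates.items():
    print(count, template)
```

In a setup like this, the template counts (rather than the raw lines) could serve as features for a supervised classifier such as Naive Bayes or a decision tree, which is one plausible way abstraction can reduce the feature space before problem determination.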