Mining logs files for data-driven system management

  • Authors:
  • Wei Peng; Tao Li; Sheng Ma

  • Affiliations:
  • Florida International University, Miami, FL; Florida International University, Miami, FL; IBM T.J. Watson Research Center, Hawthorne, NY

  • Venue:
  • ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
  • Year:
  • 2005

Abstract

As computing systems incorporate a growing variety of heterogeneous software and hardware components, they are becoming more complex and therefore more difficult to monitor, manage, and maintain. Traditional approaches to system management rely largely on domain experts, through a knowledge acquisition process that translates domain knowledge into operating rules and policies. This process is well known to be cumbersome, labor-intensive, and error-prone, and it struggles to keep up with rapidly changing environments. There is thus a pressing need for automatic and efficient approaches to monitoring and managing complex computing systems.

A popular approach to system management is based on analyzing system log files. However, several aspects of log files have received little emphasis in existing methods from the data mining and machine learning communities: the varied formats and relatively short text messages of log files, and the temporal characteristics of the data, pose new challenges. In this paper, we describe our research efforts on mining system log files for automatic management. In particular, we apply text mining techniques to categorize messages in log files into common situations, improve categorization accuracy by considering the temporal characteristics of log messages, and use visualization tools to evaluate and validate interesting temporal patterns for system management.
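To make the idea of categorizing short log messages and then using temporal context more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a multinomial naive Bayes text classifier as a stand-in for the text mining step, and it assumes that "temporal characteristics" can be approximated by conditioning the class prior on the category of the preceding message. The class name `NaiveBayesLogCategorizer`, the situation labels (`start`, `connect`, `stop`), and the sample messages are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(message):
    # Log messages are short, so a plain lowercase whitespace split is used here.
    return message.lower().split()


class NaiveBayesLogCategorizer:
    """Hypothetical sketch: naive Bayes with a previous-label (temporal) prior."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing constant
        self.word_counts = defaultdict(Counter)   # label -> token counts
        self.label_counts = Counter()             # label -> number of messages
        self.transitions = defaultdict(Counter)   # previous label -> next label counts
        self.vocab = set()

    def fit(self, messages, labels):
        # Messages are assumed to be given in timestamp order.
        prev = None
        for msg, label in zip(messages, labels):
            tokens = tokenize(msg)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
            if prev is not None:
                self.transitions[prev][label] += 1
            prev = label
        return self

    def _log_likelihood(self, tokens, label):
        counts = self.word_counts[label]
        total = sum(counts.values()) + self.alpha * len(self.vocab)
        return sum(math.log((counts[t] + self.alpha) / total) for t in tokens)

    def _log_prior(self, label, prev_label):
        n_labels = len(self.label_counts)
        if prev_label is None:
            # No temporal context: fall back to smoothed label frequencies.
            row, row_total = self.label_counts, sum(self.label_counts.values())
        else:
            # Temporal context: smoothed transition counts from the previous label.
            row = self.transitions[prev_label]
            row_total = sum(row.values())
        return math.log((row[label] + self.alpha) / (row_total + self.alpha * n_labels))

    def predict(self, message, prev_label=None):
        tokens = tokenize(message)
        return max(
            self.label_counts,
            key=lambda lab: self._log_prior(lab, prev_label)
            + self._log_likelihood(tokens, lab),
        )


# Hypothetical training data, listed in time order.
train_messages = [
    "service httpd starting",
    "service httpd started",
    "connection to database established",
    "connection to mail server established",
    "service httpd stopping",
    "service httpd stopped",
]
train_labels = ["start", "start", "connect", "connect", "stop", "stop"]

clf = NaiveBayesLogCategorizer().fit(train_messages, train_labels)

print(clf.predict("service sshd started"))
print(clf.predict("httpd event", prev_label="stop"))
```

In this sketch, a message such as "httpd event" is ambiguous on its text alone, so the category of the preceding message tips the decision. This is one simple way to read the claim that temporal information can improve categorization accuracy; a sequence model would be a natural alternative under the same assumption.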