LogSig: generating system events from raw textual logs

  • Authors:
  • Liang Tang;Tao Li;Chang-Shing Perng

  • Affiliations:
  • Florida International University, Miami, FL, USA;Florida International University, Miami, FL, USA;IBM T.J. Watson Research Center, Hawthorne, NY, USA

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Modern computing systems generate large amounts of log data. System administrators or domain experts utilize the log data to understand and optimize system behaviors. Most system logs are raw textual and unstructured. One main fundamental challenge in automated log analysis is the generation of system events from raw textual logs. Log messages are relatively short text messages but may have a large vocabulary, which often result in poor performance when applying traditional text clustering techniques to the log data. Other related methods have various limitations and only work well for some particular system logs. In this paper, we propose a message signature based algorithm logSig to generate system events from textual log messages. By searching the most representative message signatures, logSig categorizes log messages into a set of event types. logSig can handle various types of log data, and is able to incorporate human's domain knowledge to achieve a high performance. We conduct experiments on five real system log data. Experiments show that logSig outperforms other alternative algorithms in terms of the overall performance.