A clustering model based on matrix approximation with applications to cluster system log files

  • Authors:
  • Tao Li; Wei Peng

  • Affiliations:
  • School of Computer Science, Florida International University, Miami, FL; School of Computer Science, Florida International University, Miami, FL

  • Venue:
  • ECML'05: Proceedings of the 16th European Conference on Machine Learning
  • Year:
  • 2005

Abstract

In system management applications, automated analysis of historical data across multiple components when problems occur requires clustering log messages with disparate formats, so that the common set of semantic situations can be inferred automatically and a brief description obtained for each situation. In this paper, we propose a clustering model in which clustering is formulated as a matrix approximation problem: the objective is to minimize the approximation error between the original data matrix and the matrix reconstructed from the cluster structures. The model explicitly characterizes both data and feature memberships and thus enables a description of each cluster. We present a two-side spectral relaxation optimization procedure for the clustering model. We also establish connections between our clustering model and existing approaches. Experimental results show the effectiveness of the proposed approach.
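
As a rough illustration of the matrix-approximation objective described in the abstract, the following sketch measures how well a given pair of data and feature clusterings reconstructs a data matrix. It is not the paper's algorithm. The abstract does not give the exact factorization, so the form W ≈ A X Bᵀ, the hard (one-hot) membership matrices A and B, the block-mean choice of X, and the name `reconstruction_error` are all assumptions made here for illustration. The spectral relaxation procedure is not shown.

```python
import numpy as np

def reconstruction_error(W, row_labels, col_labels, k_rows, k_cols):
    """Squared Frobenius error between W and its cluster-based reconstruction.

    Assumed form (not taken from the paper): W ~ A @ X @ B.T, where A and B
    are one-hot data and feature membership matrices and X holds block means.
    """
    W = np.asarray(W, dtype=float)
    A = np.eye(k_rows)[np.asarray(row_labels)]   # n x k_rows data memberships
    B = np.eye(k_cols)[np.asarray(col_labels)]   # m x k_cols feature memberships

    # Sum of entries and number of entries in each (row cluster, column cluster) block.
    sums = A.T @ W @ B
    sizes = np.outer(A.sum(axis=0), B.sum(axis=0))

    # Block means; blocks belonging to an empty cluster are left at zero.
    X = np.divide(sums, sizes, out=np.zeros_like(sums), where=sizes > 0)

    W_hat = A @ X @ B.T                          # reconstructed matrix
    return float(np.sum((W - W_hat) ** 2)), W_hat

# Toy message-by-term matrix (hypothetical data): rows are log messages,
# columns are terms.
W = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])

good_err, _ = reconstruction_error(W, [0, 0, 1], [0, 0, 1], 2, 2)
bad_err, _ = reconstruction_error(W, [0, 1, 1], [0, 0, 1], 2, 2)
print(good_err, bad_err)
```

In this toy case the clustering that matches the block structure reconstructs W exactly, while the mismatched one leaves a positive error. A lower error indicates cluster structures that better explain the data. Under the same assumptions, the columns of B assigned to a cluster suggest which terms could serve as its description.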