Towards an optimized model of incident ticket correlation

  • Authors:
  • Patricia Marcu;Genady Grabarnik;Laura Luan;Daniela Rosu;Larisa Shwartz;Chris Ward

  • Affiliations:
  • Leibniz Supercomputing Center, Munich Network Management Team, Garching, Germany;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY

  • Venue:
  • IM'09 Proceedings of the 11th IFIP/IEEE international conference on Symposium on Integrated Network Management
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Abstract

In recent years, IT Service Management (ITSM) has become one of the most researched areas of IT. Incident Management and Problem Management are two of the Service Operation processes in the IT Infrastructure Library (ITIL). These two processes aim to recognize, log, isolate, and correct errors that occur in the environment and disrupt the delivery of services. Incident Management and Problem Management form the basis of the tooling provided by an Incident Ticket System (ITS). In an ITS, seemingly unrelated tickets created by end users and monitoring systems can coexist and have the same root cause. The connection between a failed resource and the malfunctioning services is not made automatically but is often established manually through human intervention. This need for human involvement reduces productivity. Introducing automation would increase productivity and therefore reduce the cost of incident resolution. In this paper, we propose a model to correlate incident tickets based on three criteria. First, we employ a category-based correlation that relies on matching service identifiers with associated resource identifiers, using similarity rules. Second, we correlate the configuration items that are critical to the failed service with the earlier identified resource tickets in order to optimize the topological comparison. Finally, we augment scheduled resource data collection with constraint adaptive probing to minimize the correlation interval for temporally correlated tickets. We present experimental data in support of our proposed correlation model.
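
To make the three criteria concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the ticket schema, the similarity rules, or the topology data, so everything here is an assumption for illustration: the `Ticket` fields, the token-overlap rule in `similar`, the `CRITICAL_CIS` mapping, and the fixed 15-minute `window`. The adaptive probing step that shortens the correlation interval is not modeled; a fixed time window stands in for it.

```python
import re
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: str
    identifier: str   # service or resource identifier (hypothetical format)
    opened_at: float  # seconds since an arbitrary epoch


# Hypothetical topology: service identifier -> configuration items critical to it.
CRITICAL_CIS = {
    "mail-service": {"mail-smtp-01", "san-vol-12"},
}


def tokens(identifier: str) -> set:
    """Split an identifier into lowercase tokens, ignoring purely numeric parts."""
    parts = re.split(r"[-_.]", identifier.lower())
    return {p for p in parts if p and not p.isdigit()}


def similar(service_id: str, resource_id: str) -> bool:
    """Assumed similarity rule: the identifiers share at least one token."""
    return bool(tokens(service_id) & tokens(resource_id))


def correlate(service_ticket, resource_tickets, window=900.0):
    """Return IDs of resource tickets correlated with a service ticket.

    A resource ticket is kept if it was opened within `window` seconds of the
    service ticket (temporal criterion) and matches either by identifier
    similarity (category criterion) or by being a critical configuration item
    of the service (topological criterion).
    """
    critical = CRITICAL_CIS.get(service_ticket.identifier, set())
    matches = []
    for r in resource_tickets:
        in_window = abs(r.opened_at - service_ticket.opened_at) <= window
        by_category = similar(service_ticket.identifier, r.identifier)
        by_topology = r.identifier in critical
        if in_window and (by_category or by_topology):
            matches.append(r.ticket_id)
    return matches


service = Ticket("S1", "mail-service", 1000.0)
resources = [
    Ticket("R1", "mail-smtp-01", 1300.0),     # similar identifier, in window
    Ticket("R2", "print-spooler-07", 1100.0), # unrelated resource
    Ticket("R3", "mail-db-01", 5000.0),       # similar, but outside the window
    Ticket("R4", "san-vol-12", 900.0),        # critical CI only, in window
]
print(correlate(service, resources))  # ['R1', 'R4']
```

In this sketch the temporal check acts as a filter applied before the other two criteria are combined, so shortening `window` directly reduces the number of candidate pairs; how the paper actually combines or weights the criteria is not stated in the abstract.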