Efficient recovery of missing events

  • Authors:
  • Jianmin Wang; Shaoxu Song; Xiaochen Zhu; Xuemin Lin

  • Affiliations:
  • Key Laboratory for Information System Security, MOE, TNList, School of Software, Tsinghua University, Beijing, China; Key Laboratory for Information System Security, MOE, TNList, School of Software, Tsinghua University, Beijing, China; Key Laboratory for Information System Security, MOE, TNList, School of Software, Tsinghua University, Beijing, China; University of New South Wales, Sydney, Australia

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2013

Abstract

Missing events often occur in event data, which record the execution logs of business processes, owing to various data entry and transmission issues caused by humans or systems. Unless these missing events are recovered, applications built upon event data, such as provenance analysis or complex event processing, are not reliable. Following the minimum change discipline in improving data quality, it is also rational to find a recovery that minimally differs from the original data. Existing recovery approaches are inefficient because they enumerate and search over all possible sequences of events. In this paper, we study efficient techniques for recovering missing events. We prove that the recovery problem is NP-hard. Nevertheless, we are able to concisely represent the space of event sequences in a branching framework. Advanced indexing and pruning techniques are developed to further improve recovery efficiency. These techniques also make it possible to find top-k recoveries. The experimental results demonstrate that our minimum recovery approach achieves high accuracy and significantly outperforms the state-of-the-art technique, with up to five orders of magnitude improvement in time performance.
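
To make the notions of minimum change and top-k recovery concrete, the following is a minimal sketch of the naive enumeration baseline that the abstract describes as inefficient. It is not the paper's branching, indexing, or pruning technique. It assumes, purely for illustration, that the process specification is available as an explicit list of valid complete traces; the names `is_subsequence`, `top_k_recoveries`, and `valid_traces`, as well as the toy events, are hypothetical.

```python
from typing import List, Sequence, Tuple


def is_subsequence(observed: Sequence[str], full: Sequence[str]) -> bool:
    """Return True if `observed` can be obtained from `full` by deleting events."""
    remaining = iter(full)
    # `event in remaining` advances the iterator, so order is respected.
    return all(event in remaining for event in observed)


def top_k_recoveries(
    observed: Sequence[str],
    valid_traces: List[Sequence[str]],
    k: int = 1,
) -> List[Tuple[int, Tuple[str, ...]]]:
    """Naively enumerate every valid trace (assumed model) and rank candidates.

    A candidate recovery is a valid trace that contains the observed events
    in order. Its cost is the number of events that must be inserted, so the
    cheapest candidates are the minimum-change recoveries.
    """
    candidates = [
        (len(trace) - len(observed), tuple(trace))
        for trace in valid_traces
        if is_subsequence(observed, trace)
    ]
    candidates.sort()  # fewest inserted events first, ties broken by trace order
    return candidates[:k]


# Hypothetical toy specification: three valid complete executions.
valid_traces = [
    ("A", "B", "C", "D"),
    ("A", "C", "B", "D"),
    ("A", "B", "E", "C", "D"),
]

# The logged trace is missing at least one event.
observed = ("A", "C", "D")
result = top_k_recoveries(observed, valid_traces, k=2)
print(result)
```

Because this sketch scans every valid trace, its cost grows with the number of possible event sequences, which is the inefficiency the abstract attributes to existing approaches.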