Probabilistic inference of object identifications for event stream analytics

  • Authors:
  • Di Wang;Elke Rundensteiner;Richard Ellison, III;Han Wang

  • Affiliations:
  • Worcester Polytechnic Institute;Worcester Polytechnic Institute;University of Massachusetts Medical School;Worcester Polytechnic Institute

  • Venue:
  • Proceedings of the 16th International Conference on Extending Database Technology
  • Year:
  • 2013

Abstract

Recent years have witnessed the emergence of real-time object monitoring applications driven by the explosion of small, inexpensive sensors. In many real-world applications, not all sensed events carry the identification of the object whose action they report; we call these "non-ID-ed" events. Reasons range from heterogeneous sensing devices to people choosing to conceal their identities. Such non-ID-ed events prevent us from performing object-based analytics, such as tracking, alerting, and pattern matching. We propose a probabilistic inference framework, called FISS, to tackle this problem by inferring the missing object identification associated with an event. Specifically, as a foundation we design a time-varying graphical model to capture correspondences between sensed events and objects. Upon this formal model, we elaborate how to adapt the Forward-backward (FB) inference algorithm to continuously infer probabilistic identifications for non-ID-ed events. However, we demonstrate that FB is neither scalable nor efficient over event streams. To overcome this deficiency, we propose a suite of strategies for optimizing its performance, including a selective smoothing technique that significantly reduces the number of random variables that need to be smoothed, and a finish-flag mechanism that enables early termination of backward computations. Our experimental results, using large-volume streams from a real-world healthcare application, demonstrate the accuracy, efficiency, and scalability of FISS. In particular, FISS achieves on average 15x higher throughput than our basic FB inference.
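For readers unfamiliar with the Forward-backward algorithm mentioned in the abstract, the following is a minimal sketch of the textbook version on a small discrete hidden Markov model. It is not the FISS implementation and does not include selective smoothing or the finish-flag mechanism. The function name `forward_backward` and the parameters `prior`, `transition`, `emission`, and `observations` are hypothetical names chosen for illustration; one could read each hidden state as a candidate object and each observation as a sensed event, but that mapping is an assumption, not the paper's model.

```python
def forward_backward(prior, transition, emission, observations):
    """Textbook forward-backward smoothing for a discrete HMM (illustrative only).

    prior[i]          -- probability that the first hidden state is i
    transition[i][j]  -- probability of moving from state i to state j
    emission[j][o]    -- probability of observing symbol o in state j
    observations      -- list of observed symbol indices
    Returns one smoothed distribution over states per time step.
    Assumes every observation has nonzero probability under the model.
    """
    n = len(prior)

    # Forward pass: alphas[t][j] is P(state_t = j | observations[0..t]).
    alphas = []
    prev = prior
    for t, obs in enumerate(observations):
        if t == 0:
            predicted = prior
        else:
            predicted = [sum(prev[i] * transition[i][j] for i in range(n))
                         for j in range(n)]
        alpha = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
        alphas.append(alpha)
        prev = alpha

    # Backward pass: beta[i] is proportional to
    # P(observations[t+1..] | state_t = i); it starts at all ones.
    beta = [1.0] * n
    smoothed = [None] * len(observations)
    for t in range(len(observations) - 1, -1, -1):
        post = [alphas[t][i] * beta[i] for i in range(n)]
        total = sum(post)
        smoothed[t] = [p / total for p in post]

        # Fold observation t into beta so it is ready for step t - 1.
        obs = observations[t]
        beta = [sum(transition[i][j] * emission[j][obs] * beta[j]
                    for j in range(n))
                for i in range(n)]
        scale = sum(beta)
        beta = [b / scale for b in beta]  # rescale to avoid underflow

    return smoothed
```

Note that the backward pass revisits every earlier time step whenever it is run. On an unbounded stream this cost grows with the stream length, which is consistent with the abstract's remark that plain FB is neither scalable nor efficient over event streams.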