Exploiting multi-level parallelism for low-latency activity recognition in streaming video

  • Authors and affiliations:
  • Ming-yu Chen (Carnegie Mellon University, Pittsburgh, PA, USA)
  • Lily Mummert (Intel Labs Pittsburgh, Pittsburgh, PA, USA)
  • Padmanabhan Pillai (Intel Labs Pittsburgh, Pittsburgh, PA, USA)
  • Alexander Hauptmann (Carnegie Mellon University, Pittsburgh, PA, USA)
  • Rahul Sukthankar (Intel Labs Pittsburgh, Pittsburgh, PA, USA)

  • Venue:
  • MMSys '10 Proceedings of the first annual ACM SIGMM conference on Multimedia systems
  • Year:
  • 2010

Abstract

Video understanding is a computationally challenging task that is critical not only for traditionally throughput-oriented applications such as search, but also for latency-sensitive interactive applications such as surveillance, gaming, videoconferencing, and vision-based user interfaces. Enabling these types of video processing applications will require not only new algorithms and techniques, but also new runtime systems that optimize latency as well as throughput. In this paper, we present a runtime system called Sprout that achieves low latency by exploiting the parallelism inherent in video understanding applications. We demonstrate the utility of our system on an activity recognition application that employs a robust new descriptor called MoSIFT, which explicitly augments appearance features with motion information. MoSIFT outperforms previous recognition techniques, but like other state-of-the-art techniques, it is computationally expensive: a sequential implementation runs 100 times slower than real time. We describe the implementation of the activity recognition application on Sprout, and show that it can accurately recognize activities at full frame rate (25 fps) and with low latency on a challenging airport surveillance video corpus.
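
As a rough illustration of the gap the abstract describes, the sketch below works out what "100 times slower than real time" implies at 25 fps. It is a back-of-the-envelope calculation, not part of Sprout: the function names are hypothetical, "real time" is assumed to mean the 25 fps frame rate quoted above, and the worker count assumes ideal linear speedup with no communication or scheduling overhead.

```python
import math


def sequential_seconds_per_frame(fps: float, slowdown: float) -> float:
    """Average time a sequential run spends per frame.

    Real time allows 1/fps seconds per frame; a run that is `slowdown`
    times slower than real time spends slowdown/fps seconds per frame.
    """
    return slowdown / fps


def min_workers_for_real_time(slowdown: float) -> int:
    """Lower bound on parallel workers needed to keep up with the stream.

    Assumes ideal linear speedup (an optimistic assumption); real systems
    need more because of communication and load imbalance.
    """
    return math.ceil(slowdown)


if __name__ == "__main__":
    fps = 25.0        # frame rate stated in the abstract
    slowdown = 100.0  # sequential slowdown stated in the abstract

    print("real-time budget per frame (s):", 1.0 / fps)
    print("sequential time per frame (s):", sequential_seconds_per_frame(fps, slowdown))
    print("minimum workers (ideal speedup):", min_workers_for_real_time(slowdown))
```

Under these assumptions, keeping pace with the stream needs at least a 100-fold speedup. Sustaining the frame rate does not by itself guarantee low latency, so a runtime that targets latency also has to shorten the time each individual frame spends in processing.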