Automated sip detection in naturally-evoked video

  • Authors:
  • Rana el Kaliouby; Mina Mikhail

  • Affiliations:
  • Massachusetts Institute of Technology, Cambridge, MA, USA; American University in Cairo, Cairo, Egypt

  • Venue:
  • ICMI '08 Proceedings of the 10th international conference on Multimodal interfaces
  • Year:
  • 2008

Abstract

Quantifying consumer experiences is an emerging application area for event detection in video. This paper presents a hierarchical model for robust sip detection that combines bottom-up processing of face videos, namely real-time head action unit analysis and head gesture recognition, with top-down knowledge about sip events and task semantics. Our algorithm achieves an average accuracy of 82% in videos that feature single sips, and an average accuracy of 78% with a false positive rate of 0.3% in more challenging videos that feature multiple sips and chewing actions. We discuss how our methodology generalizes to detecting other events in similar contexts.
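
The abstract reports an average accuracy and a false positive rate but does not say how either is computed. As a minimal sketch, assuming the standard binary-classification definitions and a per-video average, the two figures could be derived as follows. The `Counts` class, the function names, and the example counts are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Counts:
    """Confusion counts for a binary sip detector on one video (hypothetical layout)."""
    tp: int  # sip present, detector fired
    fp: int  # no sip, detector fired
    tn: int  # no sip, detector silent
    fn: int  # sip present, detector silent


def accuracy(c: Counts) -> float:
    """Fraction of all decisions that were correct (assumed definition)."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("no samples")
    return (c.tp + c.tn) / total


def false_positive_rate(c: Counts) -> float:
    """Fraction of non-sip samples that were wrongly flagged as sips."""
    negatives = c.fp + c.tn
    if negatives == 0:
        raise ValueError("no negative samples")
    return c.fp / negatives


def average_accuracy(videos: Iterable[Counts]) -> float:
    """Unweighted mean of per-video accuracy (assumed averaging scheme)."""
    scores = [accuracy(v) for v in videos]
    if not scores:
        raise ValueError("no videos")
    return sum(scores) / len(scores)


# Made-up counts for two videos, for illustration only.
example_videos = [Counts(tp=8, fp=1, tn=9, fn=2), Counts(tp=5, fp=0, tn=10, fn=5)]
```

Averaging per video rather than pooling all samples gives each video equal weight regardless of its length; the paper may use a different scheme.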