CASSANDRA: audio-video sensor fusion for aggression detection

  • Authors:
  • W. Zajdel;J. D. Krijnders;T. Andringa;D. M. Gavrila

  • Affiliations:
  • Intelligent Systems Laboratory, Faculty of Science, University of Amsterdam, The Netherlands;Auditory Cognition Group, Artificial Intelligence, Rijksuniversiteit Groningen, The Netherlands;Auditory Cognition Group, Artificial Intelligence, Rijksuniversiteit Groningen, The Netherlands;Intelligent Systems Laboratory, Faculty of Science, University of Amsterdam, The Netherlands

  • Venue:
  • AVSS '07 Proceedings of the 2007 IEEE Conference on Advanced Video and Signal Based Surveillance
  • Year:
  • 2007

Abstract

This paper presents a smart surveillance system named CASSANDRA, aimed at detecting instances of aggressive human behavior in public environments. A distinguishing aspect of CASSANDRA is the exploitation of the complementary nature of audio and video sensing to disambiguate scene activity in real-life, noisy and dynamic environments. At the lower level, independent analysis of the audio and video streams yields intermediate descriptors of a scene, such as “scream”, “passing train” or “articulation energy”. At the higher level, a Dynamic Bayesian Network is used as a fusion mechanism that produces an aggregate aggression indication for the current scene. Our prototype system is validated on a set of scenarios performed by professional actors at an actual train station to ensure a realistic audio and video noise setting.
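
The two-level design described in the abstract can be illustrated with a very small sketch. The abstract does not give the network structure or any parameters, so the following is not the authors' model: it assumes a single binary hidden state (calm or aggressive), two binary cues (a "scream" audio descriptor and a hypothetical "high motion" video descriptor) treated as conditionally independent given the state, and made-up probabilities. It shows only the general idea of recursive Bayesian fusion over time, which is the simplest case of a Dynamic Bayesian Network.

```python
# Minimal sketch of recursive Bayesian fusion of one audio and one video cue.
# Hidden state index: 0 = calm, 1 = aggressive.
# ASSUMPTION: every number below is an illustrative value, not taken from the paper.

TRANSITION = [
    [0.95, 0.05],  # P(next state | currently calm)
    [0.10, 0.90],  # P(next state | currently aggressive)
]

# P(cue present | state) for each binary cue (hypothetical descriptors).
P_SCREAM = [0.05, 0.60]       # audio cue
P_HIGH_MOTION = [0.20, 0.80]  # video cue


def likelihood(p_cue, observed):
    """Per-state likelihood of a binary cue being present (True) or absent (False)."""
    return [p if observed else 1.0 - p for p in p_cue]


def filter_step(belief, scream, high_motion):
    """One predict-update step; returns the posterior [P(calm), P(aggressive)]."""
    # Predict: propagate the previous belief through the transition model.
    predicted = [
        sum(belief[i] * TRANSITION[i][j] for i in range(2)) for j in range(2)
    ]
    # Update: multiply by both cue likelihoods (conditional independence assumed).
    audio = likelihood(P_SCREAM, scream)
    video = likelihood(P_HIGH_MOTION, high_motion)
    unnormalized = [predicted[j] * audio[j] * video[j] for j in range(2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def aggression_trace(observations, prior=(0.99, 0.01)):
    """Return P(aggressive) after each (scream, high_motion) observation pair."""
    belief = list(prior)
    trace = []
    for scream, high_motion in observations:
        belief = filter_step(belief, scream, high_motion)
        trace.append(belief[1])
    return trace


if __name__ == "__main__":
    frames = [(False, False), (True, True), (True, True)]
    for t, p in enumerate(aggression_trace(frames)):
        print(f"frame {t}: P(aggressive) = {p:.3f}")
```

In this sketch a scream accompanied by high motion raises the aggression estimate more than a scream alone, which is the kind of disambiguation that combining the two modalities is meant to provide.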