This paper presents CASSANDRA, a smart surveillance system aimed at detecting instances of aggressive human behavior in public environments. A distinguishing aspect of CASSANDRA is that it exploits the complementary nature of audio and video sensing to disambiguate scene activity in real-life, noisy, and dynamic environments. At the lower level, independent analysis of the audio and video streams yields intermediate descriptors of a scene, such as “scream”, “passing train”, or “articulation energy”. At the higher level, a Dynamic Bayesian Network serves as the fusion mechanism that produces an aggregate aggression indication for the current scene. Our prototype system is validated on a set of scenarios performed by professional actors at an actual train station, which ensures a realistic audio and video noise setting.
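
To make the two-level design more concrete, the following is a minimal sketch of how low-level audio and video descriptors might be fused over time into a single aggression estimate. It is not CASSANDRA's actual network: it assumes the simplest possible Dynamic Bayesian Network (one hidden binary state, filtered with the forward algorithm), two binary cues loosely named after the descriptors in the abstract, and conditional independence of the cues given the state. Every probability, and every name such as `filter_step` and `aggression_trace`, is an illustrative assumption rather than a value or identifier from the paper.

```python
# Hypothetical two-state forward filter; all numbers below are made-up
# assumptions for illustration and do not come from the CASSANDRA paper.
STATES = ("calm", "aggressive")

# P(state_t | state_{t-1}): outer key = previous state, inner key = next state.
TRANSITION = {
    "calm": {"calm": 0.95, "aggressive": 0.05},
    "aggressive": {"calm": 0.20, "aggressive": 0.80},
}

# P(cue present | state) for each binary low-level descriptor.
P_SCREAM = {"calm": 0.02, "aggressive": 0.60}             # audio cue
P_HIGH_ARTICULATION = {"calm": 0.10, "aggressive": 0.70}  # video cue


def likelihood(state, scream, high_articulation):
    """Joint likelihood of both cues, assumed independent given the state."""
    p_audio = P_SCREAM[state] if scream else 1.0 - P_SCREAM[state]
    p_video = (
        P_HIGH_ARTICULATION[state]
        if high_articulation
        else 1.0 - P_HIGH_ARTICULATION[state]
    )
    return p_audio * p_video


def filter_step(belief, scream, high_articulation):
    """One predict-then-update step of the forward algorithm."""
    # Predict: push the previous belief through the transition model.
    predicted = {
        s: sum(belief[prev] * TRANSITION[prev][s] for prev in STATES)
        for s in STATES
    }
    # Update: weight each state by how well it explains the observed cues.
    unnormalized = {
        s: predicted[s] * likelihood(s, scream, high_articulation)
        for s in STATES
    }
    total = sum(unnormalized.values())
    return {s: unnormalized[s] / total for s in STATES}


def aggression_trace(observations, prior_aggressive=0.05):
    """Return P(aggressive) after each (scream, high_articulation) pair."""
    belief = {"calm": 1.0 - prior_aggressive, "aggressive": prior_aggressive}
    trace = []
    for scream, high_articulation in observations:
        belief = filter_step(belief, scream, high_articulation)
        trace.append(belief["aggressive"])
    return trace


if __name__ == "__main__":
    # Quiet scene, then agitated motion, then screams, then quiet again.
    obs = [(False, False), (False, True), (True, True), (True, True), (False, False)]
    for t, p in enumerate(aggression_trace(obs)):
        print(f"t={t} P(aggressive)={p:.3f}")
```

Under these assumed parameters, a single cue raises the aggression estimate only moderately, while concurrent audio and video evidence drives it up sharply. The transition model also keeps the estimate from collapsing the moment the cues disappear. A realistic network would need further nodes for context such as “passing train”, which could explain away loud audio.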