Smart homes for the aging population have recently started attracting the attention of the research community. One problem of interest is that of monitoring the activities of daily living (ADLs) of the elderly, aiming at their protection and well-being. In this work, we present our initial efforts to automatically recognize ADLs using multimodal input from audio-visual sensors. For this purpose, and as part of the Integrated Project Netcarity, far-field microphones and cameras have been installed inside an apartment and used to collect a corpus of ADLs enacted by multiple subjects. The resulting data streams are processed to generate perception-based acoustic features, as well as human location coordinates that serve as visual features. The extracted features are then presented to Gaussian mixture models for classification into a set of predefined ADLs. Our experimental results show that both acoustic and visual features are useful for ADL classification, but the performance of the latter deteriorates when subject tracking becomes inaccurate. Furthermore, joint audio-visual classification by simple concatenative feature fusion significantly outperforms both unimodal classifiers.
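The classification scheme described in the abstract, one Gaussian mixture model per ADL class trained on feature vectors that concatenate acoustic features with subject-location coordinates, can be sketched as follows. This is a minimal illustration assuming scikit-learn's GaussianMixture; the feature dimensions, class labels, mixture order, and randomly generated data are hypothetical stand-ins, not the paper's actual features, models, or corpus.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_adl_models(train_data, n_components=8, seed=0):
    """Fit one GMM per ADL class.

    train_data: dict mapping an ADL label to an (n_frames, n_dims)
    array of fused feature vectors, i.e. acoustic features
    concatenated with x/y subject-location coordinates.
    """
    models = {}
    for label, frames in train_data.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag",
                              random_state=seed)
        gmm.fit(frames)
        models[label] = gmm
    return models

def classify_segment(models, frames):
    """Assign a segment to the ADL whose GMM yields the highest
    total log-likelihood summed over the segment's frames."""
    scores = {label: gmm.score_samples(frames).sum()
              for label, gmm in models.items()}
    return max(scores, key=scores.get)

# Toy usage with random stand-in features: 20 acoustic dimensions
# plus 2 location coordinates per frame (illustrative numbers only).
rng = np.random.default_rng(0)
train = {adl: rng.normal(size=(500, 22))
         for adl in ("cooking", "eating", "cleaning")}
models = train_adl_models(train)
print(classify_segment(models, rng.normal(size=(100, 22))))
```

Concatenative fusion here amounts to nothing more than stacking the two feature streams into one vector before training, which is why the abstract calls it "simple"; the unimodal baselines would train the same per-class GMMs on the acoustic or location features alone.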