Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia

Authors:
Svebor Karaman;Jenny Benois-Pineau;Vladislavs Dovgalecs;Rémi Mégret;Julien Pinquier;Régine André-Obrecht;Yann Gaëstel;Jean-François Dartigues
Affiliations:
LaBRI--University of Bordeaux, Talence Cedex, France 33405;LaBRI--University of Bordeaux, Talence Cedex, France 33405;IMS--University of Bordeaux, Talence, France;IMS--University of Bordeaux, Talence, France;IRIT--University of Toulouse, Toulouse Cedex 9, France 31062;IRIT--University of Toulouse, Toulouse Cedex 9, France 31062;INSERM U.897--University Victor Ségalen Bordeaux 2, Bordeaux, France;INSERM U.897--University Victor Ségalen Bordeaux 2, Bordeaux, France
Venue:
Multimedia Tools and Applications
Year:
2014

Abstract

This paper presents a method for indexing activities of daily living in videos acquired from wearable cameras. It addresses the problem of analyzing the complex multimedia data acquired from wearable devices, which has become a growing concern as the amount of such data increases. In the context of dementia diagnosis, patients' activities are recorded at home using a lightweight wearable device, to be viewed later by medical practitioners. The recording mode poses great challenges, since the video data consists of a single sequence shot in which strong motion and sharp lighting changes often appear. Because the recordings are long, tools for efficient navigation in terms of activities of interest are crucial. Our work introduces a video structuring approach that combines automatic motion-based segmentation of the video with activity recognition by a hierarchical two-level Hidden Markov Model. We define a multi-modal description space over visual and audio features, including mid-level features such as motion, location, speech and noise detections. We show that these features are complementary, both globally and for specific activities. Experiments on real data recorded from several patients at home show the difficulty of the task and the promising results of the proposed approach.
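
To illustrate the idea of a two-level Hidden Markov Model, the sketch below flattens a toy hierarchy (activities at the top level, a few sub-states inside each activity) into an ordinary HMM and decodes one activity label per video segment with the Viterbi algorithm. It is not the authors' implementation. The function names, the discrete observation symbols, the switching probability `p_switch`, and all numeric values are assumptions chosen for illustration; the abstract does not specify the model's topology, parameters, or observation model.

```python
import math

def _log(p):
    # Log that tolerates zero probabilities.
    return math.log(p) if p > 0 else float("-inf")

def flatten(n_act, n_sub, p_switch, sub_trans, entry):
    """Flatten a two-level HMM into flat states (activity, sub_state).

    Assumed toy topology: at each step the chain leaves its activity with
    probability p_switch (choosing another activity uniformly and a
    sub-state from `entry`); otherwise it moves between sub-states of the
    same activity according to `sub_trans`. Requires n_act >= 2.
    """
    states = [(a, k) for a in range(n_act) for k in range(n_sub)]
    log_trans = []
    for (a, k) in states:
        row = []
        for (b, j) in states:
            if a == b:
                p = (1.0 - p_switch) * sub_trans[k][j]
            else:
                p = p_switch / (n_act - 1) * entry[j]
            row.append(_log(p))
        log_trans.append(row)
    return states, log_trans

def viterbi_activities(obs, states, log_trans, emit):
    """Return the most likely top-level activity for each observation."""
    n = len(states)
    # Uniform prior over the flat states (an assumption).
    score = [_log(1.0 / n) + _log(emit[a][k][obs[0]]) for (a, k) in states]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for j, (b, m) in enumerate(states):
            best = max(range(n), key=lambda i: prev[i] + log_trans[i][j])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][j] + _log(emit[b][m][o]))
        back.append(ptr)
    # Backtrack from the best final state.
    path = [max(range(n), key=lambda i: score[i])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return [states[i][0] for i in path]

# Hypothetical toy data: 2 activities, 2 sub-states each, 4 observation symbols.
sub_trans = [[0.5, 0.5], [0.5, 0.5]]
entry = [0.5, 0.5]
emit = [
    [[0.70, 0.20, 0.05, 0.05], [0.20, 0.70, 0.05, 0.05]],  # activity 0
    [[0.05, 0.05, 0.70, 0.20], [0.05, 0.05, 0.20, 0.70]],  # activity 1
]
states, log_trans = flatten(2, 2, 0.1, sub_trans, entry)
labels = viterbi_activities([0, 1, 0, 2, 3, 2], states, log_trans, emit)
print(labels)
```

In this sketch each observation stands for one motion-based segment, quantized to a single symbol. A real system working with the multi-modal description space would more plausibly use continuous feature vectors with a density model per sub-state, and would learn the transition and emission parameters from annotated recordings rather than fixing them by hand.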