We present an application for creating binary Visual Trigger Templates (VTTs) for automatic video indexing. Our approach is based on the observation that videos captured with fixed cameras have specific structures that depend on world constraints. Our system allows a user to represent such constraints graphically so that simple actions or events can be recognized automatically. VTTs are constructed by manually drawing rectangles that define trigger spaces: when elements (e.g., a hand, a face) move inside the trigger spaces defined by the user, actions are recognized. For example, a user can define a raise-hand action by drawing two rectangles: one for the face and one for the hand. Our approach uses motion, skin, and face detection algorithms. We present experiments on the PETS-ICVS dataset and on our own dataset to demonstrate that our system constitutes a simple but powerful mechanism for meeting video indexing.
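The trigger mechanism described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the class names (`Rect`, `VisualTriggerTemplate`), the element labels, and the rectangle coordinates are all hypothetical, and the motion/skin/face detectors that would supply the element centroids are assumed to exist elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """An axis-aligned trigger space drawn by the user."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


class VisualTriggerTemplate:
    """A binary trigger that fires only when every required element
    (e.g. a face, a hand) lies inside its assigned rectangle."""

    def __init__(self, name: str, zones: dict[str, Rect]):
        self.name = name
        self.zones = zones  # element label -> trigger space

    def fires(self, detections: dict[str, tuple[int, int]]) -> bool:
        # detections maps element labels to (x, y) centroids, as would be
        # produced by motion, skin, and face detectors (assumed, not shown).
        return all(
            label in detections and rect.contains(*detections[label])
            for label, rect in self.zones.items()
        )


# Hypothetical raise-hand template: a face zone with a hand zone above it.
raise_hand = VisualTriggerTemplate(
    "raise_hand",
    {"face": Rect(100, 200, 80, 80), "hand": Rect(100, 80, 80, 80)},
)

print(raise_hand.fires({"face": (130, 240), "hand": (140, 100)}))  # True
print(raise_hand.fires({"face": (130, 240)}))  # no hand detected -> False
```

In a real indexing pipeline, each frame's detections would be fed to every template, and frames where a template fires would be tagged with that action for later retrieval.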