Instructing people for training gestural interactive systems

  • Authors:
  • Simon Fothergill (University of Cambridge, Cambridge, United Kingdom)
  • Helena Mentis (Microsoft Research, Cambridge, United Kingdom)
  • Pushmeet Kohli (Microsoft Research, Cambridge, United Kingdom)
  • Sebastian Nowozin (Microsoft Research, Cambridge, United Kingdom)

  • Venue:
  • Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
  • Year:
  • 2012

Abstract

Entertainment and gaming systems such as the Wii and Xbox Kinect have brought touchless, body-movement-based interfaces to the masses. Systems like these enable the estimation of the movements of various body parts from raw inertial motion or depth sensor data. However, the interface developer is still left with the challenging task of creating a system that recognizes these movements as embodying meaning. The machine learning approach to this problem requires the collection of datasets that contain the relevant body movements and their associated semantic labels. These datasets directly impact the accuracy and performance of the gesture recognition system and should ideally contain all natural variations of the movements associated with a gesture. This paper addresses the problem of collecting such gesture datasets. In particular, we investigate which semiotic modality of instruction is most appropriate for conveying to human subjects the movements the system developer needs them to perform. The results of our qualitative and quantitative analysis indicate that the choice of modality has a significant impact on the performance of the learnt gesture recognition system, particularly in terms of correctness and coverage.
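To make the pipeline the abstract describes concrete, the sketch below shows supervised training of a gesture recognizer from labeled movement recordings. It is a minimal illustration, not the authors' method: the paper does not prescribe a learner or feature representation, and the data shapes, the `summarize_sequence` helper, the synthetic recordings, and the random-forest classifier are all assumptions made here for illustration.

```python
# Minimal sketch: training a gesture recognizer from labeled movement data.
# All data shapes and helper names are hypothetical; the paper does not
# prescribe a specific learning pipeline or feature representation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def summarize_sequence(frames: np.ndarray) -> np.ndarray:
    """Collapse a (num_frames, num_joint_coords) movement sequence into a
    fixed-length feature vector (per-coordinate mean and std here, purely
    as a placeholder for a real temporal feature representation)."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

# Hypothetical collected dataset: each recording is a variable-length
# sequence of body-pose frames (here 60 joint coordinates per frame),
# paired with the semantic label of the gesture being performed.
rng = np.random.default_rng(0)
recordings = [rng.normal(size=(rng.integers(30, 90), 60)) for _ in range(200)]
labels = rng.integers(0, 12, size=200)  # e.g. 12 gesture classes

X = np.stack([summarize_sequence(r) for r in recordings])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

Whatever learner is substituted here, its ceiling is set by the training recordings themselves: if the instruction modality fails to elicit correct performances (correctness) or the natural range of variation (coverage), no classifier can recover what the dataset never contained, which is the dependence the paper measures.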