VAST MM: multimedia browser for presentation video
Proceedings of the 6th ACM international conference on Image and video retrieval
The growth of digitally recorded educational lectures has led to a problem of information overload. Semantic video browsers offer one solution, using content-based features to highlight points of interest. We focus on the domain of single-instructor lecture videos. We hypothesize that the arm and upper-body gestures an instructor makes carry significant pedagogic information about the content being discussed, such as its importance and difficulty. Furthermore, these gestures can be classified, automatically detected, and correlated to pedagogic significance (e.g., highlighting a subtopic that is a focal point of a lecture). This information can then serve as cues for a semantic video browser. We propose a fully automatic system that, given a lecture video as input, segments the video into gestures and identifies each gesture according to a refined taxonomy. These gestures are then correlated to a vocabulary of significance. We also plan to extract other gesture features, such as speed and size, and examine their correlation to pedagogic significance. We propose to develop body-part recognition and temporal segmentation techniques to aid natural gesture recognition. Finally, we plan to test and verify the efficacy of this hypothesis and system on a corpus of lecture videos by integrating the points of pedagogic significance indicated by the gestural information into a semantic video browser and performing user studies. The user studies will measure the accuracy of the correlation as well as the usefulness of the integrated browser.
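The proposed pipeline (temporal segmentation into gestures, classification against a taxonomy, and mapping to a significance vocabulary) can be sketched in miniature. The sketch below is purely illustrative: the motion-energy segmentation, the two-class "point"/"sweep" taxonomy, the thresholds, and the significance mapping are all assumptions for demonstration, not the system described in the abstract, which operates on detected body parts in video rather than a 1-D track.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Gesture:
    start: int          # frame index where the gesture begins
    end: int            # frame index where it ends
    label: str          # taxonomy label, e.g. "point" or "sweep" (toy taxonomy)
    significance: str   # pedagogic cue, e.g. "emphasis" (toy vocabulary)

def segment(xs: List[float], motion_thresh: float = 0.5) -> List[Tuple[int, int]]:
    """Split a 1-D wrist-position track into intervals of sustained motion.

    Stand-in for the temporal segmentation step; a real system would use
    richer body-part tracks and learned boundaries.
    """
    intervals, start = [], None
    for i in range(1, len(xs)):
        moving = abs(xs[i] - xs[i - 1]) > motion_thresh
        if moving and start is None:
            start = i - 1
        elif not moving and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(xs) - 1))
    return intervals

def classify(xs: List[float], start: int, end: int) -> str:
    """Toy taxonomy: a large net displacement is a 'sweep', else a 'point'."""
    return "sweep" if abs(xs[end] - xs[start]) > 2.0 else "point"

# Hypothetical vocabulary of significance keyed by gesture class.
SIGNIFICANCE = {"point": "emphasis", "sweep": "topic transition"}

def analyze(xs: List[float]) -> List[Gesture]:
    """Run the full toy pipeline: segment, classify, map to significance."""
    return [Gesture(s, e, lbl, SIGNIFICANCE[lbl])
            for s, e in segment(xs)
            for lbl in [classify(xs, s, e)]]

# Example track: still hand, a quick in-place flick, then a broad sweep.
track = [0.0, 0.0, 0.0, 1.0, 0.2, 0.2, 0.2, 1.2, 2.4, 3.6, 3.6]
for g in analyze(track):
    print(g.start, g.end, g.label, g.significance)
```

The resulting `Gesture` records (interval, label, significance) are exactly the kind of cues a semantic browser could render on a timeline; features such as speed and size would be additional fields computed per interval.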