VAST MM: multimedia browser for presentation video
Proceedings of the 6th ACM international conference on Image and video retrieval
The growth of digitally recorded educational lectures has led to a problem of information overload. Semantic video browsers offer one solution, using content-based features to highlight points of interest. We focus on the domain of single-instructor lecture videos. We hypothesize that the arm and upper-body gestures an instructor makes carry significant pedagogic information about the content being discussed, such as its importance and difficulty. Furthermore, these gestures can be classified, automatically detected, and correlated to pedagogic significance (e.g., highlighting a subtopic that is a focal point of a lecture). This information can then serve as cues for a semantic video browser. We propose a fully automatic system that, given a lecture video as input, segments the video into gestures and identifies each gesture according to a refined taxonomy. These gestures are then correlated to a vocabulary of significance. We also plan to extract other gesture features, such as speed and size, and examine their correlation to pedagogic significance. We propose to develop body-part recognition and temporal segmentation techniques to aid natural gesture recognition. Finally, we plan to test and verify the efficacy of this hypothesis and system on a corpus of lecture videos by integrating the points of pedagogic significance indicated by the gestural information into a semantic video browser and performing user studies. The user studies will measure the accuracy of the correlation as well as the usefulness of the integrated browser.
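The proposed pipeline (temporal segmentation into gestures, classification against a taxonomy, and mapping to a significance vocabulary) can be sketched in miniature. The sketch below is purely illustrative: the motion-energy segmentation, the two-class "point"/"sweep" taxonomy, the thresholds, and the significance mapping are all assumptions for demonstration, not the system described in the abstract, which operates on detected body parts in video rather than a 1-D track.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Gesture:
    start: int          # frame index where the gesture begins
    end: int            # frame index where it ends
    label: str          # taxonomy label, e.g. "point" or "sweep" (toy taxonomy)
    significance: str   # pedagogic cue, e.g. "emphasis" (toy vocabulary)

def segment(xs: List[float], motion_thresh: float = 0.5) -> List[Tuple[int, int]]:
    """Split a 1-D wrist-position track into intervals of sustained motion.

    Stand-in for the temporal segmentation step; a real system would use
    richer body-part tracks and learned boundaries.
    """
    intervals, start = [], None
    for i in range(1, len(xs)):
        moving = abs(xs[i] - xs[i - 1]) > motion_thresh
        if moving and start is None:
            start = i - 1
        elif not moving and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(xs) - 1))
    return intervals

def classify(xs: List[float], start: int, end: int) -> str:
    """Toy taxonomy: a large net displacement is a 'sweep', else a 'point'."""
    return "sweep" if abs(xs[end] - xs[start]) > 2.0 else "point"

# Hypothetical vocabulary of significance keyed by gesture class.
SIGNIFICANCE = {"point": "emphasis", "sweep": "topic transition"}

def analyze(xs: List[float]) -> List[Gesture]:
    """Run the full toy pipeline: segment, classify, map to significance."""
    return [Gesture(s, e, lbl, SIGNIFICANCE[lbl])
            for s, e in segment(xs)
            for lbl in [classify(xs, s, e)]]

# Example track: still hand, a quick in-place flick, then a broad sweep.
track = [0.0, 0.0, 0.0, 1.0, 0.2, 0.2, 0.2, 1.2, 2.4, 3.6, 3.6]
for g in analyze(track):
    print(g.start, g.end, g.label, g.significance)
```

The resulting `Gesture` records (interval, label, significance) are exactly the kind of cues a semantic browser could render on a timeline; features such as speed and size would be additional fields computed per interval.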