Quickly accessing the contents of a video is challenging for users, particularly for unstructured video, which contains no intentional shot boundaries, no chapters, and no apparent edited format. We approach this problem in the domain of lecture videos using machine learning and semantic display techniques. We extend an existing video browser with a display of machine-learned semantic labels, providing the user with a multi-timeline semantic view. Each timeline corresponds to one semantic label and indicates the label's probable presence or absence in the associated frames. We also provide a full Boolean algebra over these labels to accommodate more complex queries, such as 'text or code, but no instructor'. Finally, we quantify the effectiveness of our features and our browser through user studies on various tasks. We find that users follow a nearly fixed pattern of access, alternating between the use of these tags and keyframes, and also between the use of 'word bubbles' and the player. We show that the tag algebra is integral to the time-efficient use of tag timelines, saving up to 27% of the time for various retrieval tasks.
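The Boolean algebra over label timelines can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each semantic label is reduced to a per-frame boolean timeline (True where the label is judged present), and the function names (`tag_and`, `tag_or`, `tag_not`) are hypothetical.

```python
# Hypothetical sketch of a tag algebra over per-frame label timelines.
# Each label (e.g. 'text', 'code', 'instructor') is a list of booleans,
# one per frame; queries compose timelines with Boolean operators.

def tag_and(a, b):
    return [x and y for x, y in zip(a, b)]

def tag_or(a, b):
    return [x or y for x, y in zip(a, b)]

def tag_not(a):
    return [not x for x in a]

# Illustrative timelines over six frames (True = label judged present).
text       = [True,  True,  False, False, True,  True]
code       = [False, False, True,  True,  False, False]
instructor = [False, True,  True,  False, False, True]

# Query from the abstract: 'text or code, but no instructor'
result = tag_and(tag_or(text, code), tag_not(instructor))

# Frame indices satisfying the query, which a browser could highlight
# on the corresponding timeline display.
matches = [i for i, hit in enumerate(result) if hit]
print(matches)  # → [0, 3, 4]
```

In a real system the per-frame values would come from thresholding classifier probabilities, and the composed timeline would drive the highlighted regions in the multi-timeline view.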