Spatial-temporal semantic grouping of instructional video content

Authors:
Tiecheng Liu; John R. Kender
Affiliations:
Department of Computer Science, Columbia University, New York; Department of Computer Science, Columbia University, New York
Venue:
CIVR'03 Proceedings of the 2nd international conference on Image and video retrieval
Year:
2003

Abstract

This paper presents a new approach to content analysis and semantic summarization of instructional videos of blackboard presentations. We first use low-level image processing techniques to segment each frame into board content regions, regions occluded by the instructor, and irrelevant areas, and then count the chalk pixels in the content areas of each frame. Using the number of chalk pixels as a heuristic measure of video content, we derive a content figure that describes the actual, rather than the apparent, fluctuation of video content. By searching for local maxima in the content figure, and by detecting camera motion and tracking the movements of instructors, we define and retrieve key frames. Because some video content may not appear in any single key frame, owing to occlusion by instructors or to camera motion, we use an image registration method to build "board content images" that are free of occlusions and not bound by frame boundaries. Extracted key frames and board content images are combined to summarize and index the video. We further introduce the concept of a "semantic teaching unit", defined as a more natural semantic temporal-spatial unit of teaching content. We propose a model that detects semantic teaching units based on recognition of the instructor's actions and on measurement of the temporal duration and spatial location of board content. We report experiments on instructional videos recorded in non-instrumented classrooms, and show examples of the construction of board content images and the detection of semantic teaching units within them.
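
The chalk-pixel count and the search for local maxima described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no thresholds, data formats, or peak criteria, so the brightness threshold, the boolean board mask, and the function names below are all assumptions made for illustration. In particular, the sketch simply ignores occluded pixels and does not attempt the compensation that makes the paper's content figure track actual rather than apparent content.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]   # grayscale rows, values 0..255 (assumed format)
Mask = Sequence[Sequence[bool]]   # True where a pixel is visible board content (assumed)


def count_chalk_pixels(frame: Frame, board_mask: Mask, threshold: int = 200) -> int:
    """Count bright pixels inside the visible board region.

    The threshold of 200 is a hypothetical value, not taken from the paper.
    """
    return sum(
        1
        for row, mask_row in zip(frame, board_mask)
        for value, on_board in zip(row, mask_row)
        if on_board and value >= threshold
    )


def content_curve(frames: Sequence[Frame], masks: Sequence[Mask],
                  threshold: int = 200) -> List[int]:
    """One chalk-pixel count per frame: a crude stand-in for the content figure."""
    return [count_chalk_pixels(f, m, threshold) for f, m in zip(frames, masks)]


def local_maxima(curve: Sequence[int]) -> List[int]:
    """Return indices of peaks in the curve.

    A peak is the last index of a rising (or initial) run before the curve
    drops; a curve that is still rising at the end peaks at its last index.
    """
    peaks: List[int] = []
    rising = True  # treat the start as rising so an initial high value counts
    for i in range(len(curve) - 1):
        if curve[i + 1] > curve[i]:
            rising = True
        elif curve[i + 1] < curve[i]:
            if rising:
                peaks.append(i)
            rising = False
    if rising and curve:
        peaks.append(len(curve) - 1)
    return peaks


# Tiny synthetic example: three 2x3 frames, with one pixel occluded in frame 1.
full_mask = [[True, True, True], [True, True, True]]
occluded_mask = [[True, True, True], [True, False, True]]
frames = [
    [[0, 255, 0], [0, 0, 0]],
    [[255, 255, 0], [0, 210, 0]],
    [[255, 0, 0], [0, 0, 0]],
]
masks = [full_mask, occluded_mask, full_mask]

curve = content_curve(frames, masks)
key_frame_candidates = local_maxima(curve)
```

In this sketch a candidate key frame is the last frame before the visible chalk count falls, for example just before part of the board is erased. The abstract indicates that the actual method also uses camera motion detection and instructor tracking when selecting key frames, which this example omits.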