Localizing and Extracting Caption in News Video Using Multi-Frame Average

Authors:
Jinlin Guo;Songyang Lao;Haitao Liu;Jiang Bu
Affiliations:
School of Information Systems and Management, National University of Defence Technology (NUDT), Changsha, Hunan Province, China (all authors); e-mail: gjlin99@yahoo.com.cn
Venue:
Proceedings of the 2008 conference on New Trends in Multimedia and Network Information Systems
Year:
2008

Abstract

News video is an important video source, and the captions in a news video directly convey the semantics of its content. This paper proposes an approach for localizing and extracting captions in news video. The approach applies a new Multi-Frame Average (MFA) method to reduce the complexity of the image background. A time-based average pixel value search is employed, and Canny edge detection is performed to obtain the edge map. A horizontal scan and a vertical scan of this edge map then yield the top, bottom, left, and right boundaries of the candidate caption rectangles, and a set of rules is applied to confirm which candidates are captions. Experimental results show that the proposed approach reduces background complexity in most cases and achieves high precision and recall. Finally, we analyze in detail the relationship between the background variation of a frame sequence and detection performance.
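The pipeline outlined in the abstract (temporal averaging, edge map, horizontal and vertical scans) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the average is a plain pixel-wise mean over the frames, a simple neighbor-difference threshold stands in for Canny edge detection so that the example needs no external library, and every function name and threshold (`multi_frame_average`, `edge_map`, `runs`, `candidate_boxes`, `thresh`, `min_row_edges`, `min_col_edges`) is hypothetical. The time-based average pixel value search and the confirmation rules are not specified in the abstract and are omitted.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale image stored as rows of 0..255 values


def multi_frame_average(frames: List[Frame]) -> List[List[float]]:
    """Pixel-wise mean over time: static captions persist, moving background fades."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(w)] for y in range(h)]


def edge_map(img, thresh: float = 50.0) -> List[List[int]]:
    """Stand-in for Canny (assumption): mark pixels whose difference to the
    left or upper neighbor exceeds thresh."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(img[y][x] - img[y][x - 1]) if x > 0 else 0
            dy = abs(img[y][x] - img[y - 1][x]) if y > 0 else 0
            if max(dx, dy) > thresh:
                edges[y][x] = 1
    return edges


def runs(counts: List[int], min_count: int) -> List[Tuple[int, int]]:
    """Inclusive (start, end) index ranges where counts >= min_count."""
    out, start = [], None
    for i, c in enumerate(counts):
        if c >= min_count and start is None:
            start = i
        elif c < min_count and start is not None:
            out.append((start, i - 1))
            start = None
    if start is not None:
        out.append((start, len(counts) - 1))
    return out


def candidate_boxes(edges, min_row_edges: int = 2, min_col_edges: int = 1):
    """Horizontal scan gives top/bottom; vertical scan inside each band gives left/right."""
    boxes = []
    width = len(edges[0])
    row_counts = [sum(row) for row in edges]
    for top, bottom in runs(row_counts, min_row_edges):
        col_counts = [sum(edges[y][x] for y in range(top, bottom + 1)) for x in range(width)]
        cols = runs(col_counts, min_col_edges)
        if cols:
            boxes.append((top, bottom, cols[0][0], cols[-1][1]))
    return boxes


def make_demo_frames(n: int = 4, h: int = 8, w: int = 12) -> List[Frame]:
    """Synthetic clip: a bright bar moves one column per frame, the caption stays put."""
    frames = []
    for t in range(n):
        f = [[0] * w for _ in range(h)]
        for y in range(0, 4):
            f[y][t] = 200              # moving background clutter
        for y in range(5, 7):
            for x in range(2, 10):
                f[y][x] = 255          # static caption block
        frames.append(f)
    return frames


frames = make_demo_frames()
avg = multi_frame_average(frames)
edges = edge_map(avg)
boxes = candidate_boxes(edges)
print(boxes)  # [(5, 7, 2, 10)] as (top, bottom, left, right)
```

In this synthetic clip the moving bar averages down to a value too low to cross the edge threshold, while the static caption keeps full contrast, so only the caption band survives the scans. The reported box is one pixel larger than the caption at the bottom and right because the difference operator also marks the first pixel past each boundary.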