A spatial-temporal approach for video caption detection and recognition

Authors:
Xiaoou Tang;Xinbo Gao;Jianzhuang Liu;Hongjiang Zhang
Affiliations:
Dept. of Inf. Eng., Chinese Univ. of Hong Kong;-;-;-
Venue:
IEEE Transactions on Neural Networks
Year:
2002

Abstract

We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme, we locate both the spatial and temporal positions of video captions with high precision and efficiency. Then, employing several new character segmentation and binarization techniques, we improve Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions. As this is the first attempt at Chinese video-caption recognition, the experimental results are very encouraging.