A Heuristic Approach to Caption Enhancement for Effective Video OCR

Authors:
Lei Xie; Xi Tan
Affiliations:
School of Computer Science, Northwestern Polytechnical University, Xi'an, China and Human-Computer Communications Laboratory, The Chinese University of Hong Kong, Hong Kong SAR; School of Computer Science, Northwestern Polytechnical University, Xi'an, China and Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems, Shenzhen, China
Venue:
ICIC '08 Proceedings of the 4th international conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Theoretical and Methodological Issues
Year:
2008

Abstract

We present a heuristic approach to enhancing speech-synchronized captions for video OCR, as a pre-processing step for subsequent multimedia indexing, segmentation and retrieval tasks. To improve efficiency, we use a bi-search-based caption transition detection method, which relies on a simple heuristic: the same caption content usually stays on screen for a period of time so that it can be viewed stably. We propose a combination of a color mask, a changing mask and a region mask to perform caption enhancement based on the discriminative characteristics of captions and backgrounds. A finer enhancement of individual characters is then used to remove small background residues. OCR experiments show that our caption enhancement approach achieves a character accuracy of 89.24%.
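The abstract does not spell out the bi-search procedure, so the following is only a minimal sketch of one plausible reading: skip ahead by a fixed number of frames while the caption is unchanged, and run a binary search inside a skipped interval only when the caption differs. The function name `find_transitions`, the `same_caption` predicate and the `step` parameter are hypothetical and are not taken from the paper; the sketch also assumes that every caption lasts at least `step` frames, so each skipped interval contains at most one transition.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def find_transitions(
    frames: Sequence[Frame],
    same_caption: Callable[[Frame, Frame], bool],
    step: int,
) -> List[int]:
    """Return the indices of frames at which a new caption first appears.

    Hypothetical sketch: `same_caption` is an assumed predicate that says
    whether two frames show the same caption, and `step` is an assumed
    lower bound (in frames) on how long a caption stays on screen.
    """
    if step < 1:
        raise ValueError("step must be at least 1")

    transitions: List[int] = []
    n = len(frames)
    start = 0
    while start < n - 1:
        probe = min(start + step, n - 1)
        if same_caption(frames[start], frames[probe]):
            # Caption unchanged across the interval: skip it entirely.
            start = probe
            continue

        # Caption changed somewhere in (start, probe]: binary search.
        # Invariant: frames[lo] matches frames[start], frames[hi] does not.
        lo, hi = start, probe
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if same_caption(frames[start], frames[mid]):
                lo = mid
            else:
                hi = mid
        transitions.append(hi)
        start = hi
    return transitions


if __name__ == "__main__":
    # Toy input: each letter stands for the caption shown in one frame.
    toy_frames = list("AAAAABBBBBBBCCCC")
    print(find_transitions(toy_frames, lambda a, b: a == b, step=4))
```

In this sketch the number of frame comparisons grows with the number of skipped intervals plus a logarithmic cost per transition, rather than with the total number of frames, which is the kind of saving the stable-viewing heuristic is meant to allow. In a real system the `same_caption` predicate would compare caption regions of two video frames; how the paper does that comparison is not stated in the abstract.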