Fast Text Caption Localization on Video Using Visual Rhythm

Authors:
Seong Soo Chun;Hyeokman Kim;Jung-Rim Kim;Sangwook Oh;Sanghoon Sull
Venue:
VISUAL '02 Proceedings of the 5th International Conference on Recent Advances in Visual Information Systems
Year:
2002

Abstract

In this paper, we propose a fast DCT-based algorithm that locates text captions embedded in specific areas of a video sequence by means of a visual rhythm, which can be constructed quickly by sampling certain portions of each image in a DC image sequence and accumulating the samples over time. Our approach is based on the observation that text captions carrying important information suitable for indexing often appear in specific areas of video frames; the sampling strategies for the visual rhythm are derived from those areas. Our method then combines contrast and temporal coherence information on the visual rhythm to detect text frames, such that each detected text frame represents a run of consecutive frames containing an identical text string. This significantly reduces the number of frames that must be examined for text localization in a video sequence. Finally, the method uses several important properties of text captions to locate the caption within the detected frames.
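
As a rough illustration of the idea, the sketch below builds a visual rhythm by sampling one horizontal line from each DC image and stacking the samples over time, then groups consecutive frames whose sampled line barely changes. It is a minimal sketch under stated assumptions, not the authors' implementation: the single-row sampling strategy, the names `build_visual_rhythm` and `coherent_runs`, and the `tolerance` and `min_length` parameters are all hypothetical, and the contrast test mentioned in the abstract is omitted.

```python
from typing import List, Sequence, Tuple

# One DC image: rows of luminance values (assumed already extracted from the stream).
Frame = Sequence[Sequence[float]]


def build_visual_rhythm(dc_images: Sequence[Frame], row: int) -> List[List[float]]:
    """Sample one horizontal line per DC image and stack the samples over time.

    rhythm[x][t] is pixel x of the sampled row in frame t, so each column of
    the rhythm corresponds to one frame. Sampling a single row is an
    assumption made for this sketch.
    """
    if not dc_images:
        return []
    width = len(dc_images[0][row])
    return [[float(frame[row][x]) for frame in dc_images] for x in range(width)]


def coherent_runs(
    rhythm: Sequence[Sequence[float]], min_length: int, tolerance: float
) -> List[Tuple[int, int]]:
    """Group consecutive frames whose sampled line stays nearly unchanged.

    Returns (start, end) frame-index pairs with end exclusive. A run is kept
    only if it spans at least min_length frames. This stands in for the
    temporal-coherence cue only; it does not check contrast.
    """
    if not rhythm:
        return []
    num_frames = len(rhythm[0])

    def column_diff(t: int) -> float:
        # Mean absolute change of the sampled line between frames t-1 and t.
        return sum(abs(line[t] - line[t - 1]) for line in rhythm) / len(rhythm)

    runs: List[Tuple[int, int]] = []
    start = 0
    for t in range(1, num_frames + 1):
        # Close the current run at the end of the sequence or at a large change.
        if t == num_frames or column_diff(t) > tolerance:
            if t - start >= min_length:
                runs.append((start, t))
            start = t
    return runs
```

Each run returned by `coherent_runs` could then be represented by a single frame for later caption localization, which mirrors the abstract's point that one detected text frame can stand for many consecutive frames showing the same text.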