In this paper, we propose a fast DCT-based algorithm that efficiently locates text captions embedded in specific areas of a video sequence by means of visual rhythm, which can be constructed quickly by sampling certain portions of a DC image sequence and accumulating the samples over time. Our approach is based on the observation that text captions carrying important information suitable for indexing often appear in specific areas of video frames, and the sampling strategies for the visual rhythm are derived from those areas. The method then combines contrast and temporal coherence information in the visual rhythm to detect text frames, such that each detected text frame represents a run of consecutive frames containing identical text strings. This significantly reduces the number of text frames that must be examined for text localization in a video sequence. Finally, the method uses several important properties of text captions to locate the captions in the detected frames.
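The construction of a visual rhythm and the use of contrast and temporal coherence can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the DC images are already decoded into plain 2D lists of luminance values, that sampling takes a single horizontal line (the abstract only says "certain portions"), and that the function names and the thresholds `tol`, `min_len` and `min_contrast` are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # one DC image: rows of luminance values


def build_visual_rhythm(dc_frames: Sequence[Frame], row: int) -> List[List[float]]:
    """Sample one horizontal line from every DC image and stack it over time.

    Returns rhythm[x][t], where x is the position along the sampled line
    and t is the frame index. Sampling a single row is an assumption.
    """
    if not dc_frames:
        return []
    width = len(dc_frames[0][row])
    return [[frame[row][x] for frame in dc_frames] for x in range(width)]


def text_runs(
    rhythm: Sequence[Sequence[float]],
    min_len: int = 3,
    tol: float = 1.0,
    min_contrast: float = 50.0,
) -> List[Tuple[int, int]]:
    """Find runs of frames whose sampled line is stable and high in contrast.

    A run ends when any sample changes by more than `tol` between two
    consecutive frames (temporal coherence). A run is kept only if it lasts
    at least `min_len` frames and the sampled line spans at least
    `min_contrast` luminance levels (contrast). All thresholds are
    illustrative assumptions.
    """
    if not rhythm:
        return []
    n_frames = len(rhythm[0])
    runs = []
    start = 0
    for t in range(1, n_frames + 1):
        # The end of the sequence always closes the current run.
        changed = t == n_frames or any(
            abs(track[t] - track[t - 1]) > tol for track in rhythm
        )
        if changed:
            line = [track[start] for track in rhythm]
            if t - start >= min_len and max(line) - min(line) >= min_contrast:
                runs.append((start, t - 1))  # inclusive frame range
            start = t
    return runs
```

Under these assumptions, each returned run can be represented by a single frame, so only one frame per run would need to be passed on to text localization.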