Robust news video text detection based on edges and line-deletion
WSEAS Transactions on Signal Processing
This paper presents a multiple-frame integration approach to detecting and localizing static caption text in news videos. Exploiting the temporal redundancy of video, the algorithm combines robust text features with a non-text line-deletion technique and yields precise, tight localization of detected text regions. The Canny edge detector is first applied to reference frames, and a logical AND across the resulting edge maps suppresses edges caused by background variation, including scrolling text. Next, rough text candidate regions are determined by counting black-white transitions (BWT). Finally, the text regions are refined with the non-text line-deletion technique. The proposed algorithm handles multiple languages and is robust to text polarity, alignment, and character size (from 10×10 to 30×30 pixels). In experiments on various multilingual video sequences, the algorithm achieves at least 96% in recall, precision, and quality of bounding preciseness.
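The core pipeline steps described above (temporal AND of edge maps, then BWT counting to find candidate text rows) can be sketched as follows. This is a minimal illustration with NumPy, not the authors' implementation: the edge maps are assumed to be pre-computed binary arrays (e.g. from a Canny detector), and the helper names and the `min_bwt` threshold are hypothetical choices for the example.

```python
import numpy as np

def temporal_and(edge_maps):
    """AND binary edge maps from consecutive reference frames.

    Edges of static captions persist across frames, while edges from
    moving background (including scrolling text) are suppressed.
    """
    acc = edge_maps[0].astype(bool)
    for m in edge_maps[1:]:
        acc &= m.astype(bool)
    return acc

def bwt_per_row(edges):
    """Count black-white transitions (BWT) in each row of a binary map."""
    e = edges.astype(np.int8)
    return np.abs(np.diff(e, axis=1)).sum(axis=1)

def candidate_rows(edges, min_bwt=4):
    """Rows whose BWT meets a threshold are rough text candidates."""
    return np.flatnonzero(bwt_per_row(edges) >= min_bwt)

# Toy example: a static "caption" edge pattern present in both frames,
# plus transient background edges that differ between frames.
caption = np.zeros((6, 16), dtype=bool)
caption[2, 2:14:2] = True             # alternating edge pixels -> high BWT
f1 = caption.copy(); f1[0, :5] = True # background edges, frame 1 only
f2 = caption.copy(); f2[5, 8:] = True # background edges, frame 2 only
stable = temporal_and([f1, f2])
print(candidate_rows(stable))         # only the caption row survives
```

Counting transitions rather than raw edge pixels is what distinguishes text-like rows: text strokes produce many short alternations per line, whereas a large uniform edge blob yields few transitions despite many edge pixels.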