An efficient method for text detection in video based on stroke width similarity

Authors:
Viet Cuong Dinh;Seong Soo Chun;Seungwook Cha;Hanjin Ryu;Sanghoon Sull
Affiliations:
Department of Electronics and Computer Engineering, Korea University, Seoul, Korea (all authors)
Venue:
ACCV'07: Proceedings of the 8th Asian Conference on Computer Vision, Volume Part I
Year:
2007

Abstract

Text appearing in video provides semantic knowledge and significant information for video indexing and retrieval systems. This paper proposes an effective method for text detection in video based on the similarity in stroke width of text, where stroke width is defined as the distance between the two edges of a stroke. Based on the observation that text regions can be characterized by a dominant fixed stroke width, edge detection with local adaptive thresholds is first devised to retain text regions while suppressing background regions. Second, a morphological dilation operator whose structuring element size is adapted to the stroke width value is used to roughly localize text regions. Finally, to reduce false alarms and refine text locations, a new multi-frame refinement method is applied. Experimental results show that the proposed method is not only robust to different levels of background complexity, but also effective across different fonts (size, color) and languages of text.
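
The stroke-width idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary edge map is already available, approximates stroke width by the distance between consecutive edge pixels along each scanline, and uses a flat horizontal structuring element whose half-width equals the estimated stroke width. The function names, the `max_width` cutoff, and the toy edge map are all hypothetical, and the multi-frame refinement step is not covered because the abstract does not specify it.

```python
from collections import Counter

def edge_gaps(row):
    """Distances between consecutive edge pixels (1s) along one scanline."""
    cols = [x for x, v in enumerate(row) if v]
    return [b - a for a, b in zip(cols, cols[1:])]

def dominant_stroke_width(edge_map, max_width=20):
    """Most frequent edge-to-edge distance over all rows (None if no edge pairs).

    max_width is a hypothetical cutoff that ignores gaps too wide to be strokes.
    """
    hist = Counter(g for row in edge_map for g in edge_gaps(row) if g <= max_width)
    if not hist:
        return None
    # Break ties toward the smaller width so the result is deterministic.
    return min(hist, key=lambda w: (-hist[w], w))

def dilate_rows(edge_map, width):
    """Horizontal binary dilation with a flat 1 x (2*width + 1) structuring element."""
    out = []
    for row in edge_map:
        n = len(row)
        out.append([1 if any(row[max(0, x - width):min(n, x + width + 1)]) else 0
                    for x in range(n)])
    return out

# Toy edge map: two rows with stroke edges, one empty background row.
text_row = [0] * 24
for col in (2, 5, 9, 12, 16, 19):
    text_row[col] = 1
edge_map = [text_row[:], text_row[:], [0] * 24]

width = dominant_stroke_width(edge_map)   # estimated dominant stroke width
mask = dilate_rows(edge_map, width)       # rough candidate text region
```

In this toy case the edge pixels in the text rows merge into one connected run after dilation, while the empty row stays empty, which is the rough-localization effect the abstract describes. A real pipeline would also measure stroke width along other directions and dilate in two dimensions.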