A framework for improved video text detection and recognition

Authors:
Haojin Yang;Bernhard Quehl;Harald Sack
Affiliations:
Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Potsdam, Germany 14467;Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Potsdam, Germany 14467;Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Potsdam, Germany 14467
Venue:
Multimedia Tools and Applications
Year:
2014

Abstract

Text displayed in a video is an essential part of the high-level semantic information of the video content. Video text can therefore serve as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. The detected candidate text lines are then refined using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate false alarms. For text recognition, we have developed a novel skeleton-based binarization method that separates text from complex backgrounds so that it can be processed by standard Optical Character Recognition (OCR) software. The operability and accuracy of the proposed text detection and binarization methods have been evaluated using publicly available test data sets.
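
The abstract does not specify how the image entropy-based filter is computed, so the following is only a minimal sketch of one plausible form of that refinement step, not the authors' implementation. It assumes an 8-bit grayscale frame, candidate boxes given as (x, y, width, height), and a hypothetical threshold `min_entropy`; the function names `image_entropy` and `filter_candidates` are likewise illustrative. The idea is that a region containing text tends to mix stroke and background intensities, so its intensity histogram has higher Shannon entropy than a flat background region.

```python
import numpy as np


def image_entropy(gray: np.ndarray) -> float:
    """Shannon entropy (in bits) of the intensity histogram of an 8-bit image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def filter_candidates(gray, boxes, min_entropy=0.5):
    """Keep candidate boxes whose region entropy reaches min_entropy.

    min_entropy is an assumed, illustrative threshold; the abstract does
    not state the value or the exact criterion used in the paper.
    """
    kept = []
    for (x, y, w, h) in boxes:
        region = gray[y:y + h, x:x + w]
        if region.size and image_entropy(region) >= min_entropy:
            kept.append((x, y, w, h))
    return kept


if __name__ == "__main__":
    # Synthetic frame: flat background with one striped, text-like patch.
    frame = np.zeros((20, 40), dtype=np.uint8)
    frame[5:15, 10:30:2] = 255
    candidates = [(10, 5, 20, 10), (0, 0, 5, 5)]
    print(filter_candidates(frame, candidates))
```

In the workflow described above, a filter of this kind would sit between the edge-based detector and the SWT- and SVM-based verification, cheaply discarding low-information candidates before the more expensive checks run.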