A robust video text detection approach using SVM

  • Authors:
  • Yi Cheng Wei;Chang Hong Lin

  • Affiliations:
  • Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC;Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Abstract

This article proposes a new method for detecting text in video images. Variations in background complexity, font size, and color make detecting text regions in video images difficult. A pyramidal scheme is used to address these problems. First, two downsized images are generated from the original image by bilinear interpolation. Then, the gradient difference of each pixel is calculated for the three differently sized images, including the original one. Next, K-means clustering is applied separately to each of the three gradient difference images to separate its pixels into two clusters: text and non-text. The three clustering results are then combined to form the text regions. Thereafter, projection profile analysis is applied to the Sobel edge map of each text region to determine the boundaries of candidate text regions. Finally, text candidates are identified through two verification phases. In the first phase, the geometrical properties and texture of each text candidate are verified. In the second phase, statistical characteristics of the text candidate are computed using a discrete wavelet transform, and principal component analysis is then used to reduce the dimensionality of these features. The optimal decision function of a support vector machine, obtained by sequential minimal optimization, is then applied to determine whether each text candidate contains text.
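
The first stage of the pipeline (pyramid, gradient difference, two-cluster K-means, combination) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the downsizing factors (1/2 and 1/4 here), the definition of gradient difference (taken here as the maximum minus the minimum horizontal gradient in a small window), and the combination rule (a logical OR after mapping each mask back to full resolution). All function names are hypothetical.

```python
def resize_bilinear(img, new_h, new_w):
    """Resize a grayscale image (list of rows) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(new_h):
        # Map the output pixel centre back into source coordinates.
        y = min(max((i + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(new_w):
            x = min(max((j + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bot * fy)
        out.append(row)
    return out


def gradient_difference(img, half_window=2):
    """ASSUMED definition: max minus min horizontal gradient in a local window."""
    w = len(img[0])
    out = []
    for r in img:
        # Central-difference horizontal gradient, zero at the borders (needs w >= 2).
        g = [0.0] + [r[j + 1] - r[j - 1] for j in range(1, w - 1)] + [0.0]
        gd_row = []
        for j in range(w):
            win = g[max(0, j - half_window): j + half_window + 1]
            gd_row.append(max(win) - min(win))
        out.append(gd_row)
    return out


def two_means_mask(values, iters=20):
    """1-D K-means with k=2; returns True where a pixel joins the high cluster."""
    flat = [v for row in values for v in row]
    lo, hi = min(flat), max(flat)  # initial centroids
    for _ in range(iters):
        thr = (lo + hi) / 2  # nearest-centroid boundary in one dimension
        low = [v for v in flat if v <= thr]
        high = [v for v in flat if v > thr]
        if not high:  # constant image: nothing to separate
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    thr = (lo + hi) / 2
    return [[v > thr for v in row] for row in values]


def detect_text_mask(img, scales=(1, 2, 4)):
    """Cluster each pyramid level, then OR the masks at full resolution (assumed rule)."""
    h, w = len(img), len(img[0])
    combined = [[False] * w for _ in range(h)]
    for s in scales:
        sh, sw = max(1, h // s), max(1, w // s)
        level = img if s == 1 else resize_bilinear(img, sh, sw)
        mask = two_means_mask(gradient_difference(level))
        for i in range(h):
            for j in range(w):
                # Nearest-neighbour mapping from full resolution to this level.
                if mask[min(i // s, sh - 1)][min(j // s, sw - 1)]:
                    combined[i][j] = True
    return combined
```

Calling `detect_text_mask` on a grayscale image stored as a list of rows returns a Boolean mask of the same size. A practical implementation would more likely rely on an array library, and the later stages described above (projection profiles on the Sobel edge map, wavelet features, PCA, and the SVM classifier) are outside the scope of this sketch.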