A Novel Image Text Extraction Method Based on K-Means Clustering

  • Authors:
  • Yan Song; Anan Liu; Lin Pang; Shouxun Lin; Yongdong Zhang; Sheng Tang

  • Venue:
  • ICIS '08 Proceedings of the Seventh IEEE/ACIS International Conference on Computer and Information Science (icis 2008)
  • Year:
  • 2008

Abstract

Text in web pages, images, and videos provides important clues for information indexing and retrieval. Most existing text extraction methods depend on the language type and text appearance. In this paper, a novel and universal method of image text extraction is proposed. Text is located with a coarse-to-fine scheme: first, a multi-scale approach locates text of different font sizes; second, projection profiles are used in a location refinement step. Color-based k-means clustering is then adopted for text segmentation. Compared with the grayscale images used in most existing methods, color images are more suitable for clustering-based segmentation. The clustering treats corner points, edge points, and other points equally, which solves the problem of handling multilingual text. Experimental results demonstrate that the best performance is obtained when k is 3. Comparative experiments on a large number of images show that our method is accurate and robust under various conditions.
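
To illustrate the color-based k-means segmentation step mentioned in the abstract, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name kmeans_color_segment, the random initialisation from distinct pixel colors, the squared Euclidean distance in RGB space, and the iteration limit are all assumptions made for illustration. Only the choice k = 3 comes from the abstract.

```python
import numpy as np


def kmeans_color_segment(image, k=3, iters=20, seed=0):
    """Cluster the pixels of an H x W x C color image into k color groups.

    Hypothetical helper for illustration; returns (labels, centers), where
    labels is an H x W array of cluster indices and centers is k x C.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)

    # Assumption: initialise the centers from k distinct pixel colors.
    unique_colors = np.unique(pixels, axis=0)
    if len(unique_colors) < k:
        raise ValueError("image has fewer than k distinct colors")
    rng = np.random.default_rng(seed)
    centers = unique_colors[rng.choice(len(unique_colors), size=k, replace=False)]

    def assign(current_centers):
        # Squared Euclidean distance from every pixel to every center.
        diffs = pixels[:, None, :] - current_centers[None, :, :]
        return (diffs ** 2).sum(axis=2).argmin(axis=1)

    for _ in range(iters):
        labels = assign(centers)
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Final assignment so the labels match the returned centers.
    labels = assign(centers)
    return labels.reshape(h, w), centers
```

Called on a cropped text region, for example labels, centers = kmeans_color_segment(region, k=3), the function returns one label per pixel. Deciding which of the k clusters corresponds to the text strokes is a separate step that the abstract does not describe, so it is left out of this sketch.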