Text graphic separation in Indian newspapers

  • Authors:
  • Ritu Garg;Anukriti Bansal;Santanu Chaudhury;Sumantra Dutta Roy

  • Affiliations:
  • IIT Delhi, India;IIT Delhi, India;IIT Delhi, India;IIT Delhi, India

  • Venue:
  • Proceedings of the 4th International Workshop on Multilingual OCR
  • Year:
  • 2013

Abstract

Digitization of newspaper articles is important for recording historical events. Layout analysis of Indian newspapers is a challenging task due to the presence of different font sizes and font styles and the random placement of text and non-text regions. In this paper we propose a novel framework for learning optimal parameters for text-graphic separation in the presence of complex layouts. The learning problem is formulated as an optimization problem and solved with the EM (expectation-maximization) algorithm, so that the learned parameters depend on the nature of the document content.
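
The abstract does not say which parameters are learned or what model the EM algorithm is applied to. As a rough illustration of the general idea only, the sketch below fits a two-component one-dimensional Gaussian mixture with EM. The feature it assumes (for example, the heights of connected components, with one component for text-like and one for graphic-like regions) and the function name `em_two_gaussians` are hypothetical and are not taken from the paper.

```python
import math


def em_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Illustrative sketch only: `values` is assumed to be a list of scalar
    features (e.g. connected-component heights); this is not the paper's model.
    Returns (weights, means, variances), each a list of length 2.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # start the components at the extremes
    spread = max((hi - lo) / 4.0, 1e-3)
    variances = [spread ** 2, spread ** 2]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [
                weights[k]
                * math.exp(-((x - means[k]) ** 2) / (2 * variances[k]))
                / math.sqrt(2 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300   # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue                  # leave an empty component unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a degenerate variance

    return weights, means, variances
```

Given a list of measurements drawn from two well-separated populations, the returned means should settle near the two population centres, and a threshold between them could then serve as a data-dependent separation parameter. How the paper actually ties its parameters to document content is not described in the abstract.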