Margin noise removal from printed document images

  • Authors: Soumyadeep Dey; Jayanta Mukhopadhyay; Shamik Sural; Partha Bhowmick
  • Affiliation: Indian Institute of Technology, Kharagpur (all authors)
  • Venue: Proceeding of the workshop on Document Analysis and Recognition
  • Year: 2012

Abstract

In this paper, we propose a technique for removing margin noise, both textual and non-textual, from scanned document images. We first perform layout analysis to detect words, lines, and paragraphs in the document image. The detected elements are then classified into text and non-text components on the basis of characteristics such as size and position. Finally, the geometric properties of the text blocks are used to detect and remove the margin noise. We evaluate our algorithm on several scanned pages of Bengali literature books.
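
The abstract does not give the actual rules or thresholds, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes layout analysis has already produced bounding boxes; the `Block` type, the `classify` rule, the `remove_margin_noise` function, and every threshold are hypothetical choices made for illustration. Blocks are labelled text or non-text from their size, the wide text blocks define a content region, and any block whose centre falls outside that region is treated as margin noise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Axis-aligned bounding box of a layout element, in pixels (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0


def classify(block: Block, page_h: int) -> str:
    """Toy size-based rule (an assumption, not taken from the paper)."""
    # Blocks spanning almost the full page height, or extremely thin vertical
    # strips, are treated as non-text (for example, a dark scan border).
    if block.height > 0.9 * page_h or block.height > 10 * block.width:
        return "non-text"
    return "text"


def remove_margin_noise(blocks, page_h, min_width_ratio=0.5):
    """Split blocks into (kept, removed) using the geometry of the text blocks."""
    text = [b for b in blocks if classify(b, page_h) == "text"]
    if not text:
        return list(blocks), []

    # Assumption: body text blocks are at least half as wide as the widest one.
    widest = max(b.width for b in text)
    seeds = [b for b in text if b.width >= min_width_ratio * widest]

    # Content region: bounding box of the seed text blocks.
    rx0 = min(b.x0 for b in seeds)
    ry0 = min(b.y0 for b in seeds)
    rx1 = max(b.x1 for b in seeds)
    ry1 = max(b.y1 for b in seeds)

    kept, removed = [], []
    for b in blocks:
        cx = (b.x0 + b.x1) / 2
        cy = (b.y0 + b.y1) / 2
        inside = rx0 <= cx <= rx1 and ry0 <= cy <= ry1
        (kept if inside else removed).append(b)
    return kept, removed


# Made-up example page of 1000 x 1400 pixels.
page_blocks = [
    Block(150, 100, 850, 400),    # paragraph
    Block(150, 450, 850, 900),    # paragraph
    Block(150, 950, 600, 1000),   # short closing paragraph
    Block(0, 120, 90, 880),       # textual noise: sliver of the facing page
    Block(960, 0, 1000, 1400),    # non-textual noise: dark scan border
]
kept, removed = remove_margin_noise(page_blocks, page_h=1400)
print(len(kept), "blocks kept;", len(removed), "blocks removed as margin noise")
```

In this made-up example the three paragraphs fall inside the content region, while the sliver of the facing page and the scan border fall outside it. A real system would need rules tuned to the scanned material, and centre-in-region is only one of several plausible geometric tests.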