Printed Arabic text database (PATDB) for research and benchmarking

  • Authors:
  • Amin G. Al-Hashim; Sabri A. Mahmoud

  • Affiliations:
  • Department of Information and Computer Science, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia; Department of Information and Computer Science, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia

  • Venue:
  • ACE'10 Proceedings of the 9th WSEAS international conference on Applications of computer engineering
  • Year:
  • 2010

Abstract

This paper presents a comprehensive database of printed Arabic text for Arabic text recognition research. It consists of scanned images of different forms of printed Arabic text (viz. book chapters, advertisements, magazines, newspapers, and reports), scanned at resolutions of 200, 300, and 600 dpi. A total of 6954 pages were scanned. The database may be used by the printed Arabic text recognition research community as a benchmark: researchers can evaluate their algorithms on it and compare their results with the published work of other researchers who use the same database. To the best of our knowledge, no comprehensive printed Arabic text database is publicly and freely available, and this database is intended to address that deficiency. It will be made freely available to interested researchers.