One obstacle that has delayed the development of character recognition systems for Arabic, compared with those for Latin-script languages and Chinese, is the absence of supporting resources such as language corpora and electronic dictionaries. This paper aims to develop a diverse corpus of scanned page images with corresponding ground-truth text and description files. The corpus, a TypewRitten Arabic Corpus with Multi-fonts, is referred to as TRACOM. TRACOM may also serve as a benchmark for assessing the performance of Arabic text recognition systems. It includes data from the following sources: computer-generated documents, newspapers, magazines, and books. Each document image is paired with its equivalent text, i.e., the ground truth.