A complete system for preprocessing and word spotting of very old historical document images is presented. Salient information is extracted from the document images using a word spotting technique that is language independent and requires neither line nor word segmentation. A multi-class library of the connected components of the document text is built from six features. Spotting is performed using a Euclidean distance measure enhanced by rotation and dynamic time warping transforms. The method is applied to a dataset from the Juma Al Majid Center (Dubai) with promising results, which are obtained using an automatic preprocessing stage. In this stage, content-level classifiers are used to extract accurate stroke pixels robustly. The preprocessed document images are also more legible to the end user and less costly to archive and transfer.
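As an illustration of the matching step only, the following is a minimal sketch of dynamic time warping with a Euclidean local cost between two sequences of feature vectors. It is not the authors' implementation: the function names `dtw_distance` and `rank_library`, the representation of a connected component as a list of per-column feature vectors, and the omission of the rotation transform are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors.

    `a` and `b` are non-empty lists of equal-dimension numeric vectors
    (an assumed representation, not taken from the paper).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def rank_library(query, library):
    """Return library keys sorted from best to worst match (hypothetical helper)."""
    return sorted(library, key=lambda key: dtw_distance(query, library[key]))
```

A short usage example, with made-up one-dimensional features:

```python
library = {"cc_a": [[0.0], [1.0], [1.0], [2.0]], "cc_b": [[5.0], [5.0]]}
print(rank_library([[0.0], [1.0], [2.0]], library))
```

Because the warping path may repeat elements, sequences of different lengths can still align at zero cost, which is the property that makes this measure tolerant of variation in stroke width and writing speed.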