This work proposes a new approach to the recognition of historical texts: an adaptive mechanism that automatically tunes itself to a specific book. The system clusters all similar word images in a book or text and processes each resulting class as a whole. The paper describes the architecture of such a system and the new algorithms developed for robust word image comparison, including registration, optical-flow-based distortion compensation, and adaptive binarization. Results on a large dataset are also presented, demonstrating a recognition improvement of over 23%.
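The idea of grouping similar word images so that a whole class can be handled at once can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the word images are already binarized (lists of rows holding 0 or 1), replaces the registration step with a brute-force search over small translations, and omits optical-flow distortion compensation and adaptive binarization entirely. The function names, the Jaccard overlap score, the greedy "leader" clustering, and the 0.8 threshold are all illustrative assumptions.

```python
def ink_pixels(image):
    """Return the set of (row, col) coordinates that hold ink (value 1)."""
    return {(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v}


def registered_similarity(a, b, max_shift=1):
    """Best Jaccard overlap of ink pixels over small translations of b.

    Assumption: translation-only alignment stands in for real registration.
    """
    ink_a, ink_b = ink_pixels(a), ink_pixels(b)
    if not ink_a or not ink_b:
        return 0.0
    best = 0.0
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            moved = {(r + dy, c + dx) for r, c in ink_b}
            score = len(ink_a & moved) / len(ink_a | moved)
            best = max(best, score)
    return best


def cluster_words(images, threshold=0.8):
    """Greedy leader clustering; returns clusters as lists of image indices."""
    clusters = []  # the first index in each cluster acts as its leader
    for i, img in enumerate(images):
        for cluster in clusters:
            if registered_similarity(images[cluster[0]], img) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing leader was similar enough: start a new class.
            clusters.append([i])
    return clusters


# Toy data: b is a shifted one column to the right, c is unrelated.
a = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
b = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
c = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
clusters = cluster_words([a, b, c])
```

On this toy input the shifted copy joins the first image's class and the unrelated image forms its own, so a label assigned once to a class could be propagated to every member. A real system would need a far more robust comparison than pixel overlap under translation.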