In this paper, we propose a novel segmentation-free approach to keyword search in historical typewritten documents that combines image preprocessing, synthetic data creation, word spotting, and user feedback. Our aim is to search for keywords typed by the user in a large collection of digitized historical typewritten documents. The proposed method is based on: (i) image preprocessing for binarization and enhancement, noisy border and frame removal, and orientation and skew correction; (ii) creation of synthetic word images from the keywords typed by the user; (iii) word segmentation using dynamic parameters; (iv) efficient feature extraction for each word image; and (v) a retrieval procedure that is optimized by user feedback. Experimental results demonstrate the efficiency of the proposed approach.
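To make steps (iv) and (v) more concrete, the following is a minimal Python sketch of how word images could be compared and how user feedback could refine a query. It is an illustration under stated assumptions, not the method of the paper: the abstract does not say which features or which feedback rule are used, so the zone-density features, the Euclidean distance, the Rocchio-style query update, and all function names and parameters (`zone_features`, `rank`, `refine_query`, `rows`, `cols`, `weight`) are hypothetical choices.

```python
from math import sqrt


def zone_features(img, rows=2, cols=4):
    """Ink density per cell of a rows x cols grid over a binary image.

    `img` is a list of rows of 0/1 values, where 1 marks an ink pixel.
    Assumption: zone densities stand in for the unspecified features.
    """
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            area = max((y1 - y0) * (x1 - x0), 1)  # guard against empty zones
            ink = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            feats.append(ink / area)
    return feats


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query, candidates):
    """Return candidate indices ordered from most to least similar."""
    return sorted(range(len(candidates)),
                  key=lambda i: distance(query, candidates[i]))


def refine_query(query, relevant, weight=0.5):
    """Move the query toward the mean of the user-confirmed matches.

    Assumption: a simple Rocchio-style update with positive feedback only.
    """
    if not relevant:
        return list(query)
    mean = [sum(col) / len(relevant) for col in zip(*relevant)]
    return [(1 - weight) * q + weight * m for q, m in zip(query, mean)]
```

In such a setup, the query vector would be computed with `zone_features` from the synthetic word image of the typed keyword, the candidates from the segmented document words, and `rank` would be called again after each `refine_query` round using the feature vectors of the hits the user marked as correct.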