Keyword-guided word spotting in historical printed documents using synthetic data and user feedback
International Journal on Document Analysis and Recognition
In this paper, we propose an alternative method for accessing the content of Greek historical documents printed during the 17th and 18th centuries: words are searched directly in the digitized documents by word spotting, without an optical character recognition engine. In our methodology, synthetic word images are created from keywords and compared to all the words in the digitized documents, while user feedback refines the search. To make access and search more efficient, we use natural language processing techniques comprising (i) a morphological generator for early Modern Greek, which lets users search documents with only a word stem and locate all the corresponding inflected word forms, and (ii) a synonym dictionary, which facilitates access to the semantic context of documents and enriches the search results.
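The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the lookup tables, the use of plain feature vectors in place of word images, the Euclidean distance, and the averaging step used for feedback are all assumptions made for illustration only.

```python
from math import sqrt


def expand_query(stem, inflections, synonyms):
    """Return the inflected forms of a stem and of its synonyms.

    `inflections` and `synonyms` are hypothetical lookup tables that stand in
    for the morphological generator and the synonym dictionary.
    """
    terms = []
    for entry in [stem] + synonyms.get(stem, []):
        # Fall back to the entry itself when no inflected forms are known.
        for form in inflections.get(entry, [entry]):
            if form not in terms:
                terms.append(form)
    return terms


def distance(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query_vec, word_vecs):
    """Order document word ids from most to least similar to the query."""
    return sorted(word_vecs, key=lambda w: distance(query_vec, word_vecs[w]))


def refine(query_vec, word_vecs, relevant_ids):
    """Average the query with user-confirmed matches (assumed feedback rule)."""
    vecs = [query_vec] + [word_vecs[w] for w in relevant_ids]
    return [sum(column) / len(vecs) for column in zip(*vecs)]


# Toy data: in a real system the vectors would be features extracted from
# a synthetic keyword image and from segmented document word images.
inflections = {"λόγος": ["λόγος", "λόγου", "λόγον"]}
synonyms = {"λόγος": ["ομιλία"]}
word_vecs = {"w1": [0.0, 0.0], "w2": [1.5, 1.5], "w3": [5.0, 5.0]}

terms = expand_query("λόγος", inflections, synonyms)
first_pass = rank([1.0, 1.0], word_vecs)
refined_query = refine([1.0, 1.0], word_vecs, relevant_ids=first_pass[:1])
second_pass = rank(refined_query, word_vecs)
```

In this sketch each expanded term would be rendered as a synthetic image and turned into a query vector. After the user marks correct hits in the first pass, the refined query moves toward real document words, which is one simple way to let feedback compensate for differences between synthetic and printed glyphs.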