An abundance of historical manuscripts, journals, and scientific notes remains largely inaccessible in library archives. Manual transcription and publication of such documents are unlikely, and automatic transcription accurate enough to support a traditional text search is difficult. In this work we describe a lexicon-free system for performing text queries on off-line printed and handwritten Arabic documents. Our segmentation-based approach uses gHMMs with a bigram letter transition model, and KPCA/LDA for letter discrimination. The segmentation stage is integrated with inference. We show that our method is robust to varying letter forms, ligatures, and overlaps. Additionally, we find that ignoring letters beyond the adjoining neighbors has little effect on inference and localization, which leads to a significant performance increase over standard dynamic programming. Finally, we discuss an extension to perform batch searches of large word lists for indexing purposes.
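To make the idea of integrating segmentation with inference concrete, the following is a minimal sketch of textbook segmental (generalized HMM) Viterbi decoding with a bigram letter transition model. It is not the system described above: the function name `segmental_viterbi`, the `seg_score` callback, the `max_len` bound, and the toy data are all assumptions introduced for illustration. The toy emission score stands in for a real letter discriminator such as KPCA/LDA. The sketch also runs the full dynamic program and does not implement the adjoining-neighbor approximation mentioned in the abstract.

```python
import math


def segmental_viterbi(n_frames, letters, seg_score, log_bigram, log_start, max_len):
    """Jointly choose segment boundaries and letter labels.

    seg_score(start, end, letter) is a hypothetical callback returning the
    log-likelihood that frames [start, end) form `letter`.
    Returns (best_log_score, [(start, end, letter), ...]).
    """
    neg_inf = -math.inf
    # best[t][c]: best log score for frames [0, t) when the last segment is letter c
    best = [{c: neg_inf for c in letters} for _ in range(n_frames + 1)]
    back = [{c: None for c in letters} for _ in range(n_frames + 1)]

    for t in range(1, n_frames + 1):
        for s in range(max(0, t - max_len), t):  # candidate segment [s, t)
            for c in letters:
                emit = seg_score(s, t, c)
                if s == 0:
                    cand, prev = log_start[c] + emit, None
                else:
                    # best previous letter under the bigram transition model
                    prev = max(letters, key=lambda q: best[s][q] + log_bigram[q][c])
                    cand = best[s][prev] + log_bigram[prev][c] + emit
                if cand > best[t][c]:
                    best[t][c] = cand
                    back[t][c] = (s, prev)

    last = max(letters, key=lambda c: best[n_frames][c])
    score = best[n_frames][last]
    if score == neg_inf:
        return score, []  # no valid segmentation found

    # Follow back-pointers to recover boundaries and labels.
    segments = []
    t, c = n_frames, last
    while c is not None:
        s, prev = back[t][c]
        segments.append((s, t, c))
        t, c = s, prev
    segments.reverse()
    return score, segments


# Toy data (assumed): each "frame" is a symbol that votes for one letter.
frames = list("aabbb")
letters = ["A", "B"]


def toy_seg_score(start, end, letter):
    # A frame matching the letter scores log 0.9, a mismatch log 0.1.
    return sum(
        math.log(0.9) if f == letter.lower() else math.log(0.1)
        for f in frames[start:end]
    )


uniform = math.log(0.5)
log_start = {c: uniform for c in letters}
log_bigram = {p: {c: uniform for c in letters} for p in letters}

score, segments = segmental_viterbi(
    len(frames), letters, toy_seg_score, log_bigram, log_start, max_len=3
)
print(segments)
```

In this sketch the boundaries are never fixed in advance: every candidate span up to `max_len` frames is scored against every letter, so the segmentation falls out of the same maximization that picks the labels. The cost grows with the number of frames, the maximum segment length, and the square of the alphabet size, which is the kind of standard dynamic programming baseline the abstract compares against.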