Matching word images has many applications in document recognition and retrieval systems. Dynamic Time Warping (DTW) is widely used to estimate the similarity between word images: each word image is represented as a sequence of feature vectors, and the cost of the dynamic-programming alignment is taken as the dissimilarity between them. However, such approaches are computationally costly compared to fixed-length matching schemes. In this paper, we explore systematic methods for identifying appropriate distance metrics for a given database or language. This is achieved by learning query-specific distance functions that can be computed efficiently online. We show that a weighted Euclidean distance can outperform DTW for matching word images. This class of distance functions also scales well and is suited to large-scale matching. Our results are validated with mean Average Precision (mAP) on a fully annotated data set of 160K word images. We then show that the learnt distance functions can even be extended to a new database to obtain accurate retrieval.
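To make the contrast between the two matching schemes concrete, here is a minimal Python sketch, not the authors' implementation. It assumes each word image has already been converted either to a sequence of feature vectors (for DTW) or to a single fixed-length vector (for the weighted Euclidean distance). The function names `dtw_cost` and `weighted_euclidean` and the weight vector `w` are illustrative assumptions; how the weights are learnt for a query is not specified here and is left out.

```python
import numpy as np


def dtw_cost(a, b):
    """Alignment cost between two feature-vector sequences.

    a has shape (n, d) and b has shape (m, d). The returned value is the
    accumulated cost of the best monotonic alignment, used as a dissimilarity.
    Runs in O(n * m) time, which is what makes DTW expensive at scale.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Local cost: Euclidean distance between every pair of frames.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return float(acc[n, m])


def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance between two fixed-length vectors.

    w holds one non-negative weight per dimension (hypothetical here; in
    practice it would come from whatever learning procedure is used).
    Runs in O(d) time, with no alignment step.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    diff = x - y
    return float(np.sqrt(np.sum(w * diff * diff)))
```

With all weights set to one, `weighted_euclidean` reduces to the ordinary Euclidean distance; setting a weight to zero makes the distance ignore that dimension. Because the fixed-length distance needs no alignment, it can be evaluated against a large collection with a single vectorized pass, which is the scalability argument made above.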