Separability versus prototypicality in handwritten word-image retrieval

Authors:
Jean-Paul Van Oosten;Lambert Schomaker
Venue:
Pattern Recognition
Year:
2014

Abstract

Hit lists are at the core of retrieval systems. The top ranks are especially important when user feedback is used to train the system. Analysis of hit lists revealed counter-intuitive instances in the top ranks for good classifiers. In this study, we propose that two functions need to be optimised: (a) separability, which is required to reduce a massive set of instances to a likely subset among ten thousand or more classes, and (b) prototypicality of instances, which keeps the results intuitive after ranking. By optimising these requirements sequentially, the number of distracting images is first strongly reduced, after which nearest-centroid based instance ranking retains an intuitive (low edit-distance) ranking. We show that in handwritten word-image retrieval, precision improvements of up to 35 percentage points can be achieved, yielding up to 100% top-hit precision and 99% top-7 precision in data sets with 84,000 instances, while maintaining high recall. The method is conveniently implemented in Monk, a massive-scale, continuously trainable retrieval engine.
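
The sequential scheme described in the abstract can be illustrated with a minimal sketch. It is not the Monk implementation: the fixed linear decision function stands in for whatever discriminative classifier provides separability, and the function names, weights and two-dimensional feature vectors are hypothetical values chosen for illustration only.

```python
import math

def centroid(vectors):
    """Mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def linear_score(weights, bias, x):
    """Signed decision value of a linear classifier (assumed stand-in)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def two_stage_rank(candidates, weights, bias, class_centroid):
    """Filter by separability, then rank survivors by prototypicality.

    candidates maps an instance id to its feature vector.
    Returns the accepted ids, closest to the class centroid first.
    """
    # Stage 1: keep only instances on the positive side of the boundary.
    accepted = {k: x for k, x in candidates.items()
                if linear_score(weights, bias, x) > 0}
    # Stage 2: sort the survivors by Euclidean distance to the centroid.
    return sorted(accepted,
                  key=lambda k: math.dist(accepted[k], class_centroid))

# Hypothetical labelled examples of one word class.
positives = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]
proto = centroid(positives)

# Hypothetical classifier parameters and unlabelled candidates.
weights, bias = [1.0, 1.0], -1.0
candidates = {
    "a": [1.1, 0.9],   # accepted, very close to the centroid
    "b": [5.0, 5.0],   # accepted with a large margin, but atypical
    "c": [0.2, 0.1],   # rejected by the classifier
    "d": [1.5, 1.5],   # accepted, moderately close
}

# Ranking by classifier score alone puts the atypical instance first.
by_score = sorted(
    (k for k, x in candidates.items() if linear_score(weights, bias, x) > 0),
    key=lambda k: -linear_score(weights, bias, candidates[k]))

ranking = two_stage_rank(candidates, weights, bias, proto)
print(by_score)  # ['b', 'd', 'a']
print(ranking)   # ['a', 'd', 'b']
```

In this toy example, instance "b" lies far on the positive side of the decision boundary and would head a hit list sorted by classifier score, although it is the least typical member of the class. Ranking the accepted instances by distance to the centroid moves it to the bottom, which is the kind of reordering the two-step approach aims for.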