Optimum algorithm to minimize human interactions in sequential Computer Assisted Pattern Recognition

Authors:
Jose Oncina
Affiliations:
Dep. Lenguajes y Sistemas Informáticos, Universidad de Alicante, Spain
Venue:
Pattern Recognition Letters
Year:
2009

Abstract

Given a Pattern Recognition task, Computer Assisted Pattern Recognition can be viewed as a series of solution proposals made by a computer system, each followed by corrections made by a user, until an acceptable solution is found. For this kind of system, the appropriate measure of performance is the expected number of corrections the user has to make. In the present work we study the special case in which the solution proposals have a sequential nature. Examples of this type of task are language translation, speech transcription and handwritten text transcription. In all these cases the output (the solution proposal) is a sequence of symbols. In this framework it is assumed that the user always corrects the first error found in the proposed solution. As a consequence, the prefix of the proposed solution before the last error correction can be assumed error free in the next iteration. At present, all the techniques in the literature rely on proposing, at each step, the most probable suffix given that a prefix of the "correct" output is already known. Usually, computing the conditional most probable output is an NP-hard or undecidable problem (so approximations have to be applied) or, in some simple cases, requires complex dynamic programming techniques (usually some variant of the Viterbi algorithm). In the present work we show that this strategy is not optimum when the goal is to minimize the number of human interactions. Moreover, we describe the optimum strategy, which is simpler (and usually faster) to compute.
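
The claim that proposing the most probable suffix does not minimize the expected number of corrections can be checked on a small example. The sketch below is illustrative only: the three-string distribution `DIST`, the fixed output length, and all function names are assumptions made up for this example. The abstract does not spell out the optimum strategy, so the greedy next-symbol rule used for comparison is merely one simple alternative, not a statement of the paper's algorithm.

```python
from fractions import Fraction

# Hypothetical toy distribution over correct outputs (assumed: all of length 2).
DIST = {"ab": Fraction(40, 100), "ba": Fraction(35, 100), "bb": Fraction(25, 100)}

def consistent(prefix):
    """Outputs that still agree with the validated prefix."""
    return {s: p for s, p in DIST.items() if s.startswith(prefix)}

def most_probable_suffix(prefix):
    """Propose the single most probable complete output given the prefix."""
    cands = consistent(prefix)
    return max(sorted(cands), key=cands.get)

def greedy_next_symbol(prefix):
    """Extend the prefix symbol by symbol, each time choosing the symbol
    with the largest total probability mass (an illustrative alternative)."""
    out = prefix
    while True:
        mass = {}
        for s, p in consistent(out).items():
            if len(s) > len(out):
                sym = s[len(out)]
                mass[sym] = mass.get(sym, 0) + p
        if not mass:  # no consistent output is longer than the hypothesis
            return out
        out += max(sorted(mass), key=mass.get)

def corrections(truth, propose):
    """Simulate the user: fix the first wrong symbol until the proposal is right."""
    prefix, n = "", 0
    while True:
        hyp = propose(prefix)
        if hyp == truth:
            return n
        first_err = next(k for k in range(len(truth)) if hyp[k] != truth[k])
        prefix = truth[: first_err + 1]  # validated prefix plus corrected symbol
        n += 1

def expected_corrections(propose):
    """Expected number of user corrections under DIST."""
    return sum(p * corrections(s, propose) for s, p in DIST.items())

if __name__ == "__main__":
    print("most probable suffix:", expected_corrections(most_probable_suffix))
    print("greedy next symbol:  ", expected_corrections(greedy_next_symbol))
```

On this toy distribution the most-probable-suffix rule first proposes "ab" and needs 17/20 corrections on average, while the greedy rule first proposes "ba" and needs 13/20. The example only shows that the suffix rule can be beaten; it says nothing about which strategy is optimal in general.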