Character confidence based on N-best list for keyword spotting in online Chinese handwritten documents

Authors:
Heng Zhang;Da-Han Wang;Cheng-Lin Liu
Venue:
Pattern Recognition
Year:
2014


Abstract

In keyword spotting from handwritten documents by text query, word similarity is usually computed by combining character similarities, which should approximate the logarithm of the character probabilities. In this paper, we propose to directly estimate the posterior probability (also called confidence) of candidate characters from the N-best paths of the candidate segmentation-recognition lattice. After the candidate segmentation-recognition paths are evaluated by combining multiple contexts, the scores of the N-best paths are transformed to posterior probabilities using soft-max. The soft-max parameter (the confidence parameter) is estimated from the character confusion network, which is constructed by aligning the different paths with a string matching algorithm. The posterior probability of a candidate character is the sum of the probabilities of the paths that pass through it. We compare the proposed posterior probability estimation method with reference methods, including the word confidence measure and the text line recognition method. Experimental results of keyword spotting on CASIA-OLHWDB, a large database of unconstrained online Chinese handwriting, demonstrate the effectiveness of the proposed method.
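
The two core steps described in the abstract, soft-max over N-best path scores followed by summation over the paths that pass through a candidate character, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the representation of a candidate character as a (start, end, label) tuple, and the fixed value of the confidence parameter alpha are all assumptions made for illustration. The abstract says the parameter is estimated from the character confusion network, and that step is not shown here.

```python
import math
from collections import defaultdict


def path_posteriors(scores, alpha=1.0):
    """Soft-max over N-best path scores.

    alpha plays the role of the confidence parameter; here it is simply
    passed in (assumption), not estimated from a confusion network.
    """
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(alpha * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def character_posteriors(paths, scores, alpha=1.0):
    """Sum path posteriors over the paths containing each candidate character.

    paths  : list of paths, each a list of (start, end, label) tuples
             (hypothetical encoding of a candidate character)
    scores : one score per path, higher meaning better
    """
    probs = path_posteriors(scores, alpha)
    confidence = defaultdict(float)
    for path, p in zip(paths, probs):
        for candidate in set(path):  # count each candidate once per path
            confidence[candidate] += p
    return dict(confidence)


# Toy 3-best list over four strokes (made-up segments, labels and scores).
paths = [
    [(0, 2, "A"), (2, 4, "B")],
    [(0, 2, "A"), (2, 4, "C")],
    [(0, 1, "D"), (1, 4, "E")],
]
scores = [-1.0, -2.0, -3.0]

for candidate, conf in sorted(character_posteriors(paths, scores).items()):
    print(candidate, round(conf, 3))
```

In this toy list the candidate (0, 2, "A") appears in two of the three paths, so its confidence is the sum of those two path probabilities, whereas every other candidate receives the probability of a single path.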