DVHMM: Variable Length Text Recognition Error Model

Authors:
Atsuhiro Takasu;Kenro Aihara
Affiliations:
-;-
Venue:
ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02), Volume 3
Year:
2002

Citing 0
Cited 7

Statistical Analysis of Bibliographic Strings for Constructing an Integrated Document Space
ECDL '02 Proceedings of the 6th European Conference on Research and Advanced Technology for Digital Libraries

Bibliographic attribute extraction from erroneous references based on a statistical model
Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries

Quality enhancement in information extraction from scanned documents
Proceedings of the 2006 ACM symposium on Document engineering

An approximate multi-word matching algorithm for robust document retrieval
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management

Automatic metadata extraction from museum specimen labels
DCMI '08 Proceedings of the 2008 International Conference on Dublin Core and Metadata Applications

A statistical model for flexible string similarity
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

An effective access mechanism to digital interview archives
ECDL'05 Proceedings of the 9th European conference on Research and Advanced Technology for Digital Libraries

Abstract

This paper proposes a text recognition error model called the dual variable length output hidden Markov model (DVHMM) and gives a parameter estimation algorithm for it based on the EM algorithm. Whereas existing probabilistic error models are limited to substitution (1,1), insertion (1,0), and deletion (0,1) errors, the DVHMM can handle error patterns with any pair of lengths (i, j), with substitution, insertion, and deletion as special cases.
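
To make the idea of variable-length error pairs concrete, the sketch below scores a recognized string against an original string by summing over every way of splitting the two into paired segments of bounded length. It is a simplified, single-state illustration and not the DVHMM itself: it has no hidden states and no EM training. The function `forward_score`, the table `pair_prob`, its probability values, and the bound `max_len` are all hypothetical, and the ordering of lengths within each pair is a choice made for this sketch that may not match the paper's convention.

```python
def forward_score(original, recognized, pair_prob, max_len=2):
    """Sum, over all segmentations of the two strings into paired
    segments, of the product of the segment-pair probabilities.

    pair_prob maps (original_segment, recognized_segment) to a
    probability; pairs absent from the table count as 0.
    Hypothetical helper for illustration only.
    """
    n, m = len(original), len(recognized)
    # f[x][y]: total score of explaining original[:x] by recognized[:y]
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 1.0
    for x in range(n + 1):
        for y in range(m + 1):
            if x == 0 and y == 0:
                continue
            total = 0.0
            # Last pair consumes i original and j recognized characters.
            for i in range(min(max_len, x) + 1):
                for j in range(min(max_len, y) + 1):
                    if i == 0 and j == 0:
                        continue  # an empty pair would never terminate
                    src = original[x - i:x]
                    out = recognized[y - j:y]
                    total += f[x - i][y - j] * pair_prob.get((src, out), 0.0)
            f[x][y] = total
    return f[n][m]


# Made-up probabilities: "m" is usually read correctly, but is
# sometimes recognized as the two characters "rn".
pair_prob = {
    ("m", "m"): 0.9,
    ("m", "rn"): 0.1,
    ("o", "o"): 1.0,
    ("d", "d"): 1.0,
    ("e", "e"): 1.0,
    ("l", "l"): 1.0,
}

score_correct = forward_score("model", "model", pair_prob)
score_error = forward_score("model", "rnodel", pair_prob)
```

A model restricted to single-character substitution, insertion, and deletion would have to explain "m" read as "rn" as two separate edits, whereas a table indexed by segment pairs can assign that confusion a single probability.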