Modeling contextual variations of phones is widely accepted as an important aspect of a continuous speech recognition system, and much research has been devoted to finding robust models of context for HMM systems. In particular, decision tree clustering has been used to tie output distributions across pre-defined states, and successive state splitting (SSS) has been used to define parsimonious HMM topologies. We describe a new HMM design algorithm, called maximum likelihood successive state splitting (ML-SSS), that combines advantages of both these approaches. Specifically, an HMM topology is designed using a greedy search for the best temporal and contextual splits using a constrained EM algorithm. In Japanese phone recognition experiments, ML-SSS shows recognition performance gains and training cost reduction over SSS under several training conditions.
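The greedy split selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces HMM output distributions with 1-D Gaussians fit to per-state data, and the helper names (`gaussian_loglik`, `split_gain`, `best_split`) and the candidate-split representation are hypothetical. The core idea it shows is the ML-SSS selection rule: among all candidate splits of all states, pick the one with the largest likelihood gain.

```python
import math

def gaussian_loglik(xs):
    """Log-likelihood of 1-D samples under their own ML Gaussian fit.
    For an ML-fit Gaussian this is -n/2 * (log(2*pi*var) + 1)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    var = max(var, 1e-6)  # variance floor to avoid degenerate fits
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def split_gain(xs, left_idx):
    """Likelihood gain from splitting a state's data into two new states.
    `left_idx` is the set of sample indices assigned to the first child;
    in ML-SSS the candidate partitions would come from contextual or
    temporal questions rather than being given directly."""
    left = [x for i, x in enumerate(xs) if i in left_idx]
    right = [x for i, x in enumerate(xs) if i not in left_idx]
    if not left or not right:
        return float("-inf")  # reject splits that empty a child state
    return gaussian_loglik(left) + gaussian_loglik(right) - gaussian_loglik(xs)

def best_split(states, candidate_splits):
    """Greedily pick the (state, split) pair with the largest gain.
    `states` maps a state name to its data; `candidate_splits` maps a
    state name to a list of (split_name, left_index_set) candidates."""
    best, best_gain = None, float("-inf")
    for state, xs in states.items():
        for name, left_idx in candidate_splits.get(state, []):
            gain = split_gain(xs, left_idx)
            if gain > best_gain:
                best_gain, best = gain, (state, name)
    return best, best_gain
```

For example, a state whose data falls into two well-separated clusters strongly prefers the split that separates the clusters over one that mixes them, since the separating split lets each child fit a low-variance Gaussian. A full ML-SSS system would re-estimate the split states with a constrained EM pass rather than a closed-form Gaussian fit; that step is omitted here.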