Tree-Based Covariance Modeling of Hidden Markov Models

  • Authors:
  • Ye Tian, Jian-Lai Zhou, Hui Lin, Hui Jiang

  • Affiliations:
  • Div. of Speech & Natural Language, Microsoft Corp., Redmond, WA

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2006

Abstract

In this paper, we present a tree-based, full covariance hidden Markov modeling technique for automatic speech recognition applications. A multilayer tree is first built to organize all covariance matrices into a hierarchical structure: Kullback-Leibler divergence is used to measure inter-Gaussian distortion, and successive splitting is used to construct the multilayer covariance tree. To cope with the data-sparseness problem in estimating a full covariance matrix, we interpolate the diagonal covariance matrix of a leaf node at the bottom of the tree with the full covariance matrices of its parent and ancestors along the path up to the root node. The interpolation coefficients are estimated in the maximum-likelihood sense via the EM algorithm. The interpolation is performed in three different parametric forms: 1) the inverse covariance matrix, 2) the covariance matrix, and 3) the off-diagonal terms of the full covariance matrix. The proposed algorithm is tested on three databases: 1) DARPA Resource Management (RM), 2) Switchboard, and 3) a Chinese dictation task. On all three databases, the proposed tree-based full covariance modeling consistently performs better than the baseline diagonal covariance modeling. The algorithm also outperforms other covariance modeling techniques, including: 1) semi-tied covariance (STC) modeling, 2) heteroscedastic linear discriminant analysis (HLDA), 3) mixtures of inverse covariances (MIC), and 4) direct full covariance modeling.
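
The sketch below illustrates, under simplifying assumptions, the two ingredients named in the abstract: a Kullback-Leibler distortion measure between Gaussians (used when splitting nodes of the covariance tree) and the interpolation of a leaf's diagonal covariance with the full covariances of its ancestors, shown here in the covariance-matrix parametric form. It is not the paper's exact formulation; the function names and the fixed interpolation weights are illustrative assumptions (in the paper the weights are estimated by EM).

```python
# Minimal sketch (assumptions noted above): KL-based inter-Gaussian distortion
# and ancestor-path covariance interpolation in covariance-matrix form.
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL(N0 || N1) between two multivariate Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def symmetric_kl(mu0, cov0, mu1, cov1):
    """Symmetrized KL, a common choice of inter-Gaussian distortion."""
    return gaussian_kl(mu0, cov0, mu1, cov1) + gaussian_kl(mu1, cov1, mu0, cov0)

def interpolate_covariance(leaf_diag_cov, ancestor_full_covs, weights):
    """Smooth a leaf's diagonal covariance with the full covariances of its
    parent and ancestors along the path to the root; `weights` sum to 1."""
    assert np.isclose(sum(weights), 1.0)
    smoothed = weights[0] * np.diag(np.diag(leaf_diag_cov))
    for w, cov in zip(weights[1:], ancestor_full_covs):
        smoothed += w * cov
    return smoothed

# Example: a 3-D leaf covariance smoothed with its parent and the root node
# (weights hand-picked here purely for illustration).
rng = np.random.default_rng(0)
leaf = np.diag(rng.uniform(0.5, 2.0, size=3))
parent = np.cov(rng.normal(size=(3, 200)))
root = np.cov(rng.normal(size=(3, 1000)))
full_cov = interpolate_covariance(leaf, [parent, root], weights=[0.6, 0.3, 0.1])
```

The same smoothing could instead be applied to the inverse covariance matrices or only to the off-diagonal terms, corresponding to the other two parametric forms listed in the abstract.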