Robust speech recognition using factorial HMMs for home environments

Authors:
Agnieszka Betkowska;Koichi Shinoda;Sadaoki Furui
Affiliations:
Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan;Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan;Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan
Venue:
EURASIP Journal on Applied Signal Processing
Year:
2007


Abstract

We address speech recognition in the presence of sudden, nonstationary noise, which is common in home environments. As a model compensation method for this problem, we investigated a factorial hidden Markov model (FHMM) architecture built from a clean-speech hidden Markov model (HMM) and a sudden-noise HMM. Whereas conventional studies define this architecture only for the static features of the observation vector, we extended it to dynamic features. In addition, we adapted the FHMMs to the characteristics of a given house (home-environment adaptation). The proposed method was evaluated on a database recorded in home environments by a personal robot called PaPeRo. Isolated word recognition experiments demonstrated its effectiveness under noisy conditions: home-dependent word FHMMs (HD-FHMMs) reduced the word error rate by 20.5% compared with the clean-speech word HMMs.
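
To make the idea of combining a clean-speech HMM with a sudden-noise HMM more concrete, the following is a minimal sketch of a forward-algorithm likelihood computed over the product state space of two independent chains. It is not the authors' implementation. The abstract does not say how the two models' outputs are combined, so the sketch assumes a simple log-max rule (the observed log-spectral value is modeled by a Gaussian whose mean is the larger of the speech and noise state means). The function name `fhmm_log_likelihood`, the scalar observations, the shared variance, and all parameter values are hypothetical and chosen only for illustration. The extension to dynamic features and the home-environment adaptation are not covered.

```python
import math
from itertools import product


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log0(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def fhmm_log_likelihood(obs, speech, noise, var=1.0):
    """Forward algorithm over all (speech state, noise state) pairs.

    Assumption: the emission for a state pair is a Gaussian centred at
    max(speech mean, noise mean), i.e. a simple log-max combination.
    """
    states = list(product(range(len(speech["means"])),
                          range(len(noise["means"]))))

    def emit(x, s, n):
        return log_gauss(x, max(speech["means"][s], noise["means"][n]), var)

    # Initialisation: the two chains start independently.
    alpha = {(s, n): log0(speech["init"][s]) + log0(noise["init"][n])
             + emit(obs[0], s, n) for s, n in states}

    # Recursion: each chain follows its own transition matrix.
    for x in obs[1:]:
        alpha = {
            (s, n): emit(x, s, n) + logsumexp([
                alpha[(ps, pn)]
                + log0(speech["trans"][ps][s])
                + log0(noise["trans"][pn][n])
                for ps, pn in states])
            for s, n in states
        }
    return logsumexp(list(alpha.values()))


# Hypothetical toy models: a 2-state left-to-right "word" and a
# 2-state noise chain (state 0 = quiet, state 1 = sudden burst).
speech_hmm = {"means": [2.0, 5.0], "init": [1.0, 0.0],
              "trans": [[0.7, 0.3], [0.0, 1.0]]}
noise_hmm = {"means": [0.0, 8.0], "init": [0.9, 0.1],
             "trans": [[0.9, 0.1], [0.5, 0.5]]}
# A single always-silent noise state reduces the FHMM to the clean HMM.
no_noise = {"means": [-math.inf], "init": [1.0], "trans": [[1.0]]}

observations = [2.1, 8.2, 5.0]  # the middle frame is hit by a burst
ll_fhmm = fhmm_log_likelihood(observations, speech_hmm, noise_hmm)
ll_clean = fhmm_log_likelihood(observations, speech_hmm, no_noise)
print("FHMM log-likelihood: ", ll_fhmm)
print("clean log-likelihood:", ll_clean)
```

In this toy setting the factorial model can explain the middle frame with the noise chain's burst state, so it scores the corrupted word higher than the clean-speech model alone does. The cost is that the number of state pairs grows as the product of the two chains' state counts.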