A posterior union model with applications to robust speech and speaker recognition

Authors:
Ji Ming; Jie Lin; F. Jack Smith
Affiliations:
School of Computer Science, Queen's University Belfast, Belfast, UK; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China; School of Computer Science, Queen's University Belfast, Belfast, UK
Venue:
EURASIP Journal on Applied Signal Processing
Year:
2006

Abstract

This paper investigates speech and speaker recognition involving partial feature corruption, assuming unknown, time-varying noise characteristics. The probabilistic union model is extended from a conditional-probability formulation to a posterior-probability formulation as an improved solution to the problem. The new formulation allows the order of the model to be optimized for every single frame, thereby enhancing the model's capability for dealing with nonstationary noise corruption. The new formulation also allows the model to be readily incorporated into a Gaussian mixture model (GMM) for speaker recognition. Experiments have been conducted on two databases, TIDIGITS and SPIDRE, for speech recognition and speaker identification, respectively. Both databases are subject to unknown, time-varying band-selective corruption. The results demonstrate improved robustness for the new model.