Maximum likelihood discriminant feature spaces

Authors:
G. Saon;M. Padmanabhan;R. Gopinath;S. Chen
Affiliations:
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Venue:
ICASSP '00 Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
Year:
2000

Abstract

Linear discriminant analysis (LDA) is known to be inappropriate for the case of classes with unequal sample covariances. There has been an interest in generalizing LDA to heteroscedastic discriminant analysis (HDA) by removing the equal within-class covariance constraint. This paper presents a new approach to HDA by defining an objective function which maximizes the class discrimination in the projected subspace while ignoring the rejected dimensions. Moreover, we investigate the link between discrimination and the likelihood of the projected samples and show that HDA can be viewed as a constrained ML projection for a full covariance Gaussian model, the constraint being given by the maximization of the projected between-class scatter volume. It is shown that, under diagonal covariance Gaussian modeling constraints, applying a diagonalizing linear transformation (MLLT) to the HDA space results in increased classification accuracy even though HDA alone actually degrades the recognition performance. Experiments performed on the Switchboard and Voicemail databases show a 10%-13% relative improvement in the word error rate over standard cepstral processing.
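
The contrast between LDA and HDA described in the abstract can be illustrated with a small sketch. The abstract does not state the objective function, so the form used here is an assumption: the log objective sum_j N_j (log|theta B theta^T| - log|theta W_j theta^T|), where B is the between-class scatter, W_j the covariance of class j, and N_j its sample count. To stay dependency-free the sketch projects 2-D features onto a single direction, so every determinant reduces to a scalar quadratic form. The function names and the toy statistics are hypothetical and are not taken from the paper.

```python
import math

def quad_form(theta, m):
    """Return theta * m * theta^T for a 2-vector theta and a 2x2 matrix m."""
    t0, t1 = theta
    return (t0 * (m[0][0] * t0 + m[0][1] * t1)
            + t1 * (m[1][0] * t0 + m[1][1] * t1))

def hda_log_objective(theta, between, class_covs, counts):
    """Assumed HDA form: sum_j N_j * (log(theta B theta^T) - log(theta W_j theta^T))."""
    b = quad_form(theta, between)
    return sum(n * (math.log(b) - math.log(quad_form(theta, w)))
               for w, n in zip(class_covs, counts))

def lda_log_objective(theta, between, class_covs, counts):
    """Same form, but every class shares the pooled within-class covariance."""
    total = sum(counts)
    pooled = [[sum(n * w[r][c] for w, n in zip(class_covs, counts)) / total
               for c in range(2)] for r in range(2)]
    return total * (math.log(quad_form(theta, between))
                    - math.log(quad_form(theta, pooled)))

def best_direction(objective, *args, steps=180):
    """Coarse grid search over unit vectors (cos a, sin a) with a in [0, pi)."""
    angles = [math.pi * k / steps for k in range(steps)]
    return max(((math.cos(a), math.sin(a)) for a in angles),
               key=lambda th: objective(th, *args))

# Made-up toy statistics: two classes with clearly unequal covariances.
# B is chosen positive definite so the logarithms are always defined.
B = [[1.0, 0.5], [0.5, 1.0]]
W = [[[1.0, 0.0], [0.0, 0.1]], [[0.1, 0.0], [0.0, 1.0]]]
N = [100, 300]

theta_hda = best_direction(hda_log_objective, B, W, N)
theta_lda = best_direction(lda_log_objective, B, W, N)
```

Under this assumed form, the two objectives coincide when all class covariances are equal, and they can select different directions once the covariances differ, which is the situation the abstract says LDA handles poorly. The grid search is only a stand-in for a proper numerical optimizer, and the MLLT diagonalizing step mentioned in the abstract is not covered by this sketch.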