Broad phonetic classification using discriminative Bayesian networks

Authors:
Franz Pernkopf; Tuan Van Pham; Jeff A. Bilmes
Affiliations:
Signal Processing and Speech Communication Laboratory, Graz University of Technology, Inffeldgasse 12, A-8010 Graz, Austria; Signal Processing and Speech Communication Laboratory, Graz University of Technology, Inffeldgasse 12, A-8010 Graz, Austria and Faculty of Electronics and Telecommunications, Danang University of ...; Department of Electrical Engineering, University of Washington, Box 352500, Seattle, WA 98195-2500, USA
Venue:
Speech Communication
Year:
2009

Abstract

We present an approach to broad phonetic classification, defined as mapping acoustic speech frames into broad (or clustered) phonetic categories. Our categories consist of silence, general voiced, general unvoiced, mixed sounds, voiced closure, and plosive release, and are sufficiently rich to allow accurate time-scaling of speech signals to improve their intelligibility in, e.g., voice-mail applications. There are three main aspects to this work. First, in addition to commonly used speech features, we employ acoustic time-scale features based on the intra-scale relationships of the energy from different wavelet subbands. Second, we use and compare against discriminatively learned Bayesian networks, by which we mean Bayesian networks whose structure and/or parameters have been optimized using a discriminative objective function. We utilize a simple order-based greedy heuristic for learning discriminative structure based on mutual information. Given an ordering, we can find the discriminative classifier structure with O(N^q) score evaluations (where q is the maximum number of parents per node). Third, we provide a large assortment of empirical results, including gender-dependent and gender-independent experiments on the TIMIT corpus. We evaluate both discriminative and generative parameter learning on both discriminatively and generatively structured Bayesian networks and compare against generatively trained Gaussian mixture models (GMMs), and discriminatively trained neural networks (NNs) and support vector machines (SVMs).
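As a rough illustration of what an order-based greedy structure search can look like, the sketch below first orders discrete features by their empirical mutual information with the class and then lets each feature pick at most one parent among the features that precede it. This is not the paper's procedure: the function names, the toy data, the limit of one feature parent, and the use of conditional mutual information as the parent score are all assumptions made for illustration, since the abstract specifies neither the discriminative score nor the exact search.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


def conditional_mutual_information(xs, ys, cs):
    """Empirical I(X; Y | C), computed as sum over c of p(c) * I(X; Y | C = c)."""
    n = len(cs)
    total = 0.0
    for c, count in Counter(cs).items():
        idx = [i for i in range(n) if cs[i] == c]
        total += (count / n) * mutual_information(
            [xs[i] for i in idx], [ys[i] for i in idx]
        )
    return total


def order_based_structure(features, labels):
    """Hypothetical helper: features maps a name to a list of discrete values.

    Returns (order, parents), where parents[name] is one feature parent or None.
    The class variable is implicitly a parent of every feature.
    """
    # Step 1 (assumed ordering rule): most class-informative features first.
    order = sorted(
        features,
        key=lambda name: mutual_information(features[name], labels),
        reverse=True,
    )
    # Step 2: each feature may only choose a parent that precedes it in the order,
    # which keeps the resulting graph acyclic.
    parents = {}
    for pos, name in enumerate(order):
        best, best_score = None, 0.0
        for cand in order[:pos]:
            # Stand-in score (assumption): conditional mutual information given the class.
            score = conditional_mutual_information(features[name], features[cand], labels)
            if score > best_score:
                best, best_score = cand, score
        parents[name] = best
    return order, parents


# Toy data (invented): "c" is determined by "b" within each class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "c": [0, 1, 0, 1, 1, 0, 1, 0],
}
order, parents = order_based_structure(features, labels)
print(order, parents)
```

Fixing the ordering in advance is what keeps the search cheap: each feature only has to score candidate parents that come earlier in the order, so no cycle checks are needed.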
Results show that: (i) the combination of time-scale features and mel-frequency cepstral coefficients (MFCCs) provides the best performance; (ii) discriminative learning of Bayesian network classifiers is superior to the generative approaches; (iii) discriminative classifiers (NNs and SVMs) perform better than both discriminatively and generatively trained and structured Bayesian networks; and (iv) the advantages of generative yet discriminatively structured Bayesian network classifiers still hold in the case of missing features while the discriminatively trained NNs and SVMs are unable to deal with such a case. This last result is significant since it suggests that discriminative Bayesian networks are the most appropriate approach when missing features are common.
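One way to see why a generative classifier can still operate when features are missing is that it models the joint distribution, so unobserved features can be summed out before classifying. The sketch below shows this for a discrete naive Bayes classifier, the simplest Bayesian network classifier, where summing out a missing feature amounts to dropping its factor. It is a stand-in for illustration only: the function names, the Laplace smoothing, the class labels, and the toy data are assumptions, not the models or data used in the paper.

```python
from collections import Counter
from math import log


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Hypothetical helper: count-based naive Bayes with Laplace smoothing alpha."""
    n_features = len(rows[0])
    counts = Counter()  # (feature index, value, class) -> count
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            counts[(i, v, c)] += 1
    return {
        "class_counts": Counter(labels),
        "n_values": [len({row[i] for row in rows}) for i in range(n_features)],
        "counts": counts,
        "alpha": alpha,
        "n": len(labels),
    }


def log_scores(model, row):
    """Unnormalized log posterior per class; None marks a missing feature."""
    scores = {}
    for c, nc in model["class_counts"].items():
        s = log(nc / model["n"])
        for i, v in enumerate(row):
            if v is None:
                # Marginalization: sum over x_i of p(x_i | c) equals 1,
                # so a missing feature contributes no factor.
                continue
            num = model["counts"][(i, v, c)] + model["alpha"]
            den = nc + model["alpha"] * model["n_values"][i]
            s += log(num / den)
        scores[c] = s
    return scores


def classify(model, row):
    scores = log_scores(model, row)
    return max(scores, key=scores.get)


# Toy data (invented): two binary features, two broad classes.
rows = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]
labels = ["voiced", "voiced", "voiced", "unvoiced", "unvoiced", "unvoiced"]
model = fit_naive_bayes(rows, labels)

print(classify(model, (1, None)))  # second feature missing
print(classify(model, (None, 0)))  # first feature missing
```

A discriminatively trained NN or SVM maps a fixed-length input directly to a decision and has no comparable built-in way to sum out an absent input, which is the contrast the abstract draws.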