Improving articulatory feature and phoneme recognition using multitask learning

Authors:
Ramya Rasipuram;Mathew Magimai-Doss
Affiliations:
Idiap Research Institute, Martigny, Switzerland and Ecole Polytechnique Fédérale de Lausanne, Switzerland;Idiap Research Institute, Martigny, Switzerland
Venue:
ICANN'11 Proceedings of the 21st international conference on Artificial neural networks - Volume Part I
Year:
2011

Abstract

Speech sounds can be characterized by articulatory features. Articulatory features are typically estimated using a set of multilayer perceptrons (MLPs), i.e., a separate MLP is trained for each articulatory feature. In this paper, we investigate a multitask learning (MTL) approach for joint estimation of articulatory features, with and without phoneme classification as a subtask. Our studies show that an MTL MLP can estimate articulatory features compactly and efficiently by learning the inter-feature dependencies through a common hidden layer representation. Furthermore, adding phoneme classification as a subtask while estimating articulatory features improves both articulatory feature estimation and phoneme recognition. On the TIMIT phoneme recognition task, articulatory feature posterior probabilities obtained by the MTL MLP achieve a phoneme recognition accuracy of 73.2%, while the phoneme posterior probabilities achieve an accuracy of 74.0%.
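
The shared-hidden-layer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single tanh hidden layer, one softmax output head per task, and a joint loss that simply sums the per-task cross-entropies. The task names, class counts, input dimension and hidden size (`TASK_SIZES`, `n_inputs=13`, `n_hidden=8`) are hypothetical values chosen only for illustration, and the sketch covers the forward pass and loss only, not training.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def affine(weights, bias, inputs):
    """Compute weights @ inputs + bias for a row-major weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

class MultitaskMLP:
    """One shared hidden layer feeding a separate softmax head per task."""

    def __init__(self, n_inputs, n_hidden, task_sizes, seed=0):
        rng = random.Random(seed)
        self.hidden = init_layer(n_hidden, n_inputs, rng)
        self.heads = {task: init_layer(n_classes, n_hidden, rng)
                      for task, n_classes in task_sizes.items()}

    def forward(self, inputs):
        # Shared representation used by every task.
        hidden = [math.tanh(a) for a in affine(*self.hidden, inputs)]
        # Each head yields its own posterior distribution.
        return {task: softmax(affine(w, b, hidden))
                for task, (w, b) in self.heads.items()}

def joint_loss(posteriors, targets):
    """Sum of per-task cross-entropies for one frame (assumed equal weights)."""
    return -sum(math.log(posteriors[task][label])
                for task, label in targets.items())

# Hypothetical task inventory and sizes, chosen only for illustration.
TASK_SIZES = {"manner": 5, "place": 4, "phoneme": 39}
model = MultitaskMLP(n_inputs=13, n_hidden=8, task_sizes=TASK_SIZES)
frame = [0.1 * i for i in range(13)]  # stand-in acoustic feature vector
posteriors = model.forward(frame)
loss = joint_loss(posteriors, {"manner": 2, "place": 1, "phoneme": 7})
```

Because every head reads the same hidden activations, gradients from all tasks would update the shared layer during training, which is the mechanism by which inter-feature dependencies could be learned. Dropping the `"phoneme"` entry from `TASK_SIZES` gives the variant without the phoneme subtask.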