Speech recognition based on phonetic features and acoustic landmarks
Because articulatory information is not directly observable in typical speaker-listener situations, methods that estimate it from the acoustic signal, known as speech inversion, have been proposed. This work studies distal supervised learning (DSL) as a machine learning strategy for speech inversion. Eight tract variables serve as the articulatory representation of speech dynamics, and the background and theoretical foundation of distal supervised learning are analyzed. In addition, a global optimization approach is proposed, and the results obtained when the speech signal is parameterized as acoustic parameters (APs) are compared with those obtained using mel-frequency cepstral coefficients (MFCCs). The results show that distal supervised learning estimates the tract variables with good accuracy.
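To make the distal supervised learning idea concrete, the following is a minimal sketch on a toy problem. It is not the paper's implementation: the "vocal tract" is assumed linear, the dimensions (4 acoustic parameters, 2 tract variables) and all matrices are invented for illustration, and simple linear models replace whatever learners the paper uses. The structure, however, is the DSL scheme the abstract describes: first fit a forward model (articulatory to acoustic), then train the inverse model by passing its output through the learned forward model and minimizing the error in acoustic space (the distal error), since no direct articulatory targets are available to the inverse model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy articulatory-to-acoustic plant, assumed linear purely for
# illustration (real vocal-tract acoustics are nonlinear):
# acoustic y = W_true @ articulatory x; 4 acoustic dims, 2 tract variables.
W_true = np.array([[1.0, 0.2],
                   [0.1, 1.0],
                   [0.5, -0.3],
                   [-0.2, 0.4]])
X = rng.normal(size=(200, 2))      # "tract variable" training samples
Y = X @ W_true.T                   # paired acoustic observations

# Step 1: fit the forward model (articulatory -> acoustic) from the pairs.
W_fwd = np.linalg.lstsq(X, Y, rcond=None)[0].T      # shape (4, 2)

# Step 2: distal supervised learning. The inverse model (acoustic ->
# articulatory) never sees articulatory targets directly; its output is
# pushed through the learned forward model, the error is measured in
# acoustic space (the "distal" error), and gradients flow back through
# the frozen forward model into the inverse model.
W_inv = np.zeros((2, 4))
lr = 0.1
for _ in range(500):
    X_hat = Y @ W_inv.T            # inverse model's articulatory estimate
    Y_hat = X_hat @ W_fwd.T        # re-synthesized acoustics
    E = Y_hat - Y                  # distal error, acoustic space
    grad = (E @ W_fwd).T @ Y / len(Y)
    W_inv -= lr * grad

X_rec = Y @ W_inv.T                # recovered tract variables
print(np.max(np.abs(X_rec - X)))   # small: inversion succeeded on the toy
```

On this linear, noiseless toy the inverse model converges to a left inverse of the forward map, so the tract variables are recovered almost exactly; the point of the sketch is only the two-stage training structure, not the model class.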