Speech recognition based on phonetic features and acoustic landmarks
Because articulatory information is not directly observable in typical speaker-listener situations, methods that estimate it from the acoustic signal, known as speech inversion, have been proposed. This work studies distal supervised learning (DSL) as a machine learning strategy for speech inversion. Eight tract variables serve as the articulatory representation of speech dynamics, and the background and theoretical foundation of distal supervised learning are analyzed. In addition, a global optimization approach is proposed, and the results obtained when the speech signal is parameterized as acoustic parameters (APs) are compared with those obtained using mel-frequency cepstral coefficients (MFCCs). The results show that distal supervised learning estimates the tract variables with good accuracy.
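To make the distal supervised learning idea concrete, the following is a minimal sketch on a toy problem. It is not the paper's implementation: the "vocal tract" is assumed linear, the dimensions (4 acoustic parameters, 2 tract variables) and all matrices are invented for illustration, and simple linear models replace whatever learners the paper uses. The structure, however, is the DSL scheme the abstract describes: first fit a forward model (articulatory to acoustic), then train the inverse model by passing its output through the learned forward model and minimizing the error in acoustic space (the distal error), since no direct articulatory targets are available to the inverse model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy articulatory-to-acoustic plant, assumed linear purely for
# illustration (real vocal-tract acoustics are nonlinear):
# acoustic y = W_true @ articulatory x; 4 acoustic dims, 2 tract variables.
W_true = np.array([[1.0, 0.2],
                   [0.1, 1.0],
                   [0.5, -0.3],
                   [-0.2, 0.4]])
X = rng.normal(size=(200, 2))      # "tract variable" training samples
Y = X @ W_true.T                   # paired acoustic observations

# Step 1: fit the forward model (articulatory -> acoustic) from the pairs.
W_fwd = np.linalg.lstsq(X, Y, rcond=None)[0].T      # shape (4, 2)

# Step 2: distal supervised learning. The inverse model (acoustic ->
# articulatory) never sees articulatory targets directly; its output is
# pushed through the learned forward model, the error is measured in
# acoustic space (the "distal" error), and gradients flow back through
# the frozen forward model into the inverse model.
W_inv = np.zeros((2, 4))
lr = 0.1
for _ in range(500):
    X_hat = Y @ W_inv.T            # inverse model's articulatory estimate
    Y_hat = X_hat @ W_fwd.T        # re-synthesized acoustics
    E = Y_hat - Y                  # distal error, acoustic space
    grad = (E @ W_fwd).T @ Y / len(Y)
    W_inv -= lr * grad

X_rec = Y @ W_inv.T                # recovered tract variables
print(np.max(np.abs(X_rec - X)))   # small: inversion succeeded on the toy
```

On this linear, noiseless toy the inverse model converges to a left inverse of the forward map, so the tract variables are recovered almost exactly; the point of the sketch is only the two-stage training structure, not the model class.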