Silent-speech enhancement using body-conducted vocal-tract resonance signals

Authors:
Tatsuya Hirahara;Makoto Otani;Shota Shimizu;Tomoki Toda;Keigo Nakamura;Yoshitaka Nakajima;Kiyohiro Shikano
Affiliations:
Toyama Prefectural University, Department of Intelligent Systems Design Engineering, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan;Toyama Prefectural University, Department of Intelligent Systems Design Engineering, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan;Toyama Prefectural University, Department of Intelligent Systems Design Engineering, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan
Venue:
Speech Communication
Year:
2010

Abstract

The physical characteristics of weak body-conducted vocal-tract resonance signals, called non-audible murmur (NAM), and the acoustic characteristics of three sensors developed for detecting these signals have been investigated. NAM signals are attenuated by 50 dB at 1 kHz; this attenuation consists of a 30-dB full-range attenuation due to air-to-body transmission loss and a -10 dB/octave spectral decay due to sound propagation loss within the body. These characteristics agree with the spectral characteristics of measured NAM signals. The sensors have a sensitivity of between -41 and -58 dB [V/Pa] at 1 kHz, and the mean signal-to-noise ratio of the detected signals was 15 dB. On the basis of these investigations, three types of silent-speech enhancement systems were developed: (1) simple, direct amplification of weak vocal-tract resonance signals using a wired urethane-elastomer NAM microphone, (2) simple, direct amplification using a wireless urethane-elastomer-duplex NAM microphone, and (3) transformation of the weak vocal-tract resonance signals sensed by a soft-silicone NAM microphone into whispered speech using statistical conversion. Field testing of the systems showed that they enable voice-impaired people to communicate verbally using body-conducted vocal-tract resonance signals. Listening tests demonstrated that weak body-conducted vocal-tract resonance sounds can be transformed into intelligible whispered speech sounds. Using these systems, people with voice impairments can re-acquire speech communication with less effort.
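
The decibel figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes a simple piecewise model in which the 30-dB loss is flat and the 10 dB/octave decay begins at a single corner frequency. The abstract does not state where the decay starts, so the corner frequency is merely the value implied by the quoted numbers under that assumption. All function and constant names are hypothetical.

```python
import math

# Figures quoted in the abstract.
FLAT_LOSS_DB = 30.0          # full-range air-to-body transmission loss
DECAY_DB_PER_OCTAVE = 10.0   # magnitude of the in-body spectral decay
TOTAL_AT_1KHZ_DB = 50.0      # total attenuation at 1 kHz


def implied_corner_hz(f_ref_hz=1000.0):
    """Corner frequency implied by the quoted figures.

    ASSUMPTION: the decay starts at one corner frequency and the loss
    is flat below it. The abstract does not state this.
    """
    octaves = (TOTAL_AT_1KHZ_DB - FLAT_LOSS_DB) / DECAY_DB_PER_OCTAVE
    return f_ref_hz / 2 ** octaves


def attenuation_db(freq_hz, corner_hz):
    """Total attenuation under the assumed piecewise model."""
    octaves_above = max(0.0, math.log2(freq_hz / corner_hz))
    return FLAT_LOSS_DB + DECAY_DB_PER_OCTAVE * octaves_above


def db_to_volts_per_pascal(sensitivity_db):
    """Convert a sensitivity in dB re 1 V/Pa to V/Pa."""
    return 10 ** (sensitivity_db / 20)


def snr_db_to_amplitude_ratio(snr_db):
    """Convert an SNR in dB to a signal-to-noise amplitude ratio."""
    return 10 ** (snr_db / 20)


if __name__ == "__main__":
    corner = implied_corner_hz()
    print(f"implied corner frequency: {corner:.0f} Hz")
    print(f"attenuation at 1 kHz: {attenuation_db(1000.0, corner):.1f} dB")
    for s_db in (-41.0, -58.0):
        mv = db_to_volts_per_pascal(s_db) * 1e3
        print(f"{s_db:.0f} dB [V/Pa] = {mv:.2f} mV/Pa")
    print(f"15 dB SNR = amplitude ratio {snr_db_to_amplitude_ratio(15.0):.2f}")
```

Under this assumed model, the 20 dB that remains after the flat loss corresponds to two octaves of decay, which places the hypothetical corner two octaves below 1 kHz. The sensitivity and SNR conversions are standard 20·log10 amplitude relations and do not depend on the model.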