Silent-speech enhancement using body-conducted vocal-tract resonance signals

  • Authors:
  • Tatsuya Hirahara;Makoto Otani;Shota Shimizu;Tomoki Toda;Keigo Nakamura;Yoshitaka Nakajima;Kiyohiro Shikano

  • Affiliations:
  • Toyama Prefectural University, Department of Intelligent Systems Design Engineering, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan;Toyama Prefectural University, Department of Intelligent Systems Design Engineering, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan;Toyama Prefectural University, Department of Intelligent Systems Design Engineering, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan;Nara Institute of Science and Technology, Graduate School of Information Sciences, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan

  • Venue:
  • Speech Communication
  • Year:
  • 2010

Abstract

The physical characteristics of weak body-conducted vocal-tract resonance signals called non-audible murmur (NAM) and the acoustic characteristics of three sensors developed for detecting these signals have been investigated. NAM signals are attenuated by 50 dB at 1 kHz; this attenuation consists of a 30-dB full-range attenuation due to air-to-body transmission loss and a -10 dB/octave spectral decay due to sound propagation loss within the body. These characteristics agree with the spectral characteristics of measured NAM signals. The sensors have a sensitivity of between -41 and -58 dB [V/Pa] at 1 kHz, and the mean signal-to-noise ratio of the detected signals was 15 dB. On the basis of these investigations, three types of silent-speech enhancement systems were developed: (1) simple, direct amplification of weak vocal-tract resonance signals using a wired urethane-elastomer NAM microphone, (2) simple, direct amplification using a wireless urethane-elastomer-duplex NAM microphone, and (3) transformation of the weak vocal-tract resonance signals sensed by a soft-silicone NAM microphone into whispered speech using statistical conversion. Field testing of the systems showed that they enable voice-impaired people to communicate verbally using body-conducted vocal-tract resonance signals. Listening tests demonstrated that weak body-conducted vocal-tract resonance sounds can be transformed into intelligible whispered speech sounds. Using these systems, people with voice impairments can re-acquire speech communication with less effort.
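The decibel figures in the abstract can be checked with a little arithmetic. The following is a minimal sketch, not the authors' model: it treats the attenuation as a flat 30 dB loss plus a 10 dB/octave roll-off above a reference frequency. The abstract does not state where the roll-off begins, so the 250 Hz reference (`REF_FREQ_HZ`) is an assumption, chosen only because it makes the two components add up to the stated 50 dB at 1 kHz. The function names are likewise illustrative. The second helper converts a sensitivity given in dB re 1 V/Pa into a linear V/Pa value.

```python
import math

FULL_RANGE_LOSS_DB = 30.0    # air-to-body transmission loss (from the abstract)
DECAY_DB_PER_OCTAVE = 10.0   # in-body propagation loss slope (from the abstract)
REF_FREQ_HZ = 250.0          # ASSUMPTION: roll-off start, not given in the abstract


def nam_attenuation_db(freq_hz, ref_freq_hz=REF_FREQ_HZ):
    """Total attenuation in dB under the assumed two-component model."""
    # Number of octaves above the reference; no extra decay below it.
    octaves = max(0.0, math.log2(freq_hz / ref_freq_hz))
    return FULL_RANGE_LOSS_DB + DECAY_DB_PER_OCTAVE * octaves


def sensitivity_v_per_pa(sens_db):
    """Convert a sensitivity in dB re 1 V/Pa to a linear value in V/Pa."""
    return 10.0 ** (sens_db / 20.0)


if __name__ == "__main__":
    print(f"attenuation at 1 kHz: {nam_attenuation_db(1000.0):.1f} dB")
    for sens_db in (-41.0, -58.0):
        mv = sensitivity_v_per_pa(sens_db) * 1000.0
        print(f"{sens_db:.0f} dB [V/Pa] = {mv:.2f} mV/Pa")
```

Under these assumptions, 1 kHz lies two octaves above the reference, so the decay term contributes 20 dB on top of the 30 dB flat loss. The stated sensitivity range of -41 to -58 dB [V/Pa] corresponds to roughly 8.9 mV/Pa down to about 1.3 mV/Pa.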