Extracting amplitude modulations from speech in the time domain

  • Authors:
  • Garreth Prendergast; Sam R. Johnson; Gary G. R. Green

  • Affiliations:
  • York Neuroimaging Centre, University of York, York YO10 5DG, UK and Hull York Medical School, University of York, York, UK; York Neuroimaging Centre, University of York, York YO10 5DG, UK; York Neuroimaging Centre, University of York, York YO10 5DG, UK and Hull York Medical School, University of York, York, UK

  • Venue:
  • Speech Communication
  • Year:
  • 2011

Abstract

Natural sounds can be characterised by patterns of changes in loudness (amplitude modulations), and human speech perception studies have focused on the low frequencies contained in the gross temporal structure of speech. Low-pass filtering the temporal envelopes of sub-band filtered speech maintains intelligibility, but it remains unclear how the human auditory system could perform such a modulation-domain analysis, or whether it does so at all. It is also difficult to manipulate amplitude modulations further through frequency-domain filtering in order to investigate the cues the system may use. The current work focuses on a time-domain decomposition of filter output envelopes into pulses of amplitude modulation. The technique demonstrates that signals low-pass filtered in the modulation domain maintain bursts of energy which are comparable to those that can be extracted entirely within the time domain. This paper presents preliminary work suggesting that a time-domain approach, which focuses on the instantaneous features of transient changes in loudness, can be used to study the content of human speech. This approach should be pursued, as it allows human speech intelligibility mechanisms to be investigated from a new perspective.
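The idea of decomposing an envelope into pulses of amplitude modulation can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a synthetic amplitude-modulated tone in place of one sub-band of speech, a rectify-and-smooth envelope estimate, and a simple relative threshold to segment the envelope into pulses. All function names and parameters (`make_am_tone`, `envelope`, `extract_pulses`, `rel_threshold`) are hypothetical.

```python
import math


def make_am_tone(fs=8000, dur=1.0, carrier_hz=1000.0, mod_hz=4.0):
    """Synthetic stand-in for one sub-band of speech (an assumption):
    a sine carrier whose amplitude follows 0.5 * (1 - cos(2*pi*mod_hz*t))."""
    n = int(fs * dur)
    out = []
    for i in range(n):
        t = i / fs
        modulator = 0.5 * (1.0 - math.cos(2.0 * math.pi * mod_hz * t))
        out.append(modulator * math.sin(2.0 * math.pi * carrier_hz * t))
    return out


def envelope(x, win):
    """Crude envelope estimate: full-wave rectification followed by a
    centred moving average of roughly `win` samples (shorter at the edges)."""
    csum = [0.0]
    for v in x:
        csum.append(csum[-1] + abs(v))
    n = len(x)
    half = win // 2
    env = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half)
        env.append((csum[hi] - csum[lo]) / (hi - lo))
    return env


def _summarise(env, start, end, fs):
    """Describe one pulse spanning samples [start, end) by its timing and height."""
    peak = max(range(start, end), key=env.__getitem__)
    return {
        "onset": start / fs,
        "offset": end / fs,
        "peak_time": peak / fs,
        "peak_height": env[peak],
    }


def extract_pulses(env, fs, rel_threshold=0.1):
    """Segment the envelope into pulses: each contiguous run of samples
    above rel_threshold * max(env) is treated as one pulse."""
    thr = rel_threshold * max(env)
    pulses = []
    start = None
    for i, v in enumerate(env):
        if v > thr and start is None:
            start = i
        elif v <= thr and start is not None:
            pulses.append(_summarise(env, start, i, fs))
            start = None
    if start is not None:
        pulses.append(_summarise(env, start, len(env), fs))
    return pulses


if __name__ == "__main__":
    fs = 8000
    env = envelope(make_am_tone(fs=fs), win=80)
    for p in extract_pulses(env, fs):
        print(p)
```

Each pulse is described only by time-domain features (onset, offset, peak time, peak height), so it could in principle be manipulated or resynthesised individually without filtering in the modulation domain. For real speech, the synthetic tone would be replaced by the output of a band-pass filter bank, and the threshold rule would likely need to be more robust.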