A tandem algorithm for pitch estimation and voiced speech segregation

  • Authors:
  • Guoning Hu; DeLiang Wang

  • Affiliations:
  • The Ohio State University, Columbus, OH and AOL Truveo Video Search, San Francisco, CA; Department of Computer Science and Engineering and Center for Cognitive Science, The Ohio State University, Columbus, OH

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2010

Abstract

Much effort has been devoted in computational auditory scene analysis (CASA) to segregating speech from monaural mixtures. The performance of current CASA systems on voiced speech segregation is limited by the lack of a robust algorithm for pitch estimation. We propose a tandem algorithm that performs pitch estimation of a target utterance and segregation of voiced portions of target speech jointly and iteratively. This algorithm first obtains a rough estimate of target pitch, and then uses this estimate to segregate target speech using harmonicity and temporal continuity. It then improves both pitch estimation and voiced speech segregation iteratively. Novel methods are proposed for performing segregation with a given pitch estimate and pitch determination with a given segregation. Systematic evaluation shows that the tandem algorithm extracts a majority of target speech without including much interference, and it performs substantially better than previous systems for either pitch extraction or voiced speech segregation.
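
The alternation described in the abstract (rough pitch estimate, segregation from that pitch, then refinement of both) can be illustrated with a toy single-frame sketch. This is not the authors' method: the paper's actual estimators are not given in the abstract, and temporal continuity is ignored here. The sketch assumes that each frequency channel is summarized by a normalized autocorrelation function, that pitch is picked as the lag maximizing the summed autocorrelation over channels currently labeled as target, and that a channel is labeled target when its autocorrelation at the pitch lag exceeds a threshold. The function names, the threshold of 0.6, the lag range, and the synthetic tones are all hypothetical choices made for illustration.

```python
import math

def normalized_acf(signal, max_lag):
    """Autocorrelation of one channel for lags 0..max_lag, normalized by its energy."""
    n = len(signal) - max_lag
    energy = sum(x * x for x in signal[:n])
    return [sum(signal[t] * signal[t + lag] for t in range(n)) / energy
            for lag in range(max_lag + 1)]

def estimate_pitch_lag(acfs, mask, min_lag, max_lag):
    """Pitch determination given a segregation: the lag with the largest
    summed autocorrelation over channels currently labeled as target."""
    active = [acf for acf, keep in zip(acfs, mask) if keep]
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: sum(acf[lag] for acf in active))

def estimate_mask(acfs, lag, threshold):
    """Segregation given a pitch: keep channels that are periodic at the pitch lag."""
    return [acf[lag] > threshold for acf in acfs]

def tandem_frame(acfs, min_lag, max_lag, threshold=0.6, max_iter=10):
    """Alternate the two steps until the mask stops changing (toy, one frame)."""
    mask = [True] * len(acfs)  # rough start: treat every channel as target
    history = []
    for _ in range(max_iter):
        lag = estimate_pitch_lag(acfs, mask, min_lag, max_lag)
        new_mask = estimate_mask(acfs, lag, threshold)
        history.append(lag)
        if not any(new_mask) or new_mask == mask:
            break
        mask = new_mask
    return history, mask

def tone(period, phase, length=490):
    """Hypothetical channel response: a pure tone with the given period in samples."""
    return [math.cos(2 * math.pi * t / period + phase) for t in range(length)]

# Assumed toy mixture: six target-dominated channels (pitch period 50 samples)
# and two interference-dominated channels (period 37 samples).
channels = [tone(50, 0.3 * i) for i in range(6)] + [tone(37, 0.5 * i) for i in range(2)]
acfs = [normalized_acf(ch, max_lag=90) for ch in channels]
history, mask = tandem_frame(acfs, min_lag=30, max_lag=90)
print("pitch lag per iteration:", history)
print("final target mask:", mask)
```

In this toy mixture the first pitch estimate is pulled away from the true period by the interference channels. Once those channels are excluded from the mask, the re-estimated lag settles on 50 samples, which mirrors the iterative improvement the abstract describes.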