Higher-Level Features in Speaker Recognition

  • Authors:
  • Elizabeth Shriberg

  • Affiliations:
  • SRI International, Menlo Park, CA; International Computer Science Institute, Berkeley, CA

  • Venue:
  • Speaker Classification I
  • Year:
  • 2007

Abstract

Higher-level features based on linguistic or long-range information have attracted significant attention in automatic speaker recognition. This article briefly summarizes approaches to using higher-level features for text-independent speaker verification over the last decade. To clarify how each approach uses higher-level information, features are described in terms of their type, temporal span, and reliance on automatic speech recognition for both feature extraction and feature conditioning. A subsequent analysis of higher-level features in a state-of-the-art system illustrates that (1) a higher-level cepstral system outperforms standard systems, (2) a prosodic system shows excellent performance individually and in combination, (3) other higher-level systems provide further gains, and (4) higher-level systems provide increasing relative gains as training data increases. Implications for the general field of speaker classification are discussed.