State-transition interpolation and MAP adaptation for HMM-based dysarthric speech recognition

  • Authors:
  • Harsh Vardhan Sharma; Mark Hasegawa-Johnson

  • Affiliations:
  • Beckman Institute, Urbana, IL; Beckman Institute, Urbana, IL

  • Venue:
  • SLPAT '10 Proceedings of the NAACL HLT 2010 Workshop on Speech and Language Processing for Assistive Technologies
  • Year:
  • 2010

Abstract

This paper reports experiments in building speaker-adaptive recognizers for talkers with spastic dysarthria. We study two modifications: (a) MAP adaptation of speaker-independent systems trained on normal speech, and (b) a transition probability matrix that is a linear interpolation between fully ergodic and strictly left-to-right structures, applied to both speaker-dependent and speaker-adapted systems. The experiments indicate that (1) for speaker-dependent systems, left-to-right HMMs have a lower word error rate than transition-interpolated HMMs; (2) adapting all parameters other than the transition probabilities yields higher recognition accuracy than adapting any subset of these parameters or adapting all parameters including the transition probabilities; (3) performing both transition interpolation and adaptation gives a higher word error rate than performing adaptation alone; and (4) dysarthria severity is not a sufficient indicator of the relative performance of speaker-dependent and speaker-adapted systems.
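
To make the transition-interpolation idea in the abstract concrete, the following is a minimal sketch of a linear interpolation between a left-to-right and a fully ergodic transition matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names, the weight `lam`, the uniform ergodic matrix, the 0.5 self-loop probability, and the absorbing final state are all hypothetical choices that the abstract does not specify.

```python
def left_to_right(n, self_loop=0.5):
    """Strict left-to-right matrix: each state stays put or moves to the next.

    Assumption: the last state is absorbing. Real toolkits often use a
    separate non-emitting exit state instead.
    """
    rows = []
    for i in range(n):
        row = [0.0] * n
        if i == n - 1:
            row[i] = 1.0
        else:
            row[i] = self_loop
            row[i + 1] = 1.0 - self_loop
        rows.append(row)
    return rows


def ergodic(n):
    """Fully ergodic matrix; assumed uniform here (every transition allowed)."""
    return [[1.0 / n] * n for _ in range(n)]


def interpolate(a_ltr, a_erg, lam):
    """Return (1 - lam) * a_ltr + lam * a_erg, element by element.

    lam = 0 gives the pure left-to-right structure, lam = 1 the ergodic one.
    A convex combination of row-stochastic matrices is row-stochastic,
    so no renormalization is needed.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [
        [(1.0 - lam) * l + lam * e for l, e in zip(row_l, row_e)]
        for row_l, row_e in zip(a_ltr, a_erg)
    ]


if __name__ == "__main__":
    mixed = interpolate(left_to_right(3), ergodic(3), lam=0.5)
    for row in mixed:
        print([round(p, 3) for p in row])
```

For any `lam` greater than 0, skip and backward transitions that a strict left-to-right model forbids receive nonzero probability, which is the structural change the interpolated systems introduce.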