Language model adaptation for medical dictations by automatic phonetics-driven transcript reconstruction

  • Authors: Stefan Petrik; Franz Pernkopf

  • Affiliations: Graz University of Technology, Graz, Austria (both authors)

  • Venue: AIA '08 Proceedings of the 26th IASTED International Conference on Artificial Intelligence and Applications
  • Year: 2008

Abstract

Automatic phonetic reconstruction of medical dictations from non-literal and automatically recognized speech transcripts yields closer-to-literal transcripts for training the language models of speech recognizers. In this paper, we introduce an extended alignment method that assesses multiple levels of text segmentation, and we show how open issues such as incorrect segmentation in the recognized transcript can be resolved. Furthermore, we compare a rule-based text reconstruction approach with an automatic classifier that uses the multi-level alignment and a stochastic phonetic similarity measure as features. Experiments show better performance for the rule-based system in terms of recall and precision, but superiority of the automatic classifier in terms of language model perplexity. Compared to the simple system in [1], the overall increase in precision is between 0.7% and 4.7% absolute, without loss in recall.
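The general idea of alignment-based transcript reconstruction can be illustrated with a minimal sketch. It is not the authors' system. It assumes a single word-level alignment rather than the multi-level alignment described above, and it uses the character-overlap ratio from Python's `difflib` as a crude stand-in for the stochastic phonetic similarity measure. The function names `similarity`, `align`, and `reconstruct`, the threshold of 0.6, and the decision rule itself (keep the report word when the aligned words sound alike, otherwise keep the recognized word) are hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-overlap ratio in [0, 1]; an assumed stand-in for phonetic similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(report, recognized):
    """Word-level dynamic-programming alignment.

    Returns a list of (report_word or None, recognized_word or None) pairs.
    Substitution cost is 1 - similarity; insertions and deletions cost 1.
    """
    n, m = len(report), len(recognized)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = float(i)
    for j in range(1, m + 1):
        cost[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (1.0 - similarity(report[i - 1], recognized[j - 1]))
            cost[i][j] = min(sub, cost[i - 1][j] + 1.0, cost[i][j - 1] + 1.0)

    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            1.0 - similarity(report[i - 1], recognized[j - 1])
        ):
            pairs.append((report[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1.0:
            pairs.append((report[i - 1], None))  # word only in the report
            i -= 1
        else:
            pairs.append((None, recognized[j - 1]))  # word only in the recognized text
            j -= 1
    pairs.reverse()
    return pairs


def reconstruct(report, recognized, threshold=0.6):
    """Assumed rule set: build a closer-to-literal word sequence from both transcripts."""
    out = []
    for rep, rec in align(report, recognized):
        if rep is None:
            out.append(rec)  # spoken but missing from the report: keep it
        elif rec is None:
            continue  # present only in the report (e.g. added formatting): drop it
        elif similarity(rep, rec) >= threshold:
            out.append(rep)  # similar words: likely a recognition error, trust the report
        else:
            out.append(rec)  # dissimilar words: likely a reformulation, trust the recognizer
    return out


if __name__ == "__main__":
    report = "patient has no fever".split()
    recognized = "the patient has know fever".split()
    print(reconstruct(report, recognized))
```

In this toy input, "know" is treated as a likely misrecognition of "no" because the two strings overlap strongly, while "the" is kept because it appears only in the recognized transcript. A real system would replace `similarity` with a measure computed on phoneme sequences.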