Generating training data for medical dictations
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Pattern Recognition and Machine Learning (Information Science and Statistics)
Semantic and phonetic automatic reconstruction of medical dictations
Computer Speech and Language
Automatic phonetic reconstruction of medical dictations from non-literal, automatically recognized speech transcripts yields closer-to-literal transcripts for training the language models of speech recognizers. In this paper, we introduce an extended alignment method that assesses multiple levels of text segmentation, and we show how open issues such as wrong segmentation in the recognized transcript can be resolved. Furthermore, we compare a rule-based text reconstruction approach with an automatic classifier that uses the multi-level alignment and a stochastic phonetic similarity measure as features. Experiments show better performance for the rule-based system in terms of recall and precision, but superiority of the automatic classifier in terms of language model perplexity. The overall gain in precision over the simple system in [1] is between 0.7% and 4.7% absolute, without loss in recall.
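To illustrate the kind of alignment the abstract refers to, the following is a minimal, hypothetical sketch: a word-level dynamic-programming alignment between a non-literal transcript and a recognized transcript, where substitutions are scored by a similarity function. The paper uses a stochastic phonetic similarity measure; here a simple character-level ratio (`difflib.SequenceMatcher`) stands in for it, and all function names are our own, not the authors'.

```python
from difflib import SequenceMatcher

def phonetic_similarity(a: str, b: str) -> float:
    # Stand-in for the paper's stochastic phonetic similarity measure:
    # a character-level match ratio in [0, 1] (deliberate simplification).
    return SequenceMatcher(None, a, b).ratio()

def align(ref, hyp):
    """Word-level alignment via Levenshtein-style dynamic programming.

    Returns a list of (ref_word_or_None, hyp_word_or_None) pairs:
    matched/substituted pairs, deletions (hyp side None), and
    insertions (ref side None)."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimal cost of aligning ref[:i] with hyp[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = float(i)
    for j in range(1, m + 1):
        cost[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Substitution is cheap when the words sound (here: look) alike.
            sub = 1.0 - phonetic_similarity(ref[i - 1], hyp[j - 1])
            cost[i][j] = min(cost[i - 1][j] + 1.0,        # deletion
                             cost[i][j - 1] + 1.0,        # insertion
                             cost[i - 1][j - 1] + sub)    # (mis)match
    # Backtrace the cheapest path to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                cost[i][j] == cost[i - 1][j - 1]
                + (1.0 - phonetic_similarity(ref[i - 1], hyp[j - 1]))):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1.0:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    return pairs[::-1]
```

The multi-level aspect described in the paper would apply such an alignment at more than one segmentation granularity (e.g. words and sub-word units) rather than at the single word level shown here.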