Learning phonological rule probabilities from speech corpora with exploratory computational phonology

  • Authors:
  • Gary Tajchman; Daniel Jurafsky; Eric Fosler

  • Affiliations:
  • University of California at Berkeley; University of California at Berkeley; University of California at Berkeley

  • Venue:
  • ACL '95: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics
  • Year:
  • 1995

Abstract

This paper presents an algorithm for learning the probabilities of optional phonological rules from corpora. The algorithm is based on using a speech recognition system to discover the surface pronunciations of words in speech corpora; using an automatic system obviates expensive phonetic labeling by hand. We describe the details of our algorithm and show the probabilities the system has learned for ten common phonological rules which model reductions and coarticulation effects. These probabilities were derived from a corpus of 7203 sentences of read speech from the Wall Street Journal, and are shown to be a reasonably close match to probabilities from phonetically hand-transcribed data (TIMIT). Finally, we analyze the probability differences between rule use in male versus female speech, and suggest that the differences are caused by differing average rates of speech.
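The abstract does not state the estimation formula. One simple reading of "learning the probability of an optional rule" is a relative frequency: the number of times the rule applied in the recognizer-derived surface pronunciations, divided by the number of times it could have applied. The Python sketch below illustrates only that counting step, under that assumption. The function name, the input format of (rule, applied) pairs, the rule names, and the toy observations are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def estimate_rule_probabilities(observations):
    """Relative-frequency estimate of optional-rule probabilities.

    `observations` is an iterable of (rule_name, applied) pairs, one pair
    per site where the rule's context was met (hypothetical input format).
    `applied` is True when the aligned surface pronunciation shows the rule
    was used at that site.
    """
    applicable = Counter()  # sites where the rule could have applied
    applied = Counter()     # sites where the rule actually applied
    for rule, was_applied in observations:
        applicable[rule] += 1
        if was_applied:
            applied[rule] += 1
    return {rule: applied[rule] / applicable[rule] for rule in applicable}


# Made-up observations for illustration only; not data from the paper.
toy_observations = [
    ("flapping", True),
    ("flapping", True),
    ("flapping", False),
    ("flapping", True),
    ("h_voicing", False),
    ("h_voicing", True),
]

probabilities = estimate_rule_probabilities(toy_observations)
for rule, p in sorted(probabilities.items()):
    print(f"{rule}: {p:.2f}")
```

Running the same counts separately on subsets of the observations (for example, by speaker group) would give the kind of per-group comparison the abstract mentions, again assuming the relative-frequency reading.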