Learning phonological rule probabilities from speech corpora with exploratory computational phonology

Authors:
Gary Tajchman; Daniel Jurafsky; Eric Fosler
Affiliations:
University of California at Berkeley;University of California at Berkeley;University of California at Berkeley
Venue:
ACL '95: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics
Year:
1995

Abstract

This paper presents an algorithm for learning the probabilities of optional phonological rules from corpora. The algorithm is based on using a speech recognition system to discover the surface pronunciations of words in speech corpora; using an automatic system obviates expensive phonetic labeling by hand. We describe the details of our algorithm and show the probabilities the system has learned for ten common phonological rules which model reductions and coarticulation effects. These probabilities were derived from a corpus of 7203 sentences of read speech from the Wall Street Journal, and are shown to be a reasonably close match to probabilities from phonetically hand-transcribed data (TIMIT). Finally, we analyze the probability differences between rule use in male versus female speech, and suggest that the differences are caused by differing average rates of speech.
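
The abstract does not spell out the estimation step, so the following is only a minimal sketch of one plausible reading: a relative-frequency estimate, in which a rule's probability is the number of times it was applied divided by the number of times it could have applied. It assumes that the recognizer's alignment has already been reduced to, for each word token, the set of rules that were applicable and the set actually used in the chosen surface pronunciation. The function name, the data layout, the rule names, and the toy counts are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def estimate_rule_probabilities(observations):
    """Relative-frequency estimate of optional-rule probabilities.

    `observations` is an iterable of (applicable, applied) pairs, one per
    word token, where both elements are sets of rule names. This layout is
    an assumption made for illustration only.
    """
    opportunities = Counter()  # times each rule could have applied
    applications = Counter()   # times each rule actually applied
    for applicable, applied in observations:
        for rule in applicable:
            opportunities[rule] += 1
            if rule in applied:
                applications[rule] += 1
    # Only rules with at least one opportunity appear, so no division by zero.
    return {rule: applications[rule] / n for rule, n in opportunities.items()}


# Toy observations, invented for illustration (not data from the paper).
toy_observations = [
    ({"flapping", "h_voicing"}, {"flapping"}),
    ({"flapping"}, set()),
    ({"flapping", "syllabic_n"}, {"flapping", "syllabic_n"}),
    ({"h_voicing"}, set()),
]

probs = estimate_rule_probabilities(toy_observations)
for rule, p in sorted(probs.items()):
    print(f"{rule}: {p:.2f}")
```

Running the same function separately on male and female tokens would give per-group estimates of the kind the abstract compares, again under the assumptions stated above.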