Automatic induction of finite state transducers for simple phonological rules

Authors: Daniel Gildea, Daniel Jurafsky
Affiliation: University of California at Berkeley (both authors)
Venue: ACL '95: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics
Year: 1995

Abstract

This paper presents a method for learning phonological rules from sample pairs of underlying and surface forms, without negative evidence. The learned rules are represented as finite state transducers that accept underlying forms as input and generate surface forms as output. The learning algorithm extends OSTIA, an algorithm for learning general subsequential finite state transducers. Although OSTIA can learn arbitrary subsequential finite state transducers in the limit, large dictionaries of actual English pronunciations did not provide enough samples for it to induce phonological rules correctly. We therefore augmented OSTIA with two kinds of knowledge specific to natural language phonology, both biases from "universal grammar". The first is that underlying phones are often realized as phonetically similar or identical surface phones. The second is that phonological rules tend to apply across natural phonological classes. With these additions, the algorithm learned more compact, accurate, and general transducers than unmodified OSTIA. An implementation of the algorithm successfully learns a number of English postlexical rules.
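
To make the representation concrete, the following is a minimal Python sketch of a hand-built subsequential transducer that maps underlying forms to surface forms. It is not the learning algorithm and not the authors' code. The toy rule (/t/ surfaces as a flap, written "dx", between two vowels), the small alphabet, and the names build_flapping_transducer and transduce are all assumptions made for illustration. The sketch shows the two properties that define a subsequential transducer: transitions are deterministic on the input symbol, and output may be delayed, with a per-state final output emitted at the end of the input.

```python
from typing import Dict, List, Tuple

# Hypothetical toy alphabet; a real phone set would be much larger.
VOWELS = {"a", "e", "i", "o", "u"}
ALPHABET = sorted(VOWELS | {"t", "b", "r", "n", "s"})

# (state, input symbol) -> (next state, output symbols)
Delta = Dict[Tuple[int, str], Tuple[int, List[str]]]


def build_flapping_transducer() -> Tuple[Delta, Dict[int, List[str]]]:
    """Build a toy transducer for the assumed rule: t -> dx between vowels.

    States:
      0: the previous symbol was not a vowel (also the start state)
      1: the previous symbol was a vowel
      2: saw vowel + "t"; the "t" is withheld until the next symbol is known
    """
    delta: Delta = {}
    for sym in ALPHABET:
        is_vowel = sym in VOWELS
        after = 1 if is_vowel else 0
        # State 0: copy the symbol through unchanged.
        delta[(0, sym)] = (after, [sym])
        # State 1: withhold a "t", copy anything else.
        delta[(1, sym)] = (2, []) if sym == "t" else (after, [sym])
        # State 2: a following vowel turns the pending "t" into a flap;
        # any other symbol releases the pending "t" unchanged.
        delta[(2, sym)] = (1, ["dx", sym]) if is_vowel else (0, ["t", sym])
    # Final outputs: a pending "t" at the end of the word surfaces as "t".
    final_output = {0: [], 1: [], 2: ["t"]}
    return delta, final_output


def transduce(underlying: List[str], delta: Delta,
              final_output: Dict[int, List[str]], start: int = 0) -> List[str]:
    """Run the transducer over an underlying form and return the surface form."""
    state = start
    surface: List[str] = []
    for sym in underlying:
        state, out = delta[(state, sym)]
        surface.extend(out)
    surface.extend(final_output[state])
    return surface


delta, final_output = build_flapping_transducer()
print(transduce("b a t e r".split(), delta, final_output))
print(transduce("b a t".split(), delta, final_output))
```

The empty output on the transition from state 1 to state 2 is what lets a deterministic machine handle a rule whose right-hand context has not yet been read. An induction procedure such as the one described above would have to recover this kind of state structure from example pairs alone.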