Learning bias and phonological-rule induction

Authors:
Daniel Gildea; Daniel Jurafsky
Affiliations:
University of California at Berkeley; University of Colorado at Boulder
Venue:
Computational Linguistics
Year:
1996

Abstract

A fundamental debate in the machine learning of language has been the role of prior knowledge in the learning process. Purely nativist approaches, such as the Principles and Parameters model, build parameterized linguistic generalizations directly into the learning system. Purely empirical approaches use a general, domain-independent learning rule (Error Back-Propagation, Instance-based Generalization, Minimum Description Length) to learn linguistic generalizations directly from the data.

In this paper we suggest that an alternative to the purely nativist or purely empiricist learning paradigms is to represent the prior knowledge of language as a set of abstract learning biases, which guide an empirical inductive learning algorithm. We test our idea by examining the machine learning of simple Sound Pattern of English (SPE)-style phonological rules. We represent phonological rules as finite-state transducers that accept underlying forms as input and generate surface forms as output. We show that OSTIA, a general-purpose transducer induction algorithm, was incapable of learning simple phonological rules like flapping. We then augmented OSTIA with three kinds of learning biases that are specific to natural language phonology, and that are assumed explicitly or implicitly by every theory of phonology: faithfulness (underlying segments tend to be realized similarly on the surface), community (similar segments behave similarly), and context (phonological rules need access to variables in their context). These biases are so fundamental to generative phonology that they are left implicit in many theories. But explicitly modifying the OSTIA algorithm with these biases allowed it to learn more compact, accurate, and general transducers, and our implementation successfully learns a number of rules from English and German.

Furthermore, we show that some of the remaining errors in our augmented model are due to implicit biases in the traditional SPE-style rewrite system that are not similarly represented in the transducer formalism, suggesting that while transducers may be formally equivalent to SPE-style rules, they may not have identical evaluation procedures.

Because our biases were applied to the learning of very simple SPE-style rules, and to a non-psychologically-motivated and nonprobabilistic theory of purely deterministic transducers, we do not expect that our model as implemented has any practical use as a phonological learning device, nor is it intended as a cognitive model of human learning. Indeed, because of the noise and nondeterminism inherent to linguistic data, we feel strongly that stochastic algorithms for language induction are much more likely to be a fruitful research direction. Our model is rather intended to suggest the kind of biases that may be added to other empiricist induction models, and the way in which they may be added, in order to build a cognitively and computationally plausible learning model for phonological rules.
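
To make the transducer representation concrete, the following is a minimal sketch of a hand-written deterministic transducer for a simplified flapping rule. It is an illustration only: it is not OSTIA, it is not the authors' implementation, and nothing in it is learned. The rule as coded (underlying "t" surfaces as the flap "dx" between a stressed and an unstressed vowel), the tiny vowel inventories, the ARPAbet-like symbols, and the function name `flap` are all assumptions chosen for the example.

```python
# Hypothetical, simplified segment inventory (ARPAbet-like; stress is the
# trailing digit). These sets are assumptions made for illustration.
STRESSED = {"aa1", "ae1", "ay1", "iy1"}
UNSTRESSED = {"ah0", "er0", "ih0"}


def flap(underlying):
    """Map a list of underlying segments to surface segments.

    States of the transducer:
      0: no stressed vowel immediately precedes
      1: a stressed vowel was just read
      2: a stressed vowel and then "t" were read; output of the "t" is
         delayed until the next segment shows whether it should flap
    """
    surface = []
    state = 0
    for seg in underlying:
        if state == 2:
            if seg in UNSTRESSED:
                # Context satisfied: emit the flap instead of "t".
                surface += ["dx", seg]
                state = 0
            else:
                # Context not satisfied: emit the delayed "t" unchanged.
                surface += ["t", seg]
                state = 1 if seg in STRESSED else 0
        elif state == 1 and seg == "t":
            # Hold the "t" back; emit nothing yet.
            state = 2
        else:
            # Faithful default: copy the segment to the output.
            surface.append(seg)
            state = 1 if seg in STRESSED else 0
    if state == 2:
        # End of input with a pending "t": emit it unchanged.
        surface.append("t")
    return surface


print(flap(["r", "ay1", "t", "er0"]))   # a "writer"-like form
print(flap(["ah0", "t", "ae1", "k"]))   # an "attack"-like form
print(flap(["b", "ae1", "t"]))          # a "bat"-like form
```

In this sketch the delayed output in state 2 is what lets a deterministic machine consult right-hand context, and the copy-by-default branch shows the kind of input-output similarity that the faithfulness bias favors.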