Online Learning Mechanisms for Bayesian Models of Word Segmentation

Authors:
Lisa Pearl; Sharon Goldwater; Mark Steyvers
Affiliations:
Department of Cognitive Sciences, University of California, Irvine, USA 92697-5100; School of Informatics, University of Edinburgh, Edinburgh, UK; Department of Cognitive Sciences, University of California, Irvine, USA 92697-5100
Venue:
Research on Language and Computation
Year:
2010

Abstract

In recent years, Bayesian models have become increasingly popular as a way of understanding human cognition. Ideal learner Bayesian models assume that cognition can be usefully understood as optimal behavior under uncertainty, a hypothesis that has been supported by a number of modeling studies across various domains (e.g., Griffiths and Tenenbaum, Cognitive Psychology, 51, 354–384, 2005; Xu and Tenenbaum, Psychological Review, 114, 245–272, 2007). The models in these studies aim to explain why humans behave as they do given the task and data they encounter, but typically avoid some questions addressed by more traditional psychological models, such as how the observed behavior is produced given constraints on memory and processing. Here, we use the task of word segmentation as a case study for investigating these questions within a Bayesian framework. We consider some limitations of the infant learner, and develop several online learning algorithms that take these limitations into account. Each algorithm can be viewed as a different method of approximating the same ideal learner. When we test these algorithms on corpora of English child-directed speech, we find that the constrained learner's behavior depends non-trivially on how the learner's limitations are implemented. Interestingly, biases that are helpful to an ideal learner sometimes hinder a constrained learner, and in a few cases, constrained learners perform as well as or better than the ideal learner. This suggests that the transition from a computational-level solution for acquisition to an algorithmic-level one is not straightforward.