Modeling infant word segmentation
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
In this paper we present a cognitively plausible approach to word segmentation that segments in an online fashion using only local information and a lexicon of previously segmented words. Unlike popular statistical optimization techniques, the learner segments words using structural information about the input syllables rather than distributional cues. We develop a memory model for the learner that, like a child learner, does not recall previously hypothesized words perfectly. The learner attains an F-score of 86.69% in ideal conditions and 85.05% when word recall is unreliable and stress in the input is reduced. These results demonstrate the power that a simple learner can have when paired with appropriate structural constraints on its hypotheses.
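To make the idea of an online, lexicon-driven learner concrete, the following is a minimal sketch and not the paper's algorithm. It rests on several illustrative assumptions that the abstract does not state: an utterance is a list of (syllable, is_stressed) pairs, the structural constraint is "at most one stressed syllable per word", a known word is peeled off the front of the remaining input by longest match, and imperfect memory is modeled by a single hypothetical parameter `recall_prob`. The function name `segment` and its parameters are likewise made up for this example.

```python
import random


def segment(utterance, lexicon, recall_prob=1.0, rng=None):
    """Segment one utterance left to right, updating `lexicon` in place.

    utterance   -- list of (syllable, is_stressed) pairs (assumed input format)
    lexicon     -- set of previously hypothesized words, each a tuple of syllables
    recall_prob -- hypothetical chance that a stored word is successfully recalled
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    words = []
    i = 0
    while i < len(utterance):
        length = 0
        # Step 1: try to peel a known word off the front, longest match first.
        for j in range(len(utterance), i, -1):
            candidate = tuple(syl for syl, _ in utterance[i:j])
            if candidate in lexicon and rng.random() < recall_prob:
                length = j - i
                break
        # Step 2: otherwise grow a new word until adding the next syllable
        # would give it a second stress (the assumed structural constraint).
        if length == 0:
            stresses = int(utterance[i][1])
            j = i + 1
            while j < len(utterance) and stresses + int(utterance[j][1]) <= 1:
                stresses += int(utterance[j][1])
                j += 1
            length = j - i
        word = tuple(syl for syl, _ in utterance[i:i + length])
        words.append(word)
        lexicon.add(word)  # every hypothesized word is remembered
        i += length
    return words


utterance = [("the", False), ("do", True), ("ggy", False)]
print(segment(utterance, set()))       # [('the', 'do', 'ggy')]
print(segment(utterance, {("the",)}))  # [('the',), ('do', 'ggy')]
```

With an empty lexicon the stress constraint alone cannot separate the unstressed function word, so the whole utterance is treated as one word. Once "the" is in the lexicon it is split off first and the remainder is segmented correctly. Setting `recall_prob` below 1.0 makes that split fail some of the time, which is one simple way to imitate unreliable word recall.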