Unsupervised learning algorithms based on Expectation Maximization (EM) are often straightforward to implement and provably converge to a local likelihood maximum. However, these algorithms often perform poorly in practice. Common wisdom holds that they yield poor results because they are overly sensitive to initial parameter values and easily get stuck in local (but not global) maxima. We present a series of experiments indicating that, for the task of learning syllable structure, the initial parameter weights are not crucial. Rather, it is the choice of the model class itself that makes the difference between successful and unsuccessful learning. We use a language-universal rule-based algorithm to find a good set of parameters and then train the parameter weights with EM. We achieve word accuracy of 95.9% on German and 97.1% on English, compared to 97.4% and 98.1%, respectively, for supervised training.
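To make the EM training step concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's actual model: each word is treated as a concatenation of syllables drawn i.i.d. from one categorical distribution, the E-step computes expected syllable counts over all segmentations with a forward-backward dynamic program, and the M-step renormalizes those counts. The cap MAX_SYL, the helper names, and the toy data are all hypothetical; in the paper, the parameter set itself comes from a language-universal rule-based algorithm.

    # Hypothetical sketch: EM for a simplified unsupervised segmentation
    # model (one categorical distribution over candidate syllables).
    # This is NOT the authors' model class; it only illustrates the
    # E-step / M-step mechanics described in the abstract.

    from collections import defaultdict

    MAX_SYL = 4  # assumed cap on syllable length, in phonemes

    def forward(word, prob):
        """alpha[i] = total probability of all segmentations of word[:i]."""
        alpha = [0.0] * (len(word) + 1)
        alpha[0] = 1.0
        for i in range(1, len(word) + 1):
            for j in range(max(0, i - MAX_SYL), i):
                alpha[i] += alpha[j] * prob.get(word[j:i], 0.0)
        return alpha

    def backward(word, prob):
        """beta[i] = total probability of all segmentations of word[i:]."""
        n = len(word)
        beta = [0.0] * (n + 1)
        beta[n] = 1.0
        for i in range(n - 1, -1, -1):
            for j in range(i + 1, min(n, i + MAX_SYL) + 1):
                beta[i] += prob.get(word[i:j], 0.0) * beta[j]
        return beta

    def em(words, iterations=20):
        # Uniform initialization over all substrings up to MAX_SYL; per
        # the abstract, this choice matters less than the model class.
        candidates = {w[i:j] for w in words for i in range(len(w))
                      for j in range(i + 1, min(len(w), i + MAX_SYL) + 1)}
        prob = {s: 1.0 / len(candidates) for s in candidates}
        for _ in range(iterations):
            counts = defaultdict(float)
            for w in words:
                alpha, beta = forward(w, prob), backward(w, prob)
                z = alpha[len(w)]  # total probability of the word
                if z == 0.0:
                    continue
                # E-step: expected count of syllable w[i:j] under the
                # posterior over segmentations.
                for i in range(len(w)):
                    for j in range(i + 1, min(len(w), i + MAX_SYL) + 1):
                        s = w[i:j]
                        counts[s] += alpha[i] * prob.get(s, 0.0) * beta[j] / z
            # M-step: renormalize expected counts into probabilities.
            total = sum(counts.values())
            prob = {s: c / total for s, c in counts.items()}
        return prob

    if __name__ == "__main__":
        # Toy strings built from the pseudo-syllables "ba", "na", "to".
        data = ["bana", "nato", "batoba", "nana", "toto"]
        prob = em(data)
        for s, p in sorted(prob.items(), key=lambda kv: -kv[1])[:5]:
            print(f"{s}\t{p:.3f}")

Each EM iteration of this sketch is guaranteed not to decrease the likelihood, which mirrors the abstract's point: the danger is not divergence but convergence to a poor local maximum when the model class is badly chosen.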