Modularity in inductively-learned word pronunciation systems

Authors:
Antal van den Bosch;Ton Weijters;Walter Daelemans
Affiliations:
Tilburg University, Tilburg, The Netherlands;Eindhoven University of Technology, Eindhoven, The Netherlands;Tilburg University, Tilburg, The Netherlands
Venue:
NeMLaP3/CoNLL '98 Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning
Year:
1998

Abstract

In leading morpho-phonological theories and state-of-the-art text-to-speech systems it is assumed that word pronunciation cannot be learned or performed without intermediate analyses at several abstraction levels (e.g., morphological, graphemic, phonemic, syllabic, and stress levels). We challenge this assumption for the case of English word pronunciation. Using IGTree, an inductive-learning decision-tree algorithm, we train and test three word-pronunciation systems in which the number of abstraction levels (implemented as sequenced modules) is reduced from five, via three, to one. The last of these systems, which classifies letter strings directly as mapping to phonemes with stress markers, yields significantly better generalisation accuracies than the two multi-module systems. Analyses of empirical results indicate that positive utility effects of sequencing modules are outweighed by cascading errors passed on between modules.
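
The effect of cascading errors can be illustrated with a small back-of-the-envelope sketch. It is not the authors' analysis: it assumes that module errors are independent and that a downstream module never recovers from an upstream mistake, and the per-module accuracy of 0.97 is a hypothetical value chosen only for illustration. The function name `cascade_accuracy` is likewise invented here.

```python
def cascade_accuracy(module_accuracies):
    """Probability that every module in a sequence is correct.

    Simplifying assumptions (not from the paper): errors are independent
    across modules, and an upstream error is never repaired downstream.
    """
    result = 1.0
    for acc in module_accuracies:
        if not 0.0 <= acc <= 1.0:
            raise ValueError("accuracies must lie in [0, 1]")
        result *= acc
    return result


if __name__ == "__main__":
    hypothetical_acc = 0.97  # hypothetical per-module accuracy, not a reported result
    for n_modules in (5, 3, 1):
        overall = cascade_accuracy([hypothetical_acc] * n_modules)
        print(f"{n_modules} module(s): {overall:.4f}")
```

Under these assumptions, overall accuracy can only shrink as modules are added to the sequence. A multi-module system therefore has to gain enough from its intermediate representations to offset that loss, which is the trade-off the abstract refers to.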