Re-engineering letter-to-sound rules

Authors: Martin Jansche
Affiliations: The Ohio State University, Columbus, OH
Venue: NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Year: 2001

Abstract

Using finite-state automata for the text analysis component in a text-to-speech system is problematic in several respects: the rewrite rules from which the automata are compiled are difficult to write and maintain, and the resulting automata can become very large and therefore inefficient. Converting the knowledge represented explicitly in rewrite rules into a more efficient format is difficult. We take an indirect route, learning an efficient decision tree representation from data and tapping information contained in existing rewrite rules, which increases performance compared to learning exclusively from a pronunciation lexicon.
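
As a rough illustration of how information from existing rewrite rules might be combined with lexicon data, the sketch below builds one training instance per letter: a window of neighboring letters plus the phoneme that a rule system proposes for that letter. The abstract does not say how the rules are tapped, so treating the rule output as an extra feature is an assumption made here. All names (`toy_rule_system`, `letter_instances`, the `#` padding symbol, the feature keys) are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, List

PAD = "#"  # hypothetical word-boundary symbol


def toy_rule_system(word: str) -> List[str]:
    """Hypothetical stand-in for an existing rewrite-rule system.

    Returns one phoneme guess per letter; a real system would be far richer.
    """
    table = {"c": "k", "a": "ae", "t": "t", "s": "s"}
    return [table.get(ch, "?") for ch in word]


def letter_instances(
    word: str,
    rules: Callable[[str], List[str]],
    width: int = 2,
) -> List[Dict[str, str]]:
    """Build one feature dictionary per letter of `word`.

    Each instance holds the letters in a window of `width` positions on
    either side, plus the rule system's guess for the focus letter
    (an assumption about how rule knowledge could be exposed to a learner).
    """
    guesses = rules(word)
    padded = PAD * width + word + PAD * width
    instances = []
    for i in range(len(word)):
        feats = {"rule_guess": guesses[i]}
        for off in range(-width, width + 1):
            feats[f"letter[{off:+d}]"] = padded[i + width + off]
        instances.append(feats)
    return instances


rows = letter_instances("cat", toy_rule_system, width=1)
```

Instances of this shape, paired with the aligned phonemes from a pronunciation lexicon as class labels, could then be passed to a decision tree learner. Dropping the `rule_guess` feature would correspond to learning exclusively from the lexicon.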