Data-oriented methods for grapheme-to-phoneme conversion

  • Authors:
  • Antal van den Bosch;Walter Daelemans

  • Affiliations:
  • Tilburg University, Tilburg;Tilburg University, Tilburg

  • Venue:
  • EACL '93 Proceedings of the sixth conference on European chapter of the Association for Computational Linguistics
  • Year:
  • 1993

Quantified Score

Hi-index 0.00

Visualization

Abstract

It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. We show that using supervised learning techniques, based on a corpus of transcribed words, the same and even better performance can be achieved, without explicit modeling of linguistic knowledge.In this paper we present two instances of this approach. A first model implements a variant of instance-based learning, in which a weighed similarity metric and a database of prototypical exemplars are used to predict new mappings. In the second model, grapheme-to-phoneme mappings are looked up in a compressed text-to-speech lexicon (table lookup) enriched with default mappings. We compare performance and accuracy of these approaches to a connectionist (backpropagation) approach and to the linguistic knowledge-based approach.