Cross-lingual English Spanish tonal accent labeling using decision trees and neural networks

  • Authors:
  • David Escudero-Mancebo; Lourdes Aguilar; César González Ferreras; Carlos Vivaracho Pascual; Valentín Cardeñoso-Payo

  • Affiliations:
  • Dpt. of Computer Sciences, Universidad de Valladolid, Spain; Dpt. of Spanish Philology, Universidad Autónoma de Barcelona, Spain; Dpt. of Computer Sciences, Universidad de Valladolid, Spain; Dpt. of Computer Sciences, Universidad de Valladolid, Spain; Dpt. of Computer Sciences, Universidad de Valladolid, Spain

  • Venue:
  • NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing
  • Year:
  • 2011

Abstract

In this paper we present an experimental study of how corpus-based automatic labeling of prosodic information can be transferred from a source language to a different target language. The Spanish ESMA corpus is used to train models that identify prominent words. The models are then used to identify the accented words of the English Boston University Radio News Corpus (BURNC). The inverse process (training the models with English data and testing on the Spanish corpus) is also evaluated, and both are contrasted with the results obtained in the conventional scenario: training and testing on the same corpus. We obtained correct annotation rates of up to 82.7% in the cross-lingual experiments, which differ only slightly from the accuracy obtained in monolingual single-speaker scenarios (86.6% for Spanish and 80.5% for English). Speaker-independent monolingual recognition experiments have also been performed with the BURNC corpus, yielding cross-speaker recognition rates ranging from 69.3% to 84.2%. As these results are comparable to those obtained in the cross-lingual scenario, we conclude that the approach we propose faces challenges similar to those that arise in speaker-independent scenarios.
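
The evaluation protocol described in the abstract (train on one corpus, test on the same corpus or on the other) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the single "prominence cue" feature and the `shift` parameter are hypothetical, and a one-level decision tree (a decision stump) stands in for the decision trees and neural networks named in the title. It does not reproduce the authors' features, models, or reported figures.

```python
import random


def make_corpus(n, shift, seed):
    """Build a synthetic stand-in for a corpus (NOT real ESMA/BURNC data).

    Each item is (x, label): x is a made-up prominence cue and
    label is 1 for an accented word, 0 otherwise. `shift` mimics a
    corpus-specific offset in the cue (hypothetical).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.0) + shift
        data.append((x, label))
    return data


def accuracy(threshold, data):
    """Fraction of items where the rule 'x > threshold' matches the label."""
    correct = sum((x > threshold) == bool(y) for x, y in data)
    return correct / len(data)


def train_stump(data):
    """Fit a one-level decision tree: pick the threshold with best training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted(x for x, _ in data):
        acc = accuracy(t, data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


# Two synthetic corpora, each split into a training and a held-out test part.
corpora = {}
for name, shift, seed in [("source", 0.0, 1), ("target", 0.5, 2)]:
    data = make_corpus(500, shift, seed)
    corpora[name] = {"train": data[:400], "test": data[400:]}

# Train on each corpus, then test on both: the diagonal entries are the
# conventional same-corpus scenario, the off-diagonal ones the cross-corpus one.
results = {}
for train_name, train_split in corpora.items():
    threshold = train_stump(train_split["train"])
    for test_name, test_split in corpora.items():
        results[(train_name, test_name)] = accuracy(threshold, test_split["test"])

for (train_name, test_name), acc in results.items():
    print(f"train={train_name:6s} test={test_name:6s} accuracy={acc:.3f}")
```

Comparing the off-diagonal entries of `results` with the diagonal ones mirrors the comparison the abstract draws between cross-lingual and same-corpus annotation rates.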