Experiments for the selection of sub-word units in the Basque context for semantic tasks

  • Authors:
  • Nora Barroso;Karmele López De Ipiña;Carmen Hernández;Aitzol Ezeiza;Manuel Graña

  • Affiliations:
  • Irunweb Enterprise, Irun, Spain 20303;Grupo de Inteligencia Computacional, UPV/EHU, Donostia, Spain 20008;Grupo de Inteligencia Computacional, UPV/EHU, Donostia, Spain 20008;Grupo de Inteligencia Computacional, UPV/EHU, Donostia, Spain 20008;Grupo de Inteligencia Computacional, UPV/EHU, Donostia, Spain 20008

  • Venue:
  • International Journal of Speech Technology
  • Year:
  • 2012

Abstract

The long-term goal of our project is the development of robust ASR systems in the Basque context, where French, Spanish and Basque (a minority language) coexist. The development of ASR systems involves dealing with issues such as Acoustic Phonetic Decoding (APD), Language Modelling (LM) and the development of appropriate Language Resources (LR). These applications are therefore generally very language-dependent and require very large resources. This work focuses on the selection of appropriate sub-word units in under-resourced and noisy conditions. In particular, the work is currently oriented to Basque Broadcast News (BN), owing to the interest of digital mass media such as the trilingual Infozazpi radio (located in the French Basque Country). To reduce the negative impact that the lack of resources has on this task, we apply several data optimization methodologies based on Matrix Covariance Estimation and Ontology-based approaches.