Semantic speech recognition in the Basque context Part II: language identification for under-resourced languages

  • Authors:
  • Nora Barroso;Karmele López De Ipiña;Carmen Hernández;Aitzol Ezeiza;Manuel Graña

  • Affiliations:
  • Irunweb Enterprise, Irun, Spain 20303;Grupo de Inteligencia Computacional, Escuela Politecnica, Universidad del País Vasco/Euskal Herriko Unibertsitatea, Donostia, Spain 20008;Grupo de Inteligencia Computacional, Escuela Politecnica, Universidad del País Vasco/Euskal Herriko Unibertsitatea, Donostia, Spain 20008;Grupo de Inteligencia Computacional, Escuela Politecnica, Universidad del País Vasco/Euskal Herriko Unibertsitatea, Donostia, Spain 20008;Grupo de Inteligencia Computacional, Escuela Politecnica, Universidad del País Vasco/Euskal Herriko Unibertsitatea, Donostia, Spain 20008

  • Venue:
  • International Journal of Speech Technology
  • Year:
  • 2012

Abstract

This paper describes the development of a Language Identification (LID) system aimed at robust multilingual speech recognition in the Basque context, where three languages coexist: Basque, Spanish and French. The LID system is integrated into GorUP, a Semantic Speech Recognition system for complex industrial environments described in Part I. The work presents hybrid strategies for LID based on the selection of system elements by several classifiers (Support Vector Machines and Multilayer Perceptron) and by Discriminant Analysis improved with robust regularized covariance matrix estimation methods suited to under-resourced languages, together with stochastic methods for speech recognition tasks (Hidden Markov Models and n-grams). The LID tool manages the main elements of the Automatic Speech Recognition system (Acoustic Phonetic Decoder, Language Model and Lexicons).
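
The abstract does not say which regularized covariance estimator is used. As a rough illustration of why regularization matters when training data is scarce, the sketch below shrinks a sample covariance matrix toward a scaled identity, a common generic choice and not necessarily the authors' method. The function names, the shrinkage weight `alpha`, and the toy feature vectors are all assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def sample_covariance(rows: Matrix) -> Matrix:
    """Unbiased sample covariance of feature vectors given as rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]


def shrink_covariance(cov: Matrix, alpha: float) -> Matrix:
    """Shrink toward (trace/d) * I: (1 - alpha) * cov + alpha * mu * I.

    alpha is a hypothetical shrinkage weight in [0, 1]; the trace is preserved.
    """
    d = len(cov)
    mu = sum(cov[i][i] for i in range(d)) / d
    return [
        [(1 - alpha) * cov[i][j] + (alpha * mu if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]


# Toy case: two 3-dimensional feature vectors, so the sample covariance is singular.
features = [[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]]
cov = sample_covariance(features)
shrunk = shrink_covariance(cov, alpha=0.5)
```

With fewer samples than feature dimensions, the sample covariance has zero variance along some directions and cannot be inverted, which breaks the discriminant functions used in Discriminant Analysis. Shrinkage gives every direction a positive variance while keeping the total variance unchanged.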