Phone-segments based language identification for Spanish, Basque and English

  • Authors:
  • Víctor Guijarrubia;M. Inés Torres

  • Affiliations:
  • Departamento de Electricidad y Electrónica, Universidad del País Vasco, Bilbao, Spain;Departamento de Electricidad y Electrónica, Universidad del País Vasco, Bilbao, Spain

  • Venue:
  • CIARP'07 Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a series of language identification (LID) experiments for Spanish, Basque and English. Spanish and Basque are both official languages in the Basque Country, a region located in northern Spain. We focused our research on some techniques based on phone decoding. We propose the use of phone segments as decoding units instead of just phones. We describe a simple procedure to obtain a set of phone segments that typically appear in the languages involved. In comparison with similar techniques that do not rely on phone segments, the choice of these segments as decoding units yields a remarkable improvement in terms of LID accuracy: from 93.02% using phones to 98.32% using phone segments, when applied to trilingual read speech.