This paper presents a series of language identification (LID) experiments for Spanish, Basque and English. Spanish and Basque are both official languages in the Basque Country, a region in northern Spain. Our research focuses on techniques based on phone decoding. We propose using phone segments, rather than individual phones, as decoding units, and we describe a simple procedure for obtaining a set of phone segments that typically appear in the languages involved. Compared with similar techniques that do not rely on phone segments, this choice of decoding units substantially improves LID accuracy on trilingual read speech: from 93.02% using phones to 98.32% using phone segments.
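
To put the two accuracy figures in perspective, the short sketch below converts them into error rates and computes the relative error reduction. Only the values 93.02% and 98.32% come from the abstract; the function name `relative_error_reduction` and the choice of relative error reduction as a summary metric are illustrative assumptions, not something the abstract itself reports.

```python
def relative_error_reduction(baseline_acc: float, improved_acc: float) -> float:
    """Return the relative reduction in error rate, in percent.

    Both arguments are accuracies expressed in percent (0 to 100).
    This helper is hypothetical and exists only for this illustration.
    """
    baseline_err = 100.0 - baseline_acc   # error rate of the baseline system
    improved_err = 100.0 - improved_acc   # error rate of the improved system
    return 100.0 * (baseline_err - improved_err) / baseline_err


# Accuracies quoted in the abstract (trilingual read speech).
phone_acc = 93.02     # phones as decoding units
segment_acc = 98.32   # phone segments as decoding units

reduction = relative_error_reduction(phone_acc, segment_acc)
print(f"Error rate: {100.0 - phone_acc:.2f}% -> {100.0 - segment_acc:.2f}%")
print(f"Relative error reduction: {reduction:.1f}%")
```

Under these figures, the error rate drops from 6.98% to 1.68%, which corresponds to roughly a 76% relative reduction in identification errors.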