Experiments on automatic recognition of nonnative Arabic speech

Authors:
Yousef Ajami Alotaibi;Sid-Ahmed Selouani;Douglas O'Shaughnessy
Affiliations:
Computer Engineering Department, King Saud University, Riyadh, Saudi Arabia;Laboratoire de Recherche en Interactivité Homme Système LARIHS, Université de Moncton, New Brunswick, Canada;INRS-Energie-Matériaux-Télécommunications, Université du Québec, Montréal, Canada
Venue:
EURASIP Journal on Audio, Speech, and Music Processing - Scalable Audio-Content Analysis
Year:
2008

Abstract

The automatic recognition of foreign-accented Arabic speech is a challenging task because it involves a large number of nonnative accents. In addition, the nonnative speech data available for training are generally insufficient. Moreover, compared to other languages, Arabic has attracted relatively few research efforts. In this paper, we address the problem of nonnative speech in a speaker-independent, large-vocabulary speech recognition system for Modern Standard Arabic (MSA). We analyze some major differences at the phonetic level in order to determine which phonemes play a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and native origin. The West Point Modern Standard Arabic database from the Linguistic Data Consortium (LDC) and the Hidden Markov Model Toolkit (HTK) are used throughout all experiments. Our study shows that the best overall phoneme recognition performance is obtained when nonnative speakers are involved in both the training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than male nonnative speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by nonnative speakers, both male and female.