Automatic dialect identification of extemporaneous conversational, Latin American Spanish speech

Authors:
M. A. Zissman;T. P. Gleason;D. M. Rekart;B. L. Losiewicz
Affiliations:
Lincoln Lab., MIT, Lexington, MA, USA;-;-;-
Venue:
ICASSP '96: Conference Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
Year:
1996

Citing 0
Cited 6

Multidialectal Spanish acoustic modeling for speech recognition
Speech Communication

Spoken Arabic dialect identification using phonotactic modeling
Semitic '09 Proceedings of the EACL 2009 Workshop on Computational Approaches to Semitic Languages

Main dialect identification in Mainland China, Hong Kong and Taiwan
CCBR'11 Proceedings of the 6th Chinese conference on Biometric recognition

Human and computer recognition of regional accents and ethnic groups from British English speech
Computer Speech and Language

Native vs. non-native accent identification using Japanese spoken telephone numbers
Speech Communication

Characterizing Phonetic Transformations and Acoustic Differences Across English Dialects
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Abstract

A dialect identification technique is described that takes as input extemporaneous, conversational speech spoken in Latin American Spanish and produces as output a hypothesis of the dialect. The system has been trained to recognize Cuban and Peruvian dialects of Spanish, but could be extended easily to other dialects (and languages) as well. Building on our experience in automatic language identification, the dialect-ID system uses an English phone recognizer trained on the TIMIT corpus to tokenize training speech spoken in each Spanish dialect. Phonotactic language models generated from this tokenized training speech are used during testing to compute dialect likelihoods for each unknown message. This system has an error rate of 16% on the Cuban/Peruvian two-alternative forced-choice test. We introduce the new "Miami" Latin American Spanish speech corpus that is capable of supporting our research efforts into the future.
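
The scoring step described in the abstract, where per-dialect phonotactic language models assign a likelihood to a tokenized message, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the phone recognizer has already turned each utterance into a list of phone symbols, and it uses a bigram model with add-one smoothing, which is a choice made here for brevity and is not stated in the abstract. The labels dialect_A and dialect_B, the function names, and the toy phone strings are all hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(sequences, vocab):
    """Return a log-probability function for an add-one-smoothed bigram model.

    sequences: list of phone-token lists for one dialect (assumed to be the
    output of an upstream phone recognizer, which is not modeled here).
    vocab: set of all phone symbols the recognizer can emit.
    """
    bigrams = Counter()
    contexts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of a message
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    vocab_size = len(vocab)

    def log_prob(prev, cur):
        # Add-one smoothing keeps unseen phone pairs from scoring log(0).
        return math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))

    return log_prob


def score(log_prob, seq):
    """Log-likelihood of one tokenized message under one dialect model."""
    padded = ["<s>"] + list(seq)
    return sum(log_prob(p, c) for p, c in zip(padded, padded[1:]))


def identify(models, seq):
    """Pick the dialect whose model gives the message the highest log-likelihood."""
    return max(models, key=lambda name: score(models[name], seq))


# Toy data: invented phone strings, not drawn from any real corpus.
vocab = {"a", "s", "t", "h"}
training = {
    "dialect_A": [["a", "s", "t", "a"], ["a", "s", "a", "s"]],
    "dialect_B": [["a", "h", "t", "a"], ["a", "h", "a", "h"]],
}
models = {name: train_bigram_model(seqs, vocab) for name, seqs in training.items()}
hypothesis = identify(models, ["a", "s", "t"])
```

With two models, as in this toy setup, identify performs a two-alternative forced choice: every message is assigned to one of the two dialects, and the error rate is the fraction of test messages assigned to the wrong one.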