Universal attribute characterization of spoken languages for automatic spoken language recognition

Authors:
Sabato Marco Siniscalchi;Jeremy Reed;Torbjørn Svendsen;Chin-Hui Lee
Affiliations:
Faculty of Engineering and Architecture, Kore University of Enna, Cittadella Universitaria, Enna, Sicily, Italy;Georgia Tech Research Institute, Georgia Institute of Technology, Atlanta, GA 30332, USA;Department of Electronics and Telecommunications, Norwegian University of Science and Technology, 7491 Trondheim, Norway;School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
Venue:
Computer Speech and Language
Year:
2013

Abstract

We propose a novel universal acoustic characterization approach to spoken language recognition (LRE). The key idea is to describe any spoken language with a common set of fundamental units that can be defined "universally" across all spoken languages. In this study, speech attributes, such as manner and place of articulation, are chosen to form this unit inventory and are used to build a set of language-universal attribute models with data-driven modeling techniques. The vector space modeling approach to LRE is adopted: a spoken utterance is first decoded into a sequence of attributes independently of its language; a feature vector is then generated from co-occurrence statistics of manner or place units; and the final LRE decision is made by a vector space language classifier. Several architectural configurations are studied, and the best performance is attained using a maximal figure-of-merit language classifier. Experimental evidence not only demonstrates the feasibility of the proposed technique but also shows that it attains performance comparable to standard approaches on the LRE tasks investigated in this work when the same experimental conditions are adopted.
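
The vector space pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the manner inventory `MANNER_UNITS` is hypothetical, bigram frequencies stand in for the unspecified "co-occurrence statistics", and a cosine nearest-centroid rule stands in for the vector space language classifier (the paper reports its best results with a maximal figure-of-merit classifier, which is not reproduced here). The attribute decoder is also assumed to exist already, so the sketch starts from decoded attribute sequences.

```python
from collections import Counter
from itertools import product
import math

# Hypothetical manner-of-articulation inventory (an assumption, not the paper's unit set).
MANNER_UNITS = ["vowel", "stop", "fricative", "nasal", "glide", "silence"]

# Fixed ordering of all unit bigrams, so every utterance maps to the same dimensions.
BIGRAMS = list(product(MANNER_UNITS, repeat=2))


def bigram_vector(attribute_sequence):
    """Map a decoded attribute sequence to relative bigram frequencies."""
    counts = Counter(zip(attribute_sequence, attribute_sequence[1:]))
    total = sum(counts.values())
    if total == 0:
        # Sequences shorter than two units carry no bigram evidence.
        return [0.0] * len(BIGRAMS)
    return [counts[b] / total for b in BIGRAMS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(vector, language_centroids):
    """Return the language whose centroid is most similar to the utterance vector.

    A simple stand-in for the vector space language classifier, not MFoM.
    """
    return max(language_centroids, key=lambda lang: cosine(vector, language_centroids[lang]))


# Toy usage with made-up attribute sequences and made-up language labels.
centroids = {
    "lang_A": bigram_vector(["vowel", "stop", "vowel", "stop"]),
    "lang_B": bigram_vector(["fricative", "nasal", "fricative", "nasal"]),
}
utterance = bigram_vector(["vowel", "stop", "vowel", "nasal"])
predicted = classify(utterance, centroids)
```

The fixed bigram ordering is the main design choice here: because every utterance is projected onto the same dimensions regardless of its language, a single classifier can compare utterances from any language, which is the property the abstract attributes to a universal unit inventory.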