A comprehensive audio-visual corpus for teaching sound Persian phoneme articulation

Authors:
Azam Bastanfard;Maryam Fazel;Alireza Abdi Kelishami;Mohammad Aghaahmadi
Affiliations:
IRIB University, Tehran, Iran;IRIB University, Tehran, Iran;Department of Electrical, Computer and IT Engineering, Qazvin Azad University, Qazvin, Iran;Department of Electrical, Computer and IT Engineering, Qazvin Azad University, Qazvin, Iran
Venue:
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Year:
2009

Abstract

Building an audio-visual data corpus is a significant step in audio-visual research. Computer-aided speech therapy and language learning are among the most challenging tasks in computer science. Developing computer applications for training and rehabilitating people with disabilities, and helping people with hearing and speech impairments through facial speech synthesis, are among the most helpful state-of-the-art roles of computer technology in today's human-machine interaction systems. To date, there has been no audio-visual corpus in the Persian language, which makes it difficult or even impossible for researchers to carry out studies in this area. This paper gives an overview of the collected Persian audio-visual data corpus, AVA. AVA is a comprehensive, systematic collection of both continuous speech and isolated spoken utterances in the Persian language. The goal of this project is to facilitate audio-visual research in Persian through this data corpus, which is available upon request.