Persian Viseme Classification for Developing Visual Speech Training Application

  • Authors:
  • Azam Bastanfard, Mohammad Aghaahmadi, Alireza Abdi Kelishami, Maryam Fazel, Maedeh Moghadam

  • Affiliations:
  • Computer Engineering Faculty, Islamic Azad University of Karaj, Karaj, Iran (A. Bastanfard); Computer Engineering Faculty, Islamic Azad University of Qazvin, Qazvin, Iran (all other authors)

  • Venue:
  • PCM '09 Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
  • Year:
  • 2009


Abstract

Viseme classification and analysis in a language is among the most important preliminaries for multimedia research such as talking heads, lip reading, lip synchronization, and computer-assisted pronunciation training. Because viseme classification and analysis is language dependent, the visemes of each language are classified according to the target applications. To date, no such study has been carried out for the Persian language, which has made research on Persian AVSR systems and lip synchronization largely infeasible. In this paper, we propose a novel image-based method for grouping Persian visemes that takes the coarticulation effect into account. For each phoneme, the central frame is selected from several images representing its different positions in various syllables. Having obtained the eigenlips of each phoneme, we project each viseme onto every other viseme's eigenspace, and the weight values resulting from reconstruction serve as the criterion for viseme similarity. The experimental results indicate that the proposed algorithm is precise and robust.
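The projection-and-reconstruction comparison described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the toy random "frames", the choice of four eigenlips, and the use of residual reconstruction error as the similarity criterion (one common variant of comparing reconstruction weights) are all assumptions.

```python
import numpy as np

def eigenlips(frames, k=4):
    """Compute the mean frame and top-k eigenlips (principal components)
    of a set of lip-region frames, each flattened to a 1-D vector."""
    X = np.asarray(frames, dtype=float)
    mean = X.mean(axis=0)
    # SVD of the mean-centred data; rows of Vt are principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def reconstruction_error(frames, mean, basis):
    """Project frames onto another viseme's eigenspace and return the mean
    residual norm; a low error suggests the two visemes look alike."""
    X = np.asarray(frames, dtype=float) - mean
    weights = X @ basis.T            # projection coefficients (weights)
    residual = X - weights @ basis   # part the eigenspace cannot explain
    return np.linalg.norm(residual, axis=1).mean()

# Toy demo with random vectors standing in for flattened lip images.
rng = np.random.default_rng(0)
a = rng.normal(size=(20, 64))             # frames of viseme A
b = a + 0.05 * rng.normal(size=a.shape)   # a near-identical viseme
c = rng.normal(size=(20, 64))             # an unrelated viseme

mean_a, basis_a = eigenlips(a)
# The look-alike viseme reconstructs better in A's eigenspace.
assert reconstruction_error(b, mean_a, basis_a) < reconstruction_error(c, mean_a, basis_a)
```

In this sketch, two visemes would be grouped together when the cross-eigenspace reconstruction error falls below some threshold; the paper's actual grouping rule and distance measure may differ.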