The Persian linguistic based audio-visual data corpus, AVA II, considering coarticulation

  • Authors:
  • Azam Bastanfard;Maryam Fazel;Alireza Abdi Kelishami;Mohammad Aghaahmadi

  • Affiliations:
  • Information Technology Research Group, Department of Engineering, Islamic Azad University Karaj branch, Iran;Islamic Republic of Iran Broadcast University, Tehran, Iran;Department of Electrical, Computer and IT Engineering, Qazvin Islamic Azad University, Qazvin, Iran;Department of Electrical, Computer and IT Engineering, Qazvin Islamic Azad University, Qazvin, Iran

  • Venue:
  • MMM'10 Proceedings of the 16th international conference on Advances in Multimedia Modeling
  • Year:
  • 2010

Abstract

Collecting an audio-visual data corpus based on linguistic rules is an essential step for major research in multimedia fields such as audio-visual speech recognition (AVSR), lip synchronization, and visual speech synthesis. Building a reliable data corpus that covers all phonemes in all phonemic combinations of a language is a difficult and time-consuming task. To partially address this problem, this research uses VC, CV, and VCV combinations, which carry the most language information, instead of all possible phonemic combinations. This paper describes the new data corpus, recorded from 14 respondents. To better capture the coarticulation effect in speech, continuous speech was considered beyond isolated and continuous digits. This saves time and cost in the collection process while keeping efficiency high.