Facial expression recognition of a speaker using vowel judgment and thermal image processing

  • Authors:
  • Yasunari Yoshitomi; Taro Asada; Kyouhei Shimada; Masayoshi Tabuse

  • Affiliations:
  • Division of Environmental Sciences, Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Shimogamo, Sakyo-ku, Kyoto, Japan 606-8522; Division of Environmental Sciences, Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Shimogamo, Sakyo-ku, Kyoto, Japan 606-8522; Nova System Co., Kita-ku, Osaka, Japan 530-0005; Division of Environmental Sciences, Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Shimogamo, Sakyo-ku, Kyoto, Japan 606-8522

  • Venue:
  • Artificial Life and Robotics
  • Year:
  • 2011

Abstract

We have previously developed a method for recognizing the facial expression of a speaker. For facial expression recognition, the method selects three images: (i) just before speaking, (ii) while speaking the first vowel, and (iii) while speaking the last vowel of an utterance. Using the speech recognition system Julius, static thermal images are saved at three timing positions: just before speaking, and while the phonemes of the first and last vowels are being spoken. To implement our method, we recorded three subjects speaking 25 Japanese first names, which provided all combinations of the first and last vowels. These recordings were used to prepare first the training data and then the test data. Julius sometimes misrecognizes the first and/or last vowel(s). For example, /a/ for the first vowel is sometimes misrecognized as /i/. In the training data, we corrected these misrecognitions. However, such correction cannot be carried out on the test data. In the implementation of our method, the facial expressions of the three subjects were distinguished with a mean accuracy of 79.8% when they intentionally exhibited one of the facial expressions "angry," "happy," "neutral," "sad," and "surprised." The mean accuracy of vowel recognition by Julius was 84.1%.
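
The frame-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the speech recognizer's output has already been converted into a list of phoneme segments with start and end times; the `Segment` and `KeyTimes` types, the `select_key_times` function, the 0.1 s lead time before speech onset, and the choice of each vowel's midpoint as the capture time are all hypothetical choices made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple

# The five Japanese vowels, which give the 5 x 5 = 25 first/last combinations.
VOWELS = {"a", "i", "u", "e", "o"}


class Segment(NamedTuple):
    """One recognized phoneme with its time span in seconds (hypothetical format)."""
    phoneme: str
    start: float
    end: float


class KeyTimes(NamedTuple):
    """Capture times for the three thermal images, plus the judged vowel pair."""
    before_speech: float
    first_vowel: float
    last_vowel: float
    vowel_pair: Tuple[str, str]


def select_key_times(segments: List[Segment], lead: float = 0.1) -> Optional[KeyTimes]:
    """Pick the three capture times from a phoneme segmentation.

    `lead` (seconds before speech onset) and the use of vowel midpoints
    are assumptions of this sketch, not values taken from the paper.
    """
    vowels = [s for s in segments if s.phoneme in VOWELS]
    if not vowels:
        return None  # no vowel recognized, so no frames can be selected

    first, last = vowels[0], vowels[-1]
    speech_onset = segments[0].start
    return KeyTimes(
        before_speech=max(0.0, speech_onset - lead),
        first_vowel=(first.start + first.end) / 2,
        last_vowel=(last.start + last.end) / 2,
        vowel_pair=(first.phoneme, last.phoneme),
    )


# Usage: a made-up segmentation of the name "taro".
taro = [
    Segment("t", 0.50, 0.58),
    Segment("a", 0.58, 0.70),
    Segment("r", 0.70, 0.76),
    Segment("o", 0.76, 0.90),
]
key_times = select_key_times(taro)
```

Because the vowel pair is taken directly from the recognizer's output, a misrecognized vowel (for example /a/ read as /i/) changes the pair in this sketch, which corresponds to the uncorrected test-data case the abstract describes.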