Automatic childhood autism detection by vocalization decomposition with phone-like units

  • Authors:
  • Dongxin Xu; Jeffrey A. Richards; Jill Gilkerson; Umit Yapanel; Sharmistha Gray; John Hansen

  • Affiliations:
  • LENA Foundation, Boulder, CO (Xu, Richards, Gilkerson, Yapanel, Gray); University of Texas at Dallas, Richardson, TX (Hansen)

  • Venue:
  • Proceedings of the 2nd Workshop on Child, Computer and Interaction
  • Year:
  • 2009

Abstract

Autism is a major child development disorder with a prevalence of 1/150 in the US [22]. Although early identification is crucial to early intervention, few efficient screening tools are currently in clinical use. This study reports a fully automatic mechanism for child autism detection/screening using the LENA™ (Language ENvironment Analysis) System, which applies speech signal processing technology to analyze and monitor a child's natural language environment and the child's vocalizations/speech. We previously reported preliminary results in [19] using child vocalization composition information generated automatically by the LENA System with an adult phone model. This paper extends that work in three ways: the dataset is enlarged, a new child vocalization decomposition is introduced based on k-means clusters derived directly from the child vocalizations, and the new decomposition is combined with the previous one. The experiments and comparisons consistently show that the child vocalization composition contains rich discriminant information for autism detection. They also show that the composition features generated with the adult phone model and with the child clusters perform similarly when used individually and complement each other when combined. The combined feature set significantly reduces the error rate, with relative error reductions of 21.7% at the recording level and 16.8% at the child level, achieving detection accuracies of 87.4% for recordings and 90.6% for children at the equal-error-rate points.
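
To make the child-cluster decomposition and the equal-error-rate evaluation mentioned in the abstract concrete, the following is a minimal sketch, not the LENA System's implementation: acoustic frames from child vocalizations are clustered with k-means to define phone-like units, each recording is represented by its normalized unit histogram (its "composition"), and a simple detector is scored at the equal-error-rate point. The frame features (MFCC-like), the cluster count of 50, and the logistic-regression classifier are all illustrative assumptions not specified by the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

N_CLUSTERS = 50  # assumed number of child-derived phone-like units

def composition_features(recordings, kmeans):
    """Each recording is an array of shape [n_frames, n_dims] of acoustic
    features (e.g., MFCCs). Return one normalized histogram of cluster
    labels per recording: its 'composition' over phone-like units."""
    feats = []
    for frames in recordings:
        labels = kmeans.predict(frames)
        hist = np.bincount(labels, minlength=N_CLUSTERS).astype(float)
        feats.append(hist / max(hist.sum(), 1.0))
    return np.vstack(feats)

def equal_error_rate(y_true, scores):
    """Error rate at the operating point where the false-positive and
    false-negative rates are (approximately) equal."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    fnr = 1.0 - tpr
    i = np.nanargmin(np.abs(fnr - fpr))
    return 0.5 * (fpr[i] + fnr[i])

# Toy usage with synthetic frames (13-dim, MFCC-like) and random labels.
rng = np.random.default_rng(0)
train = [rng.normal(size=(int(rng.integers(50, 200)), 13)) for _ in range(40)]
labels = rng.integers(0, 2, size=40)  # 1 = autism, 0 = typically developing

# Step 1: learn child clusters directly from pooled training frames.
kmeans = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0)
kmeans.fit(np.vstack(train))

# Step 2: composition features, then a simple linear detector.
X = composition_features(train, kmeans)
clf = LogisticRegression(max_iter=1000).fit(X, labels)
scores = clf.predict_proba(X)[:, 1]
print(f"training-set EER: {equal_error_rate(labels, scores):.3f}")
```

In the paper's combined setting, an analogous composition vector derived from an adult phone model would be concatenated with the child-cluster histogram before classification; the sketch above covers only the child-cluster branch.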