Capturing errors in written Chinese words

  • Authors:
  • Chao-Lin Liu; Kan-Wen Tien; Min-Hua Lai; Yi-Hsuan Chuang; Shih-Hung Wu

  • Affiliations:
  • National Chengchi University; National Chengchi University; National Chengchi University; National Chengchi University; Chaoyang University of Technology, Taiwan

  • Venue:
  • ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
  • Year:
  • 2009

Abstract

A collection of 3208 reported errors in written Chinese words was analyzed. Of these, 7.2% involved rarely used characters, and 98.4% were assigned common classifications of their causes by human subjects. In particular, 80% of the errors observed in the writings of middle school students were related to pronunciation and 30% were related to the composition of words. Experimental results show that using intuitive Web-based statistics helped us capture only about 75% of these errors. In a related task, Web-based statistics were useful for recommending incorrect characters when composing test items for "incorrect character identification" tests about 93% of the time.