Modeling human coding of free response data

  • Authors:
  • Shahram Ghiasinejad;Richard M. Golden

  • Affiliations:
  • Department of Psychology, University of Central Florida, Orlando, FL 32816, United States;School of Behavioral and Brain Sciences, University of Texas at Dallas, United States

  • Venue:
  • Computers in Human Behavior
  • Year:
  • 2013

Abstract

Summarization, recall, think-aloud, and question-answering protocol data are examples of free response verbal reports used for the purposes of revealing the structure and content of internal mental representations and processes within the field of discourse processes. Typically, two experienced coders independently semantically annotate a portion of collected protocol data and measures of agreement are used to determine the reliability of the coding. This methodology, however, does not provide an effective method for communicating in an unambiguous manner complex coding procedures to other researchers. To address this problem, an automated methodology called AUTOCODER for coding free response data is evaluated. The AUTOCODER system works by actively interacting with an experienced human coder who semantically annotates key words with "word-concepts" and sequences of word-concepts with "propositions". After training AUTOCODER on a set of 70 segmented and semantically annotated free response verbal reports originally generated by second grade and fifth grade students, AUTOCODER exhibited a good proposition agreement rate of 91% and a kappa agreement score of 65% with respect to an experienced human coder on an additional set of 24 unsegmented free response verbal reports. Limitations and general implications of these findings are also discussed.
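
The two reliability figures quoted in the abstract, an agreement rate and a kappa score, can be illustrated with a short sketch. The abstract does not say which kappa variant was used or how propositions were aligned between coders, so the code below assumes Cohen's kappa computed over two equal-length label sequences (one label per coded segment). The function names and the toy labels are hypothetical and are not taken from the paper or from AUTOCODER.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Fraction of items on which the two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    matches = sum(x == y for x, y in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Assumption: this is the kappa variant meant in the abstract; the
    source does not specify it.
    """
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = counts_a.keys() | counts_b.keys()
    # Chance agreement: probability both coders pick the same label
    # if each labels independently according to their own marginals.
    p_expected = sum(counts_a[lab] * counts_b[lab] for lab in labels) / (n * n)
    if p_expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy, made-up labels standing in for proposition codes.
human = ["P1", "P1", "P2", "P2", "NONE", "P1"]
machine = ["P1", "P1", "P2", "P1", "NONE", "P1"]
print(f"agreement = {percent_agreement(human, machine):.2f}")
print(f"kappa     = {cohen_kappa(human, machine):.2f}")
```

Because kappa discounts agreement expected by chance, it is typically lower than the raw agreement rate, which is consistent with the gap between the 91% and 65% figures reported above.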