Discovering genres of online discussion threads via text mining

  • Authors:
  • Fu-Ren Lin;Lu-Shih Hsieh;Fu-Tai Chuang

  • Affiliations:
  • Institute of Technology Management, National Tsing Hua University, Section 2, Kuang-Fu Road, Hsinchu 300, Taiwan;Department of Information Management, National Sun Yat-Sen University, 70 Lien-Hai Road, Kaohsiung 804, Taiwan;Department of Earth Sciences, National Taiwan Normal University, No. 88, Section 4, Tingzhou Road, Wenshan District, Taipei City 116, Taiwan

  • Venue:
  • Computers & Education
  • Year:
  • 2009

Abstract

Course management systems (CMS) are gaining popularity in facilitating teaching, and the forum is a key component that supports interaction among students and teachers. Content analysis is the most popular way to study a discussion forum, but it is labor-intensive: the coding process relies heavily on manual interpretation and consumes considerable time and energy. In an asynchronous virtual learning environment, an instructor needs to monitor the discussion forum regularly in order to maintain the quality of discussion. However, doing so is time-consuming and difficult for instructors, especially K-12 teachers. This research proposes a genre classification system, called GCS, to automate the coding process. We treat the coding process as a document classification task and address it with data mining techniques. The genre of a posting can be an announcement, a question, a clarification, an interpretation, a conflict, an assertion, etc. This research examines the coding coherence between GCS and experts' judgment in terms of recall and precision, and discusses how we adjust the parameters of GCS to improve that coherence. Based on the empirical results, GCS adopts a cascade classification model to achieve automatic coding. An empirical evaluation of the genres classified from a repository of postings in an online earth science course at a senior high school shows that GCS can effectively facilitate the coding process, and that the proposed cascade model can deal with the imbalanced distribution of discussion postings. These results imply that GCS based on the cascade model can serve as an automatic posting coding system.
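
The abstract evaluates the agreement between automatic and expert coding in terms of precision and recall. As a rough illustration only, the sketch below computes both measures per genre from two parallel lists of labels. The function name `per_genre_precision_recall` and the sample labels are hypothetical and are not taken from the paper; the paper's actual evaluation procedure may differ.

```python
from collections import Counter


def per_genre_precision_recall(expert_labels, predicted_labels):
    """Return {genre: (precision, recall)} for two parallel label lists.

    Hypothetical helper for illustration; not the GCS implementation.
    """
    if len(expert_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_positive = Counter()    # postings where both coders agree on the genre
    predicted_count = Counter()  # postings the classifier assigned to the genre
    expert_count = Counter()     # postings the experts assigned to the genre

    for expert, predicted in zip(expert_labels, predicted_labels):
        expert_count[expert] += 1
        predicted_count[predicted] += 1
        if expert == predicted:
            true_positive[expert] += 1

    scores = {}
    for genre in sorted(set(expert_count) | set(predicted_count)):
        # Precision: share of the classifier's assignments that experts confirm.
        precision = (
            true_positive[genre] / predicted_count[genre]
            if predicted_count[genre]
            else 0.0
        )
        # Recall: share of the experts' assignments that the classifier found.
        recall = (
            true_positive[genre] / expert_count[genre]
            if expert_count[genre]
            else 0.0
        )
        scores[genre] = (precision, recall)
    return scores


# Made-up labels for five postings, used only to exercise the function.
expert = ["question", "question", "announcement", "assertion", "question"]
predicted = ["question", "assertion", "announcement", "assertion", "question"]
scores = per_genre_precision_recall(expert, predicted)
for genre, (precision, recall) in scores.items():
    print(f"{genre}: precision={precision:.2f} recall={recall:.2f}")
```

Reporting the two measures per genre, rather than a single overall accuracy, keeps rare genres visible, which matters when postings are unevenly distributed across genres as the abstract notes.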