Building a Korean web corpus for analyzing learner language

  • Authors:
  • Markus Dickinson; Ross Israel; Sun-Hee Lee

  • Affiliations:
  • Indiana University; Indiana University; Wellesley College

  • Venue:
  • WAC-6 '10 Proceedings of the NAACL HLT 2010 Sixth Web as Corpus Workshop
  • Year:
  • 2010

Abstract

Post-positional particles are a significant source of errors for learners of Korean. Following methodology that has proven effective in handling English preposition errors, we are beginning the process of building a machine learner for particle error detection in L2 Korean writing. As a first step, however, we must acquire data, and thus we present a methodology for constructing large-scale corpora of Korean from the Web, exploring the feasibility of building corpora appropriate for a given topic and grammatical construction.