Text chunking by combining hand-crafted rules and memory-based learning

  • Authors:
  • Seong-Bae Park;Byoung-Tak Zhang

  • Affiliations:
  • Seoul National University, Seoul, Korea;Seoul National University, Seoul, Korea

  • Venue:
  • ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper proposes a hybrid of hand-crafted rules and a machine learning method for chunking Korean. In the partially free word-order languages such as Korean and Japanese, a small number of rules dominate the performance due to their well-developed postpositions and endings. Thus, the proposed method is primarily based on the rules, and then the residual errors are corrected by adopting a memory-based machine learning method. Since the memory-based learning is an efficient method to handle exceptions in natural language processing, it is good at checking whether the estimates are exceptional cases of the rules and revising them. An evaluation of the method yields the improvement in F-score over the rules or various machine learning methods alone.