Chinese chunking with tri-training learning

  • Authors:
  • Wenliang Chen; Yujie Zhang; Hitoshi Isahara

  • Affiliations:
  • Computational Linguistics Group, National Institute of Information and Communications Technology, Kyoto, Japan;Computational Linguistics Group, National Institute of Information and Communications Technology, Kyoto, Japan;Computational Linguistics Group, National Institute of Information and Communications Technology, Kyoto, Japan

  • Venue:
  • ICCPOL'06 Proceedings of the 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead
  • Year:
  • 2006

Abstract

This paper presents a practical tri-training method for Chinese chunking that uses a small amount of labeled training data and a much larger pool of unlabeled data. We propose a novel selection method for tri-training in which newly labeled sentences are selected by comparing the agreement among three classifiers. Specifically, in each iteration, a new sample is selected for a classifier if the other two classifiers agree on its labels while that classifier itself disagrees. We compare the proposed tri-training approach with a co-training approach on the UPenn Chinese Treebank V4.0 (CTB4). The experimental results show that the proposed approach significantly improves performance.
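
The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `select_for_classifier`, the representation of predictions as tuples of chunk tags, and the choice to test agreement on the whole tag sequence of a sentence are all assumptions made for illustration, since the abstract does not specify these details.

```python
from typing import List, Sequence, Tuple

# One chunk tag per token, e.g. ("B-NP", "I-NP"). The tag names are illustrative.
Labels = Tuple[str, ...]


def select_for_classifier(
    target: int,
    predictions: Sequence[Sequence[Labels]],
) -> List[Tuple[int, Labels]]:
    """Pick newly labeled sentences for classifier `target` (0, 1, or 2).

    predictions[k][i] is classifier k's tag sequence for unlabeled sentence i.
    A sentence is selected when the other two classifiers produce the same
    tag sequence and the target classifier produces a different one.
    Assumption: agreement is checked on the full sentence, not per token.
    """
    first, second = [k for k in range(3) if k != target]
    selected = []
    for i, own in enumerate(predictions[target]):
        agreed = predictions[first][i]
        if agreed == predictions[second][i] and own != agreed:
            # The agreed labeling becomes a new training example for `target`.
            selected.append((i, agreed))
    return selected


if __name__ == "__main__":
    # Hypothetical predictions from three classifiers on three sentences.
    preds = [
        [("B-NP", "I-NP"), ("B-NP", "B-VP"), ("B-VP", "O")],
        [("B-NP", "I-NP"), ("B-NP", "I-NP"), ("B-NP", "O")],
        [("B-NP", "I-NP"), ("B-NP", "I-NP"), ("B-ADJP", "O")],
    ]
    for k in range(3):
        print(k, select_for_classifier(k, preds))
```

In this toy input only the second sentence qualifies, and only for classifier 0: the other two classifiers agree on it while classifier 0 differs. Sentences on which all three agree, or on which the other two disagree with each other, are skipped.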