Accurate learning for Chinese function tags from minimal features

  • Authors:
  • Caixia Yuan;Fuji Ren;Xiaojie Wang

  • Affiliations:
  • The University of Tokushima, Tokushima, Japan and Beijing University of Posts and Telecommunications, Beijing, China;The University of Tokushima, Tokushima, Japan and Beijing University of Posts and Telecommunications, Beijing, China;Beijing University of Posts and Telecommunications, Beijing, China

  • Venue:
  • ACLstudent '09 Proceedings of the ACL-IJCNLP 2009 Student Research Workshop
  • Year:
  • 2009

Abstract

Data-driven function tag assignment has been studied for English using Penn Treebank data. In this paper, we address the question of whether such methods can be applied to other languages and treebank resources. In addition to extending previous methods from English to Chinese, we propose an effective way to recognize function tags directly from lexical information, which scales easily to languages that lack sufficient parsing resources or pose inherent linguistic challenges for parsing. We investigate a supervised sequence learning method to automatically recognize function tags, which achieves an F-score of 0.938 on gold-standard POS (part-of-speech) tagged Chinese text, a statistically significant improvement over existing Chinese function label assignment systems. Results show that a small number of linguistically motivated lexical features are sufficient to achieve performance comparable to systems that use sophisticated parse trees.
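For readers unfamiliar with the reported metric, the F-score is the harmonic mean of precision and recall. The following is a minimal sketch of how it could be computed over per-token function tags. It is not the authors' evaluation script: the function name `f_score`, the placeholder label `"O"` for tokens that carry no function tag, and the toy tag sequences are all assumptions made for illustration.

```python
def f_score(gold, predicted, ignore="O"):
    """Return (precision, recall, F1) over per-token function tags.

    `ignore` is a hypothetical placeholder label for tokens with no
    function tag; such tokens are excluded from the counts.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # A prediction counts as correct when it matches a real gold tag.
    correct = sum(1 for g, p in zip(gold, predicted) if g == p and g != ignore)
    n_predicted = sum(1 for p in predicted if p != ignore)
    n_gold = sum(1 for g in gold if g != ignore)

    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_gold if n_gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (invented data, not from the paper).
gold = ["SBJ", "O", "OBJ", "TMP"]
pred = ["SBJ", "O", "LOC", "O"]
precision, recall, f1 = f_score(gold, pred)
print(precision, recall, f1)
```

In the toy example, one of two predicted tags is correct and one of three gold tags is recovered, so precision is 1/2 and recall is 1/3.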