Accurate learning for Chinese function tags from minimal features

Authors:
Caixia Yuan;Fuji Ren;Xiaojie Wang
Affiliations:
The University of Tokushima, Tokushima, Japan and Beijing University of Posts and Telecommunications, Beijing, China;The University of Tokushima, Tokushima, Japan and Beijing University of Posts and Telecommunications, Beijing, China;Beijing University of Posts and Telecommunications, Beijing, China
Venue:
ACLstudent '09 Proceedings of the ACL-IJCNLP 2009 Student Research Workshop
Year:
2009

Abstract

Data-driven function tag assignment has been studied for English using Penn Treebank data. In this paper, we address the question of whether such methods can be applied to other languages and treebank resources. In addition to extending the previous method from English to Chinese, we propose an effective way to recognize function tags directly from lexical information, which scales easily to languages that lack sufficient parsing resources or pose inherent linguistic challenges for parsing. We investigate a supervised sequence learning method to automatically recognize function tags, which achieves an F-score of 0.938 on gold-standard POS (Part-of-Speech) tagged Chinese text, a statistically significant improvement over existing Chinese function label assignment systems. The results show that a small number of linguistically motivated lexical features are sufficient to achieve performance comparable to systems that use sophisticated parse trees.
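
As a rough illustration of what treating function tagging as sequence labeling over lexical features could look like, the sketch below builds word and POS features from a small context window and scores a predicted tag sequence with an F-score. It is not the authors' implementation: the feature template, the window size, the token-level tag encoding with a null tag `O`, and the helper names `token_features` and `f_score` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def token_features(sent: List[Tuple[str, str]], i: int, window: int = 1) -> Dict[str, str]:
    """Collect word and POS features around token i (hypothetical template)."""
    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(sent):
            word, pos = sent[j]
        else:
            # Positions outside the sentence get a padding symbol.
            word, pos = "<PAD>", "<PAD>"
        feats[f"w[{offset}]"] = word
        feats[f"pos[{offset}]"] = pos
    return feats


def f_score(gold: List[str], pred: List[str], null_tag: str = "O") -> float:
    """Token-level F-score over non-null tags (assumed evaluation scheme)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != null_tag)
    if tp == 0:
        return 0.0
    precision = tp / sum(1 for p in pred if p != null_tag)
    recall = tp / sum(1 for g in gold if g != null_tag)
    return 2 * precision * recall / (precision + recall)


# Toy POS-tagged sentence; the tags SBJ/OBJ are used purely for illustration.
sentence = [("他", "PN"), ("喜欢", "VV"), ("音乐", "NN")]
features = [token_features(sentence, i) for i in range(len(sentence))]
score = f_score(["SBJ", "O", "OBJ"], ["SBJ", "O", "O"])
```

Feature dictionaries of this kind could be passed to any supervised sequence learner; the choice of learner is left open here because the abstract does not name one.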