Data-driven function tag assignment has been studied for English using Penn Treebank data. In this paper, we address the question of whether such a method can be applied to other languages and treebank resources. In addition to extending the previous method from English to Chinese, we propose an effective way to recognize function tags directly from lexical information, which scales easily to languages that lack sufficient parsing resources or pose inherent linguistic challenges for parsing. We investigate a supervised sequence learning method to automatically recognize function tags, which achieves an F-score of 0.938 on gold-standard POS (Part-of-Speech) tagged Chinese text, a statistically significant improvement over existing Chinese function label assignment systems. Results show that a small number of linguistically motivated lexical features are sufficient to achieve performance comparable to that of systems using sophisticated parse trees.
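
To make the idea of recognizing function tags "directly from lexical information" concrete, the following is a minimal sketch of how per-token features might be built from words and gold-standard POS tags before being passed to a sequence learner. The abstract does not list the actual feature set, so the window size, the feature names, the `<PAD>` symbol, and the example sentence are all assumptions chosen for illustration, not the authors' design.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); gold-standard POS is assumed


def window_features(tokens: List[Token], i: int, size: int = 1) -> Dict[str, str]:
    """Collect word and POS features in a +/-size window around token i.

    Hypothetical feature template: positions outside the sentence are
    filled with a "<PAD>" placeholder.
    """
    feats: Dict[str, str] = {}
    for offset in range(-size, size + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            word, pos = tokens[j]
        else:
            word, pos = "<PAD>", "<PAD>"
        feats[f"w[{offset}]"] = word
        feats[f"pos[{offset}]"] = pos
    return feats


def sentence_features(tokens: List[Token]) -> List[Dict[str, str]]:
    """One feature dictionary per token, in sentence order."""
    return [window_features(tokens, i) for i in range(len(tokens))]


# Illustrative input only: a three-token sentence with POS tags.
sentence = [("他", "PN"), ("昨天", "NT"), ("来", "VV")]
features = sentence_features(sentence)
```

Each dictionary in `features` would be paired with a function-tag label for its token during training; no parse tree is consulted at any point, which is the property the abstract emphasizes.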