A clause-level hybrid approach to Chinese empty element recovery

Authors:
Fang Kong;Guodong Zhou
Affiliations:
School of Computer Science and Technology, Soochow University, China and Department of Computer Science, National University of Singapore, Singapore;School of Computer Science and Technology, Soochow University, China
Venue:
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Year:
2013

Citing 13
Cited 0

Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
The Penn Chinese TreeBank: Phrase structure annotation of a large corpus

Natural Language Engineering
A simple pattern-matching algorithm for recovering empty nodes and their antecedents

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Antecedent recovery: experiments with a trace tagger

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Long-distance dependency resolution in automatically acquired wide-coverage PCFG-based LFG approximations

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Using linguistic principles to recover empty categories

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Fully parsing the Penn Treebank

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Improving nominal SRL in Chinese language with verbal SRL information and automatic predicate recognition

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
Effects of empty categories on machine translation

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Chasing the ghost: recovering empty categories in the Chinese treebank

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Language-independent parsing with empty elements

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Chinese sentence segmentation as comma classification

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2

Quantified Score

Hi-index	0.00

Visualization

Abstract

Empty elements (EEs) play a critical role in Chinese syntactic, semantic and discourse analysis. Previous studies employ a language-independent sentence-level approach to EE recovery, by casting it as a linear tagging or structured parsing problem. In comparison, this paper proposes a clause-level hybrid approach to address specific problems in Chinese EE recovery, which recovers EEs in Chinese language from the clause perspective and integrates the advantages of both linear tagging and structured parsing. In particular, a comma disambiguation method is employed to improve syntactic parsing and help determine clauses in Chinese. In this way, the noise introduced by sentence-level syntactic parsing and multiple EEs in the same position of a linear sentence can be well addressed. Evaluation on Chinese Treebank 6.0 shows the significant performance improvement of our clause-level hybrid approach over the state-of-the-art sentence-level baselines, and its great impact on a state-of-the-art Chinese syntactic parser.