Empty elements (EEs) play a critical role in Chinese syntactic, semantic, and discourse analysis. Previous studies take a language-independent, sentence-level approach to EE recovery, casting it as either a linear tagging problem or a structured parsing problem. In contrast, this paper proposes a clause-level hybrid approach that targets problems specific to Chinese EE recovery: it recovers EEs at the clause level and combines the advantages of linear tagging and structured parsing. In particular, a comma disambiguation method is used to improve syntactic parsing and to help identify clauses in Chinese. This addresses two problems: the noise introduced by sentence-level syntactic parsing, and multiple EEs occurring at the same position in a linear sentence. Evaluation on Chinese Treebank 6.0 shows that the clause-level hybrid approach significantly outperforms state-of-the-art sentence-level baselines and has a large impact on a state-of-the-art Chinese syntactic parser.