Improving dependency parsing with subtrees from auto-parsed data

Authors:
Wenliang Chen;Jun'ichi Kazama;Kiyotaka Uchimoto;Kentaro Torisawa
Affiliations:
National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan;National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan
Venue:
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Year:
2009

Abstract

This paper presents a simple and effective approach to improving dependency parsing by using subtrees from auto-parsed data. First, we use a baseline parser to parse large-scale unannotated data. Then we extract subtrees from the dependency parse trees in the auto-parsed data. Finally, we construct new subtree-based features for parsing algorithms. To demonstrate the effectiveness of the proposed approach, we present experimental results on the English Penn Treebank and the Chinese Penn Treebank. The results show that our approach significantly outperforms baseline systems. It achieves the best accuracy on the Chinese data and accuracy competitive with the best known systems on the English data.
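
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the subtree shapes or the feature design, so the sketch assumes the simplest case, in which a subtree is a single head-dependent word pair and the feature is a coarse frequency bucket. The sentence format, the function names, the threshold `high`, and the bucket labels are all hypothetical, and the baseline parser is assumed to have already produced the head indices.

```python
from collections import Counter
from typing import List, Tuple

# Assumed (hypothetical) format for one auto-parsed sentence: a list of
# (word, head_index) pairs, where head_index is the 0-based position of the
# word's head and -1 marks the root.
Sentence = List[Tuple[str, int]]
# Assumed subtree shape: (head word, dependent word, side of the dependent).
Subtree = Tuple[str, str, str]


def extract_bigram_subtrees(sentence: Sentence) -> List[Subtree]:
    """Return one head-dependent subtree per non-root word."""
    subtrees = []
    for i, (word, head) in enumerate(sentence):
        if head < 0:
            continue  # the root has no head, so it yields no subtree
        side = "L" if i < head else "R"  # dependent is left or right of its head
        subtrees.append((sentence[head][0], word, side))
    return subtrees


def count_subtrees(auto_parsed: List[Sentence]) -> Counter:
    """Count how often each subtree occurs in the auto-parsed data."""
    counts: Counter = Counter()
    for sentence in auto_parsed:
        counts.update(extract_bigram_subtrees(sentence))
    return counts


def subtree_feature(head: str, dep: str, side: str,
                    counts: Counter, high: int = 2) -> str:
    """Map a candidate arc to a frequency-bucket feature string.

    The threshold `high` and the bucket labels are illustrative assumptions.
    """
    c = counts.get((head, dep, side), 0)
    if c == 0:
        level = "UNSEEN"
    elif c >= high:
        level = "HIGH"
    else:
        level = "LOW"
    return "subtree_freq=" + level
```

In such a setup, a parser scoring a candidate arc would call `subtree_feature` for that arc and add the returned string to its existing feature set. Bucketing the counts rather than using them raw is one way to limit the effect of noise in the auto-parsed data.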