Improving implicit discourse relation recognition through feature set optimization

Authors:
Joonsuk Park; Claire Cardie
Affiliations:
Cornell University, Ithaca, NY; Cornell University, Ithaca, NY
Venue:
SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Year:
2012

Abstract

We provide a systematic study of previously proposed features for implicit discourse relation identification, identifying new feature combinations that optimize F1-score. The resulting classifiers achieve the best F1-scores to date for the four top-level discourse relation classes of the Penn Discourse Treebank: COMPARISON, CONTINGENCY, EXPANSION, and TEMPORAL. We further identify factors for feature extraction that can have a major impact on performance and determine that some features originally proposed for the task no longer provide performance gains in light of more powerful, recently discovered features. Our results constitute a new set of baselines for future studies of implicit discourse relation identification.
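
The abstract does not say how the feature combinations were searched, so the following is only an illustrative sketch of one common approach: greedy forward selection over feature types, scored by F1 on a held-out set. The function names, the `evaluate` callback, and the placeholder feature names in the usage note are hypothetical and are not taken from the paper.

```python
from typing import Callable, FrozenSet, Iterable, Sequence, Tuple


def f1_score(gold: Sequence[int], predicted: Sequence[int], positive: int = 1) -> float:
    """F1 for one relation class treated as the positive label (one-vs-rest)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def greedy_forward_selection(
    feature_types: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Add one feature type at a time, keeping it only if the score improves.

    `evaluate` is an assumed callback: given a set of feature types, it trains
    a classifier and returns its development-set F1.
    """
    selected: FrozenSet[str] = frozenset()
    best = 0.0
    remaining = set(feature_types)
    while remaining:
        # Score every one-feature extension of the current set.
        scored = [(evaluate(selected | {f}), f) for f in sorted(remaining)]
        score, feature = max(scored, key=lambda item: item[0])
        if score <= best:
            break  # no extension helps; stop
        selected, best = selected | {feature}, score
        remaining.remove(feature)
    return selected, best
```

With a toy `evaluate` that looks scores up in a table, for example `{"a"}: 0.3`, `{"b"}: 0.4`, `{"a", "b"}: 0.5` and anything else lower, the search picks `b` first, then adds `a`, and stops at `{"a", "b"}`. Greedy search is cheap but can miss combinations whose members only help jointly, which is one reason a systematic comparison of feature sets is useful.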