Aligning predicate argument structures in monolingual comparable texts: a new corpus for a new task

Authors:
Michael Roth; Anette Frank
Affiliations:
Linguistics, Heidelberg University, Germany; Linguistics, Heidelberg University, Germany
Venue:
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Year:
2012

Abstract

Discourse coherence is an important aspect of natural language that is still understudied in computational linguistics. Our aim is to learn factors that constitute coherent discourse from data, with a focus on how to realize predicate-argument structures (PAS) in a model that exceeds the sentence level. In particular, we aim to study the case of non-realized arguments as a coherence-inducing factor. This task can be broken down into two subtasks. The first aligns predicates across comparable texts, admitting partial argument structure correspondence. The second uses the resulting alignments and their contexts to develop a coherence model for argument realization. This paper introduces a large corpus of comparable monolingual texts as a prerequisite for approaching this task, including an evaluation set with manual predicate alignments. We illustrate the potential of this new resource for the empirical investigation of discourse coherence phenomena. Initial experiments on the task of predicting predicate alignments across text pairs show promising results. Our findings establish that both manual and automatic predicate alignment across texts is feasible and that our data set holds potential for empirical research into a variety of discourse-related tasks.