Corpus-Based Acquisition of Support Verb Constructions for Portuguese

Authors:
Britta D. Zeller;Sebastian Padó
Affiliations:
Department of Computational Linguistics, Heidelberg University, Germany;Department of Computational Linguistics, Heidelberg University, Germany
Venue:
PROPOR'12: Proceedings of the 10th International Conference on Computational Processing of the Portuguese Language
Year:
2012

Abstract

We present a resource-poor approach to automatically acquire Support Verb Constructions (SVCs) for European Portuguese with a two-stage procedure. First, we apply a cross-lingual approach with a bilingual parallel corpus: starting with a Portuguese full verb, we use the translations into another language and the corresponding backtranslations to identify Portuguese verb-noun pairs with the same meaning. Since not all of these are SVCs, the candidates are ranked and filtered in a second, monolingual step based on association statistics. We discuss two parametrisations of our procedure for a high-precision and a high-recall setting. In our experiments, these parametrisations achieve a maximum precision of 91% and a maximum recall of 86%, respectively.
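
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy alignment pairs, the corpus counts, the helper names (`backtranslate`, `pmi`, `rank_and_filter`), the zero threshold, and the choice of pointwise mutual information as the association statistic. The abstract does not say which association measure or which pivot language the authors used.

```python
import math
from collections import Counter

# Hypothetical toy data: (Portuguese phrase, pivot-language phrase) pairs,
# standing in for phrase alignments extracted from a bilingual parallel corpus.
ALIGNED = [
    ("decidir", "decide"),
    ("decidir", "decide"),
    ("tomar decisão", "decide"),
    ("tomar decisão", "decide"),
    ("ter decisão", "decide"),   # noisy candidate, not a real SVC
    ("resolver", "decide"),      # single verb, not a verb-noun pair
]

# Hypothetical monolingual corpus counts used by the second stage.
PAIR_COUNTS = {("tomar", "decisão"): 50, ("ter", "decisão"): 5}
VERB_COUNTS = {"tomar": 200, "ter": 2000}
NOUN_COUNTS = {"decisão": 100}
TOTAL_PAIRS = 10_000


def backtranslate(verb, aligned):
    """Stage 1: find two-word Portuguese phrases that share a pivot
    translation with the full verb `verb`."""
    pivots = {tgt for src, tgt in aligned if src == verb}
    candidates = Counter()
    for src, tgt in aligned:
        tokens = src.split()
        if tgt in pivots and len(tokens) == 2:
            candidates[tuple(tokens)] += 1
    return candidates


def pmi(pair_count, verb_count, noun_count, total):
    """Pointwise mutual information of a verb-noun pair (an assumed
    stand-in for the paper's association statistics)."""
    return math.log2((pair_count * total) / (verb_count * noun_count))


def rank_and_filter(candidates, threshold=0.0):
    """Stage 2: score each candidate monolingually, drop those below
    the threshold, and return the rest sorted by descending score."""
    scored = []
    for verb, noun in candidates:
        score = pmi(PAIR_COUNTS[(verb, noun)], VERB_COUNTS[verb],
                    NOUN_COUNTS[noun], TOTAL_PAIRS)
        if score >= threshold:
            scored.append(((verb, noun), score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


candidates = backtranslate("decidir", ALIGNED)
ranked = rank_and_filter(candidates)
```

In this sketch, raising `threshold` would correspond to the high-precision setting and lowering it to the high-recall setting; how the authors actually parametrise the two settings is not stated in the abstract.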